Getting rid of a site in Google
-
Hi,
I have two sites, lets call them site A and site B, both are sub domains of the same root domain. Because of a server config error, both got indexed by Google.
Google reports millions of inbound links from Site B to Site A
I want to get rid of Site B, because its duplicate content.
First I tried to remove the site from webmaster tools, and blocking all content in the robots.txt for site B, this removed all content from the search results, but the links from site B to site A still stayed in place, and increased (even after 2 months)
I also tried to change all the pages on Site B to 404 pages, but this did not work either
I then removed the blocks, cleaned up the robots.txt and changed the server config on Site B so that everything redirects (301) to a landing page for Site B. But still the links in Webmaster Tools to site A from Site B is on the increase.
What do you think is the best way to delete a site from google and to delete all the links it had to other sites so that there is NO history of this site? It seems that when you block it with robots.txt, the links and juice does not disappear, but only the blocked by robots.txt report on WMT increases
Any suggestions?
-
The sites are massive and we are talking massive numbers:
Google reports in WMT that site B still has 259,157,970 links to site A, although when you filter into the report it only shows a few
The current state is that nothing is blocked on Site B, and ALL pages point to the landing page of Site B.
In WMT for site B, G still shows data for all the reports, like search queries, keywords, crawl errors (very old and all fixed) and so on. The reports and data does not bother me as much as the 259,157,970 links it reports on Site A.
On the 11th of April when I started the process of getting rid of these links, there were 554,066,716, this jumped up to 603,404,378 on the 28th of April. It started dropping and was as low as 122,405,100 on the 17th of May, and then started growing again up to where it is now 259,157,970
I also noticed that when the pages was giving 404s that the crawl rate of google dropped to zero, now that its redirecting to the landing page, the crawl rate is back up to about 1,800 per day, which is still very low, considering the numbers we are talking about.
The crawl rate on Site A is okay, at 220,000 per day, but it was as high as 800,000 per day at one stage.
-
If you remove all history of a website it may still appear in the wayback machine.
If you first blocked robots then they wont create the 301 links, they'll just keep the previously cached pages? Maybe remove the robots.txt and let google index every page with the 301 to the landing page, then after they've indexed add the robot.txt back. Have you tried submitting a new sitemap in Webmaster tools pointing all pages at the landing page?
Roughly how many pages are in your website?
-
I failed to mention that both sites A and B had the exact same content, database and URL structure, with the only difference being the sub domain.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Google Is Indexing my 301 Redirects to Other sites
Long story but now i have a few links from my site 301 redirecting to youtube videos or eCommerce stores. They carry a considerable amount of traffic that i benefit from so i can't take them down, and that traffic is people from other websites, so basically i have backlinks from places that i don't own, to my redirect urls (Ex. http://example.com/redirect) My problem is that google is indexing them and doesn't let them go, i have tried blocking that url from robots.txt but google is still indexing it uncrawled, i have also tried allowing google to crawl it and adding noindex from robots.txt, i have tried removing it from GWT but it pops back again after a few days. Any ideas? Thanks!
Intermediate & Advanced SEO | | cuarto7150 -
Fetch as Google -- Does not result in pages getting indexed
I run a exotic pet website which currently has several types of species of reptiles. It has done well in SERP for the first couple of types of reptiles, but I am continuing to add new species and for each of these comes the task of getting ranked and I need to figure out the best process. We just released our 4th species, "reticulated pythons", about 2 weeks ago, and I made these pages public and in Webmaster tools did a "Fetch as Google" and index page and child pages for this page: http://www.morphmarket.com/c/reptiles/pythons/reticulated-pythons/index While Google immediately indexed the index page, it did not really index the couple of dozen pages linked from this page despite me checking the option to crawl child pages. I know this by two ways: first, in Google Webmaster Tools, if I look at Search Analytics and Pages filtered by "retic", there are only 2 listed. This at least tells me it's not showing these pages to users. More directly though, if I look at Google search for "site:morphmarket.com/c/reptiles/pythons/reticulated-pythons" there are only 7 pages indexed. More details -- I've tested at least one of these URLs with the robot checker and they are not blocked. The canonical values look right. I have not monkeyed really with Crawl URL Parameters. I do NOT have these pages listed in my sitemap, but in my experience Google didn't care a lot about that -- I previously had about 100 pages there and google didn't index some of them for more than 1 year. Google has indexed "105k" pages from my site so it is very happy to do so, apparently just not the ones I want (this large value is due to permutations of search parameters, something I think I've since improved with canonical, robots, etc). I may have some nofollow links to the same URLs but NOT on this page, so assuming nofollow has only local effects, this shouldn't matter. Any advice on what could be going wrong here. I really want Google to index the top couple of links on this page (home, index, stores, calculator) as well as the couple dozen gene/tag links below.
Intermediate & Advanced SEO | | jplehmann0 -
Merging B2B site with B2C site
Hi, A mobile phone accessory client of ours has a retail site (B2C) and a trade site (B2B). The retail site does pretty well and ranks highly for a number of terms. The trade site doesn't really rank for anything as they don't optimise it. They would like to merge the two sites and allow trade customers to log-in and purchase goods in bulk for their business. If they were to merge the trade site into the already successful consumer site, what would be the best way of doing this and what, if any, implications would it have on the organic visibility of the B2C site? Would it be possible to target retail and trade customers on one website? Cheers, Lewis
Intermediate & Advanced SEO | | PeaSoupDigital0 -
Is my site being penalized?
I've gone through all the points on https://moz.com/blog/technical-site-audit-for-2015 but the site only ranks for its brand name after months. The website is not ranking in the top 100 for any main keywords (2,3,4 word phrases), only for a handful of very long phrases (4+). All of the content is unique, all pages are indexed, the website is fast and doesn't contain any crawl errors and there are a couple of links pointing to it. There is a sitewide follow link in the footer pointing to another domain, its parent company and vice-versa. This is not done for any SEO reasons but the companies are related and also the products are supplementary of each other. Could this be an issue? Or is my site being penalized by something else?
Intermediate & Advanced SEO | | Robbern0 -
Google penalty or what???
Hi, we have a blog site xxxxxxxxxxx.es, that yesterday dissapear from google ranks all of a sudden it only appears if you write xxxxxxxxx.es I have checked gogle webmaster tools and there are no manual actions, no messages. Also, we don't have much links pointing to this site. Webmaster tools show only 319 links. We don't understand what have happenned. Never see something similar. What do you think? Any help would be appreciated. How do you proceed in this cases? It doesn't seem to be a link problem. How do you know what kind of penalty do you have? Thank you. Update: Hi, the domain is www.crearcorreoelectronico.es I have check the majestic seo, ose, and wmt and get the links. We have some links that are not good, but are automatic ones, that some portals generate. Maybe is something related with the content. I don't know Thanks
Intermediate & Advanced SEO | | teconsite1 -
Is it Wortwhile to have a HTML site map for a Large Site
We are a large, enterprise site with many pages (some on our CMS and some old pages that exist outside our CMS). Every month we submit various an XML site map. Some pages on our site can no longer be found via following links from one page to another (orphan pages). Some of those pages are important and some not. Is it worth our while to create a HTML site map? Does any one have any recent stats or blog posts to share, showing how a HTML site map may have benefited a large site. Many thanks
Intermediate & Advanced SEO | | CeeC-Blogger0 -
Is it safe to not have a sitemap if Google is already crawling my site every 5-10 min?
I work on a large news site that is constantly being crawled by Google. Googlebot is hitting the homepage every 5-10 minutes. We are in the process of moving to a new CMS which has left our sitemap nonfunctional. Since we are getting crawled so often, I've met resistance from an overwhelmed development team that does not see creating sitemaps as a priority. My question is, are they right? What are some reasons that I can give to support my claim that creating an xml sitemap will improve crawl efficiency and indexing if we are already having new stories appear in Google SERPs within 10-15 minutes of publication? Is there a way to quantify what the difference would be if we added a sitemap?
Intermediate & Advanced SEO | | BostonWright0 -
High ranked web site on Google GONE - but webspam team says nothing wrong
We purchased several weeks ago a .org blog that has been highly ranked (number 1 on competetive keywords) for at least a year. it is a blog We moved the blog to our IP range and it went from #1 on top keyword and first page on another to the home page just gone. Now there was a secondary page indexed that stayed on page 5 for the keyword the home page was ranked #1 but the home page (which was high ranked page is just gone) We wrote the Google Webmaster team for reconsideration but they wrote back and said the web spam team said nothing wrong. A contact of mine who works for one of the most well known SEO compaines in the world says because we moved it the site could disappear for a week or so but the "algos would realize" and return it to that top spot soon. Does anyone know anything about moving a site to new IP and issues that can result?
Intermediate & Advanced SEO | | TBKO0