Getting rid of a site in Google
-
Hi,
I have two sites, let's call them Site A and Site B. Both are subdomains of the same root domain. Because of a server config error, both got indexed by Google.
Google reports millions of inbound links from Site B to Site A.
I want to get rid of Site B because it's duplicate content.
First I tried removing the site in Webmaster Tools and blocking all content in the robots.txt for Site B. This removed all content from the search results, but the links from Site B to Site A stayed in place and kept increasing (even after 2 months).
I also tried changing all the pages on Site B to 404 pages, but this did not work either.
I then removed the blocks, cleaned up the robots.txt and changed the server config on Site B so that everything redirects (301) to a landing page for Site B. But the links from Site B to Site A reported in Webmaster Tools are still increasing.
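For reference, the catch-all redirect described above can be sketched as a tiny WSGI app. This is only an illustration in Python, not the actual server config (the post does not say which server is in use), and the `/landing` path and the `site_b_app` and `probe` names are assumed placeholders. The one point it makes is that the landing page itself has to answer 200, otherwise the redirect loops.

```python
# Minimal sketch of a "301 everything to the landing page" rule.
# LANDING_PATH is a hypothetical path, not taken from the real Site B.
LANDING_PATH = "/landing"


def site_b_app(environ, start_response):
    """WSGI app: serve the landing page, 301 every other path to it."""
    path = environ.get("PATH_INFO", "/")
    if path == LANDING_PATH:
        body = b"Site B landing page"
        start_response("200 OK", [
            ("Content-Type", "text/plain"),
            ("Content-Length", str(len(body))),
        ])
        return [body]
    # Any other URL: permanent redirect to the landing page.
    start_response("301 Moved Permanently", [
        ("Location", LANDING_PATH),
        ("Content-Length", "0"),
    ])
    return [b""]


def probe(path):
    """Call the app directly and return (status, Location header or None)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    site_b_app({"PATH_INFO": path}, start_response)
    return captured["status"], captured["headers"].get("Location")
```

Calling `probe("/any/old/page")` gives the 301 with `Location: /landing`, while `probe("/landing")` gives a plain 200 with no `Location` header.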
What do you think is the best way to delete a site from Google, and to delete all the links it had to other sites, so that there is NO history of this site? It seems that when you block it with robots.txt, the links and juice do not disappear; only the "blocked by robots.txt" report in WMT increases.
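A likely reason for the robots.txt behaviour: a crawler that obeys a full block never requests the pages, so it never gets to see a 404 or a 301 on them. A rough sketch using Python's standard `urllib.robotparser` shows the difference between a block-all and an allow-all file. The hostname is a made-up placeholder and `googlebot_may_fetch` is just an illustrative helper; this only models the robots.txt rules, not what Google does internally.

```python
from urllib.robotparser import RobotFileParser


def googlebot_may_fetch(robots_txt, url):
    """Return True if the given robots.txt text lets Googlebot fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)


# The two states Site B has been in (contents are assumed, not the real files).
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
ALLOW_ALL = "User-agent: *\nDisallow:\n"

# Hypothetical Site B URL.
URL = "http://site-b.example.com/some/page"

blocked = googlebot_may_fetch(BLOCK_ALL, URL)   # False: page is never requested
allowed = googlebot_may_fetch(ALLOW_ALL, URL)   # True: status code can be seen
```

While the block is in place, whatever status code the page would return stays invisible to the crawler, which fits the links not going away.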
Any suggestions?
-
The sites are massive and we are talking massive numbers:
Google reports in WMT that Site B still has 259,157,970 links to Site A, although when you drill into the report it only shows a few.
The current state is that nothing is blocked on Site B, and ALL pages redirect to the landing page of Site B.
In WMT for Site B, Google still shows data for all the reports, like search queries, keywords, crawl errors (very old and all fixed) and so on. The reports and data do not bother me as much as the 259,157,970 links it reports to Site A.
On the 11th of April, when I started the process of getting rid of these links, there were 554,066,716. This jumped to 603,404,378 on the 28th of April. It then started dropping and was as low as 122,405,100 on the 17th of May, and then started growing again, up to where it is now: 259,157,970.
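To put those swings in proportion, here is a quick calculation of the change between each pair of readings. The figures are the ones quoted above; the "now" label stands in for the latest reading, whose exact date is not given, and the function names are only illustrative.

```python
# Link counts reported in WMT, as quoted in the post.
readings = [
    ("11 April", 554_066_716),
    ("28 April", 603_404_378),
    ("17 May", 122_405_100),
    ("now", 259_157_970),  # date of the latest reading is not stated
]


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


def step_changes(series):
    """Percent change between each consecutive pair of readings."""
    return [
        (a_label, b_label, percent_change(a, b))
        for (a_label, a), (b_label, b) in zip(series, series[1:])
    ]


for start, end, change in step_changes(readings):
    print(f"{start} -> {end}: {change:+.1f}%")

overall = percent_change(readings[0][1], readings[-1][1])
print(f"overall: {overall:+.1f}%")
```

That works out to roughly +8.9%, then -79.7%, then +111.7%, so the count is still about 53% below where it started even after the recent growth.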
I also noticed that when the pages were returning 404s, Google's crawl rate dropped to zero. Now that it's redirecting to the landing page, the crawl rate is back up to about 1,800 per day, which is still very low considering the numbers we are talking about.
The crawl rate on Site A is okay, at 220,000 per day, but it was as high as 800,000 per day at one stage.
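To show why 1,800 per day feels so low, here is a back-of-the-envelope estimate of how long one full pass over a site would take at each crawl rate. The page count is NOT known from this thread, so the 1,000,000 figure below is purely a hypothetical stand-in, and the estimate assumes each URL is fetched exactly once.

```python
def days_to_recrawl(url_count, urls_per_day):
    """Days needed to fetch every URL once at a constant crawl rate."""
    if urls_per_day <= 0:
        # A crawl rate of zero (as during the 404 period) never finishes.
        return float("inf")
    return url_count / urls_per_day


HYPOTHETICAL_URLS = 1_000_000  # assumption; the real page count is not stated
SITE_B_RATE = 1_800            # pages per day, from the post
SITE_A_RATE = 220_000          # pages per day, from the post

site_b_days = days_to_recrawl(HYPOTHETICAL_URLS, SITE_B_RATE)
site_a_days = days_to_recrawl(HYPOTHETICAL_URLS, SITE_A_RATE)
print(f"Site B: {site_b_days:.0f} days, Site A: {site_a_days:.1f} days")
```

Under that assumption, Site B would need about a year and a half for a single pass, against under five days at Site A's rate, and the real sites are presumably much larger.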
-
If you remove all history of a website, it may still appear in the Wayback Machine.
If you blocked robots first, wouldn't Google just keep the previously cached pages, since it never gets to see the 301 redirects? Maybe remove the robots.txt and let Google crawl every page with the 301 to the landing page, then add the robots.txt back after they've all been crawled. Have you tried submitting a new sitemap in Webmaster Tools pointing all pages at the landing page?
Roughly how many pages are in your website?
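On the sitemap idea, one way to read it is to list the old Site B URLs in a sitemap so that Google recrawls them and runs into the 301s. A minimal sketch of generating such files follows; the URLs are made-up placeholders and `build_sitemaps` is a hypothetical helper. The sitemap protocol caps each file at 50,000 URLs, so a site this size would need many files.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file limit in the sitemap protocol


def build_sitemaps(urls):
    """Split urls into sitemap documents of at most 50,000 entries each.

    Returns a list of XML strings (without the XML declaration line).
    """
    urls = list(urls)
    documents = []
    for start in range(0, len(urls), MAX_URLS_PER_FILE):
        urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
        for url in urls[start:start + MAX_URLS_PER_FILE]:
            entry = ET.SubElement(urlset, "url")
            ET.SubElement(entry, "loc").text = url
        documents.append(ET.tostring(urlset, encoding="unicode"))
    return documents


# Hypothetical old Site B URLs, just to exercise the chunking.
old_urls = [f"http://site-b.example.com/page/{i}" for i in range(50_001)]
sitemaps = build_sitemaps(old_urls)
```

With 50,001 URLs this produces two documents, the second holding the single leftover URL. Whether this speeds up recrawling is not guaranteed, but it is cheap to try.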
-
I failed to mention that both Site A and Site B had the exact same content, database and URL structure, with the only difference being the subdomain.