Tens of duplicate homepages indexed and blocked later: how to remove them from the Google cache?
-
Hi community,
Due to a WP plugin issue, many copies of our homepage got indexed in Google under anonymous URLs. We blocked them later, but they still show up in the SERPs. I wonder whether they are causing trouble for our website, especially as they are exact duplicates of our homepage. How do we remove these pages from the Google cache? Is that the right approach?
Thanks
-
Hi Nigel,
Thanks for the suggestion. I'm going to use the "Remove URLs" tool in GSC. The URLs were created by a bug in the Yoast SEO plugin. Very unfortunate - we're paying the price for a mistake that wasn't ours.
Does removing them from the SERPs mean removing them from Google's index as well? Or will Google still consider them and simply stop showing them? My concern is: we have blocked them anyway, but will they still be a distraction to our ranking efforts while they sit in the results and in the cache?
Thanks
-
Thanks!
I agree - I have just done a similar cleanup by:
1. Not letting them be created
2. Redirecting all previous versions!
One site I just worked on had 8 versions of the home page! lol
http
https
/index.php
/index.php/
A mess!
We stopped them all being created and 301'd all versions just in case they were indexed anywhere or linked externally.
Cheers
-
It is assuredly true that, just as in any number of other fields (medicine, say), in SEO prevention beats cleanup. If your website doesn't take its medicine, you get problems like this one.
I think your advice here was really good
-
Good solid advice
They can be created in any number of ways, but it's normally simple enough to specify the preferred URL on the server and then redirect any variations in .htaccess, such as those with www (if the non-www version is preferred), those with a trailing slash at the end, etc.
A self-referencing canonical on every page will sort out any other duplicates.
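For example, here's a rough .htaccess sketch (assuming Apache with mod_rewrite, and that the non-www https version at example.com is the preferred URL - swap in your own hostname before using it):

# Sketch only - example.com stands in for your real preferred hostname
RewriteEngine On

# Force https and drop www in a single 301
RewriteCond %{HTTPS} off [OR]
RewriteCond %{HTTP_HOST} ^www\. [NC]
RewriteRule ^(.*)$ https://example.com/$1 [R=301,L]

# Send /index.php and /index.php/ back to the root
RewriteRule ^index\.php/?$ / [R=301,L]

# Strip trailing slashes from anything that isn't a real directory
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)/$ /$1 [R=301,L]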
As for getting rid of them - the Search Console way is the quickest. If they don't exist after that, they won't be reindexed unless they are linked from somewhere else, and in that case they will 301 via .htaccess, so it shouldn't be a problem.
If you 410 them, you will lose any benefit from links pointing at those pages, and it's a bad experience for a visitor. Always 301, never 410, if it is a version of the home page.
410s are fine for old pages you never want to see in the index again but not for a home page version.
Regards
Nigel
-
It's likely that you don't have access to edit the code behind these weird plugin URLs, so normal techniques like adding a meta noindex tag to the HTML may not be viable.
You could use the HTTP header (server level stuff) to help you out. I'd advise adding two strong directives to the afflicted URLs through the HTTP header so that Google gets the message:
-
Use the X-Robots-Tag deployment of the noindex directive on the affected URLs, at the HTTP header (not the HTML) level. The linked page tells you about the normal HTML implementation, but also about the X-Robots-Tag implementation, which is the one you need (scroll down a bit).
-
Serve status code 410 (gone) on the affected URLs
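If you're on Apache 2.4+, both directives can be set from .htaccess. A rough sketch, where "dupe_param" is just a placeholder for whatever pattern the rogue plugin URLs actually share (swap in the real pattern before using it):

# Sketch only - dupe_param stands in for the real pattern of the rogue URLs
RewriteEngine On

# Serve 410 Gone on the affected URLs ([G] is shorthand for 410)
RewriteCond %{QUERY_STRING} dupe_param= [NC]
RewriteRule ^ - [G]

# And/or send the noindex directive in the HTTP header (needs mod_headers)
<If "%{QUERY_STRING} =~ /dupe_param=/">
    Header always set X-Robots-Tag "noindex, noarchive"
</If>

The noarchive part tells Google to drop its cached copy as well, which is exactly what you're after here.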
That should prompt Google to de-index those pages. Once they are de-indexed, you can use robots.txt to block Google from crawling such URLs in the future (which will stop the problem happening again!)
It's important to de-index the URLs before you do any robots.txt stuff. If Google can't crawl the affected URLs, it can't find the info (in the HTTP header) to know that it should de-index those pages
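Once they have actually dropped out of the index, the robots.txt rule is only a couple of lines - again, "dupe_param" is a placeholder for the real URL pattern:

# dupe_param is a placeholder for the real URL pattern
User-agent: *
Disallow: /*?dupe_param=

Google honours the * wildcard in robots.txt, so one rule can cover all the variants.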
Once Google is blocked from both indexing and crawling these pages, it should stop caching them too.
Hope that helps
-
-
+1 for "Make sure that they are not created in the first place" haha
-
Hi again vtmoz!
1. Make sure that they are not created in the first place
2. Make sure that they are not in the sitemap
3. Go to Search Console and remove any you do not want - it will say temporary removal, but they will not come back if they are not in the site structure or the sitemap.
More:
https://support.google.com/webmasters/answer/1663419?hl=en
Note: Always self-canonicalize the home page to stop versions with UTM codes (created by Facebook, Twitter, etc.) appearing in the SERPs.
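For reference, the self-referencing canonical is a single line in the <head> of the home page (Yoast normally outputs it for you; example.com is a placeholder):

<!-- example.com is a placeholder for your own domain -->
<link rel="canonical" href="https://example.com/" />

With that in place, any ?utm_source=... version that gets crawled should consolidate back to the clean URL.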
Regards
Nigel