Dozens of duplicate homepages indexed and blocked later: How to remove them from Google's cache?
-
Hi community,
Due to a WordPress plugin issue, many copies of our homepage were indexed in Google under odd URLs. We blocked them later, but they are still in the SERPs. I wonder whether they are causing trouble for our website, especially since our exact homepage is duplicated. How do we remove these pages from Google's cache? Is that the right approach?
Thanks
-
Hi Nigel,
Thanks for the suggestion. I'm going to use the "Remove URLs" tool in GSC. The URLs were created by a bug in the Yoast SEO plugin. Very unfortunate, and we paid for a mistake that wasn't ours.
Does removing them from the SERPs also remove them from Google's index? Or will Google still consider them and just stop showing them? My concern is: we have blocked them anyway, but will they distract from our ranking efforts while they remain in the results and in the cache?
Thanks
-
Thanks!
I agree - I have just done a similar clean-up by:
1. Not letting them be created in the first place
2. Redirecting all previous versions!

One site I just worked on had 8 versions of the home page! lol
- http
- https
- /index.php
- /index.php/

A mess! We stopped them all being created and 301'd all versions just in case they were indexed anywhere or linked externally.
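For reference, a minimal .htaccess sketch of those 301s might look like this. This is illustrative only: it assumes Apache with mod_rewrite enabled, and that `https://www.example.com/` is your preferred version (`example.com` is a placeholder, not the actual site from this thread):

```apache
# Sketch only - swap example.com for your domain and pick your preferred host/scheme
RewriteEngine On

# Force HTTPS on the preferred host
RewriteCond %{HTTPS} off
RewriteRule ^(.*)$ https://www.example.com/$1 [R=301,L]

# Collapse /index.php and /index.php/ to the root
RewriteRule ^index\.php/?$ https://www.example.com/ [R=301,L]
```

Every legacy version then answers with a permanent redirect to the one canonical home page, so any stray index entries or external links still pass through.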
Cheers
-
It is assuredly true that, just as in any number of fields (medicine, for one), prevention in SEO is better than a cleanup-based methodology. If your website doesn't take its medicine, you get problems like this one.
I think your advice here was really good
-
Good solid advice
They can be created in any number of ways, but it's normally simple enough to specify the preferred URL on the server and then redirect any variations in .htaccess, such as those with www (if the non-www version is preferred), those with a trailing slash at the end, etc.
A self-referencing canonical on all pages will sort out any other duplicates.
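A hedged .htaccess sketch of that host/slash normalization, assuming Apache and that the non-www, no-trailing-slash form is preferred (`example.com` is a placeholder):

```apache
# Sketch only - assumes non-www over HTTPS is the preferred version
RewriteEngine On

# Redirect www to non-www
RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]
RewriteRule ^(.*)$ https://example.com/$1 [R=301,L]

# Strip a trailing slash from URLs that are not real directories
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)/$ https://example.com/$1 [R=301,L]
```

If www (or the trailing slash) is your preferred form instead, the conditions simply flip; the point is to pick one version and 301 everything else to it.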
As for getting rid of them - the Search Console way is the quickest. If they don't exist after that, they won't be reindexed unless they are linked from somewhere else; in that case, they will 301 via .htaccess, so it shouldn't be a problem.
If you 410, you will lose any benefit from the links pointing to those pages, and it's a bad experience for a visitor. Always 301, never 410, if it is a version of the home page.
410s are fine for old pages you never want to see in the index again, but not for a home page version.
Regards
Nigel
-
It's likely that you don't have access to edit the code behind these weird plugin URLs. As such, normal techniques like a meta noindex tag in the HTML may not be viable.
You could use the HTTP header (server-level stuff) to help you out instead. I'd advise adding two strong directives to the afflicted URLs through the HTTP header so that Google gets the message:
-
Use the X-Robots-Tag deployment of the noindex directive on the affected URLs, at the HTTP header (not the HTML) level. That linked page tells you about the normal HTML implementation, but also about the X-Robots-Tag implementation, which is the one you need (scroll down a bit)
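One way to send that header from Apache is a conditional block in the site config or .htaccess. This is a sketch under assumptions: it requires Apache 2.4+ with mod_headers, and the query-string pattern `plugin-param` is hypothetical - you would match whatever actually distinguishes the rogue plugin URLs on your site:

```apache
# Sketch only - requires mod_headers; the URL pattern below is a placeholder,
# match it to the real pattern of the rogue plugin URLs
<If "%{QUERY_STRING} =~ /plugin-param/">
    Header set X-Robots-Tag "noindex"
</If>
```

Because the directive rides in the HTTP response header, it works even when you cannot touch the HTML those URLs serve.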
-
Serve status code 410 (gone) on the affected URLs
That should prompt Google to de-index those pages. Once they are de-indexed, you can use robots.txt to block Google from crawling those URLs in the future (which will stop the problem from happening again!)
It's important to de-index the URLs before you do any robots.txt blocking. If Google can't crawl the affected URLs, it can't find the info (in the HTTP header) telling it to de-index those pages.
Once Google is blocked from both indexing and crawling these pages, it should stop caching them too.
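After the de-indexing has taken effect, the follow-up robots.txt block might look like the sketch below. The `plugin-param` pattern is hypothetical, standing in for whatever actually marks the rogue URLs on your site (Google honors `*` wildcards in robots.txt paths):

```
User-agent: *
Disallow: /*?plugin-param=
```

Again, order matters: publish this only once the URLs have dropped out of the index, or the block will stop Google from ever seeing the noindex signal.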
Hope that helps
-
+1 for "Make sure that they are not created in the first place" haha
-
Hi again vtmoz!
1. Make sure that they are not created in the first place
2. Make sure that they are not in the sitemap
3. Go to Search Console and remove any you do not want - it will say "temporary removal", but they will not come back if they are not in the site structure or the sitemap.

More:
https://support.google.com/webmasters/answer/1663419?hl=en
Note: Always self-canonicalize the home page to stop versions with UTM codes (created by Facebook, Twitter, etc.) appearing in SERPs
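That self-referencing canonical is a single link element in the home page's head; a minimal sketch (`example.com` is a placeholder for your preferred home page URL):

```html
<head>
  <!-- Self-referencing canonical: UTM-tagged variants of this page
       (e.g. /?utm_source=facebook) consolidate to this one URL -->
  <link rel="canonical" href="https://example.com/" />
</head>
```

With this in place, tracking-parameter variants that get shared and crawled all point back to the one URL you want in the SERPs.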
Regards
Nigel