Tens of duplicate homepages indexed and blocked later: How to remove from Google cache?
-
Hi community,
Due to a WordPress plugin issue, many copies of our homepage were indexed in Google under anonymous URLs. We blocked them later, but they are still in the SERPs. I wonder whether they are causing trouble for our website, especially since they are exact duplicates of our homepage. How do we remove these pages from Google's cache? Is that the right approach?
Thanks
-
Hi Nigel,
Thanks for the suggestion. I'm going to use the "Remove URLs" tool in GSC. The URLs were created by a bug in the Yoast SEO plugin. Very unfortunate, and we are paying for a mistake that was not ours.
Does removing them from the SERPs also remove them from Google's index? Or will Google still consider them and just stop showing them? My concern is this: we have blocked them anyway, but will they still distract from our ranking efforts while they remain cached in the results?
Thanks
-
Thanks!
I agree - I have just done a similar clean-up by:
1. Don't let them be created
2. Redirect all previous versions!
One site I just worked on had 8 versions of the home page! lol
http
https
/index.php
/index.php/
A mess!
We stopped them all being created and 301'd all versions just in case they were indexed anywhere or linked externally.
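For anyone who wants the redirect rule spelled out, here is a minimal Python sketch of the mapping described above: every listed home page version resolves to one preferred URL, which the server would then answer with a 301. The thread does not say how the redirects were actually implemented, so the host names and the names `CANONICAL_HOST`, `SITE_HOSTS`, `HOME_PATHS`, and `redirect_target` are assumptions for illustration only.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Hypothetical values; substitute the site's own preferred host.
CANONICAL_HOST = "www.example.com"
SITE_HOSTS = {"example.com", "www.example.com"}

# Paths treated as versions of the home page (taken from the list above).
HOME_PATHS = {"", "/", "/index.php", "/index.php/"}


def redirect_target(url: str) -> Optional[str]:
    """Return the URL a home page version should 301 to, or None if no redirect is needed."""
    parts = urlsplit(url)
    if parts.hostname not in SITE_HOSTS or parts.path not in HOME_PATHS:
        return None  # not a home page version, leave it alone
    # Force https, the preferred host, and the bare "/" path; keep any query string.
    target = urlunsplit(("https", CANONICAL_HOST, "/", parts.query, ""))
    return None if target == url else target


if __name__ == "__main__":
    for candidate in ("http://example.com/index.php", "https://www.example.com/"):
        print(candidate, "->", redirect_target(candidate))
```

The function only decides where a request should go; sending the actual 301 is left to whatever server or framework the site runs on.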
Cheers
-
Just as in many other fields (medicine, for example), prevention in SEO is better than cleanup. If your website doesn't take its medicine, you get problems like this one.
I think your advice here was really good
-
Good solid advice
They can be created in any number of ways, but it's normally simple enough to specify the preferred URL on the server and then redirect any variations in htaccess, such as those with www (if the non-www version is preferred), those with a trailing slash, etc.
The self canonical on all will sort out any other duplicates.
As for getting rid of them, the Search Console way is the quickest. If they don't exist after that, then they won't be reindexed unless they are linked from somewhere else. In such cases, they will 301 from htaccess, so it shouldn't be a problem.
If you 410, you will lose any benefit from links pointing to those pages, and it's a bad experience for a visitor. Always 301, do not 410, if it is a version of the home page.
410s are fine for old pages you never want to see in the index again but not for a home page version.
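To make the 301-versus-410 distinction concrete, here is a small Python sketch, assuming the site keeps one set of known home page versions and one set of retired pages. The paths and the names `HOME_VERSIONS`, `RETIRED_PAGES`, and `response_for` are hypothetical and not taken from any real site.

```python
from http import HTTPStatus
from typing import Optional, Tuple

# Hypothetical examples of each kind of URL.
HOME_VERSIONS = {"/index.php", "/index.php/"}  # duplicates of the home page
RETIRED_PAGES = {"/old-promo"}                 # pages that should never come back


def response_for(path: str) -> Tuple[HTTPStatus, Optional[str]]:
    """Return (status, redirect location) for a requested path."""
    if path in HOME_VERSIONS:
        # A version of the home page: redirect so visitors and links still land on "/".
        return HTTPStatus.MOVED_PERMANENTLY, "/"
    if path in RETIRED_PAGES:
        # An old page with no replacement: tell crawlers it is gone for good.
        return HTTPStatus.GONE, None
    return HTTPStatus.OK, None
```

`HTTPStatus.MOVED_PERMANENTLY` is 301 and `HTTPStatus.GONE` is 410, so the two branches map directly onto the advice above.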
Regards
Nigel
-
It's likely that you don't have access to edit the coding on these weird plugin URLs. As such, normal techniques like using a Meta no-index tag in the HTML may be non-viable.
You could use the HTTP header (server level stuff) to help you out. I'd advise adding two strong directives to the afflicted URLs through the HTTP header so that Google gets the message:
-
Use the X-Robots-Tag deployment of the noindex directive on the affected URLs, at the HTTP header (not the HTML) level. That linked page tells you about the normal HTML implementation, but also about the X-Robots-Tag implementation, which is the one you need (scroll down a bit)
-
Serve status code 410 (gone) on the affected URLs
That should prompt Google to de-index those pages. Once they are de-indexed, you can use robots.txt to block Google from crawling such URLs in the future (which will stop the problem happening again!)
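As a rough illustration of the two header-level directives, here is a minimal WSGI sketch in Python that answers affected URLs with status 410 and an `X-Robots-Tag: noindex` header. It assumes the plugin URLs share a recognisable path prefix; `AFFECTED_PREFIX` and `app` are hypothetical names, and on a real WordPress site the same headers would more likely be set in the server configuration.

```python
# Hypothetical prefix shared by the unwanted plugin-generated URLs.
AFFECTED_PREFIX = "/plugin-generated/"


def app(environ, start_response):
    """Tiny WSGI app: 410 + noindex header for affected URLs, 200 for everything else."""
    path = environ.get("PATH_INFO", "")
    if path.startswith(AFFECTED_PREFIX):
        start_response("410 Gone", [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("X-Robots-Tag", "noindex"),  # header-level equivalent of the meta robots tag
        ])
        return [b"Gone\n"]
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"OK\n"]


if __name__ == "__main__":
    from wsgiref.simple_server import make_server
    # Serve locally for a quick check, e.g. with: curl -I http://localhost:8000/plugin-generated/x
    make_server("localhost", 8000, app).serve_forever()
```

Inspecting the response headers of an affected URL should show both the 410 status line and the `X-Robots-Tag` header.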
It's important to de-index the URLs before you do any robots.txt stuff. If Google can't crawl the affected URLs, it can't find the info (in the HTTP header) to know that it should de-index those pages
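One way to check the ordering point is to confirm that the current robots.txt does not already block the URLs you want de-indexed. The sketch below uses Python's standard `urllib.robotparser`; the function name `blocked_urls`, the sample rules, and the example URLs are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text forbids user_agent from crawling."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    # Hypothetical robots.txt content and URLs.
    rules = "User-agent: *\nDisallow: /plugin-generated/\n"
    to_deindex = [
        "https://www.example.com/plugin-generated/a",
        "https://www.example.com/about",
    ]
    print(blocked_urls(rules, to_deindex))
```

Any URL this returns is one Google cannot crawl, so it would not see the noindex header or the 410 until the Disallow rule is lifted.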
Once Google is blocked from both indexing and crawling these pages, it should stop caching them too.
Hope that helps
-
+1 for "Make sure that they are not created in the first place" haha
-
Hi again vtmoz!
1. Make sure that they are not created in the first place
2. Make sure that they are not in the sitemap
3. Go to Search Console and remove any you do not want - it will say temporary removal, but they will not come back if they are not in the structure or the sitemap.
More:
https://support.google.com/webmasters/answer/1663419?hl=en
Note: Always self-canonicalize the home page to stop versions with UTM codes (created by Facebook, Twitter, etc.) appearing in the SERPs
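As a sketch of what self-canonicalizing means for UTM-tagged URLs, the Python below strips `utm_` parameters from a URL and builds the matching canonical link tag. The function names `canonical_url` and `canonical_link_tag` and the example domain are hypothetical; on WordPress the tag would normally be emitted by the theme or an SEO plugin rather than by hand.

```python
import html
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Drop utm_* tracking parameters and any fragment; keep everything else."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith("utm_")
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path or "/", urlencode(kept), ""))


def canonical_link_tag(url: str) -> str:
    """Build the <link rel="canonical"> tag for the page served at url."""
    href = html.escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


if __name__ == "__main__":
    print(canonical_link_tag("https://www.example.com/?utm_source=facebook&utm_medium=social"))
```

Every UTM-tagged version of the home page then points back at the same clean URL, which is the signal the note above relies on.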
Regards
Nigel