Dozens of duplicate homepages indexed, then blocked: how to remove them from Google's cache?
-
Hi community,
Due to a WordPress plugin issue, many copies of our homepage were indexed in Google under unintended URLs. We blocked them later, but they still appear in the SERPs. I wonder whether they are causing trouble for our website, especially since they are exact copies of our homepage. How can we remove these pages from Google's cache? Is that the right approach?
Thanks
-
Hi Nigel,
Thanks for the suggestion. I'm going to use the "Remove URLs" tool in GSC. The pages were created by a bug in the Yoast SEO plugin. Very unfortunate; we paid for a mistake that wasn't ours.
Does removing them from the SERPs also remove them from Google's index? Or will Google still consider them and just stop showing them? My concern is this: we have blocked them anyway, but will they still distract from our ranking efforts while they remain cached and in the results?
Thanks
-
Thanks!
I agree - I have just done a similar clean-up in two steps:
1. Don't let them be created
2. Redirect all previous versions!
One site I just worked on had 8 versions of the home page! lol
http
https
/index.php
/index.php/
A mess!
We stopped them all being created and 301'd all versions just in case they were indexed anywhere or linked externally.
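To make the "301 all versions" idea concrete, here is a rough Python sketch of the mapping logic. The domain, the preferred URL, and the list of homepage paths are all assumptions for illustration; on a real site the rule would normally live in the server config or .htaccess rather than in application code.

```python
from urllib.parse import urlsplit

# Hypothetical preferred homepage; swap in your own canonical URL.
CANONICAL_HOME = "https://www.example.com/"

# Paths that are just other spellings of the homepage (assumed list).
HOME_PATHS = {"", "/", "/index.php", "/index.php/"}


def redirect_target(url):
    """Return the URL a homepage variant should 301 to, or None."""
    if url == CANONICAL_HOME:
        return None  # Already the preferred version.
    parts = urlsplit(url)
    canonical_host = urlsplit(CANONICAL_HOME).netloc
    # Accept both the www and the bare form of the same host.
    if canonical_host.startswith("www."):
        bare_host = canonical_host[4:]
    else:
        bare_host = canonical_host
    if parts.netloc.lower() not in {canonical_host, bare_host}:
        return None  # Not our site; leave it alone.
    if parts.path not in HOME_PATHS or parts.query:
        return None  # An inner page or a parameterised URL; out of scope here.
    return CANONICAL_HOME


for variant in ("http://example.com/index.php", "https://example.com/index.php/"):
    print(variant, "->", redirect_target(variant))
```

Every http/https, www/non-www and /index.php spelling ends up at the one preferred URL, and everything else is left untouched.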
Cheers
-
As in plenty of other fields (medicine, for one), prevention in SEO is better than cleanup. If your website doesn't take its medicine, you get problems like this one.
I think your advice here was really good
-
Good solid advice
They can be created in any number of ways, but it's normally simple enough to specify the preferred URL on the server and then redirect any variations in .htaccess, such as those with www (if the non-www version is preferred), those with a trailing slash at the end, etc.
A self-referencing canonical on every page will sort out any other duplicates.
As for getting rid of them - the Search Console way is the quickest. If they don't exist after that, then they won't be reindexed unless they are linked from somewhere else. In such cases, they will 301 via .htaccess, so it shouldn't be a problem.
If you 410, you will lose any benefit from the links pointing to those pages, and it's a bad experience for a visitor. Always 301, do not 410, if it is a version of the home page.
410s are fine for old pages you never want to see in the index again but not for a home page version.
Regards
Nigel
-
It's likely that you don't have access to edit the code on these weird plugin URLs. As such, normal techniques like using a meta noindex tag in the HTML may not be viable.
You could use the HTTP header (server level stuff) to help you out. I'd advise adding two strong directives to the afflicted URLs through the HTTP header so that Google gets the message:
-
Use the X-Robots-Tag deployment of the noindex directive on the affected URLs, at the HTTP header (not the HTML) level. The linked page tells you about the normal HTML implementation, but also about the X-Robots-Tag implementation, which is the one you need (scroll down a bit)
-
Serve status code 410 (gone) on the affected URLs
That should prompt Google to de-index those pages. Once they are de-indexed, you can use robots.txt to block Google from crawling such URLs in the future (which will stop the problem happening again!)
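For anyone who wants to see what those two directives look like on the wire, below is a minimal WSGI sketch in Python. The `/plugin-generated/` prefix is a made-up placeholder for wherever the unwanted URLs live, and in practice you would more likely set the status and header in the web server config. Treat it as an illustration of the response, not a drop-in fix.

```python
# Hypothetical path prefixes for the unwanted plugin-generated URLs.
AFFECTED_PREFIXES = ("/plugin-generated/",)


def app(environ, start_response):
    """Minimal WSGI app: answer affected URLs with 410 plus a noindex header."""
    path = environ.get("PATH_INFO", "/")
    if path.startswith(AFFECTED_PREFIXES):
        start_response("410 Gone", [
            ("Content-Type", "text/plain; charset=utf-8"),
            # Header-level equivalent of the meta robots noindex tag.
            ("X-Robots-Tag", "noindex"),
        ])
        return [b"Gone\n"]
    # Everything else is served normally (stand-in response for this sketch).
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"OK\n"]
```

To try it locally, pass `app` to `wsgiref.simple_server.make_server` and request one of the affected paths with `curl -I` to inspect the headers.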
It's important to de-index the URLs before you do any robots.txt stuff. If Google can't crawl the affected URLs, it can't find the info (in the HTTP header) to know that it should de-index those pages
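The ordering point can be checked with Python's standard `urllib.robotparser`. This is only a sketch: the robots.txt rule and the URLs are invented for the example, and Google's own parser may differ in edge cases. It shows that once a Disallow rule is live, a compliant crawler will not request the URL at all, so it never sees the noindex header or the 410.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt you would add only AFTER the URLs are de-indexed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /plugin-generated/
"""


def can_crawl(robots_txt, url, agent="Googlebot"):
    """Return True if the given robots.txt lets the agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


blocked = "https://www.example.com/plugin-generated/abc"
print(can_crawl(ROBOTS_TXT, blocked))                     # crawler may not fetch it
print(can_crawl(ROBOTS_TXT, "https://www.example.com/"))  # homepage stays crawlable
```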
Once Google is blocked from both indexing and crawling these pages, it should begin to stop caching them too
Hope that helps
-
+1 for "Make sure that they are not created in the first place" haha
-
Hi again vtmoz!
1. Make sure that they are not created in the first place
2. Make sure that they are not in the sitemap
3. Go to Search Console and remove any you do not want - it will say temporary removal but they will not come back if they are not in the structure or the sitemap.
More:
https://support.google.com/webmasters/answer/1663419?hl=en
Note: Always self-canonicalize the home page to stop versions with UTM codes (created by Facebook, Twitter, etc.) from appearing in the SERPs
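As a rough illustration of the self-canonical note, this Python sketch strips `utm_` parameters from a URL and builds the matching canonical link tag. The function names and example domain are made up for the example; on WordPress an SEO plugin or the theme would normally output this tag for you.

```python
import html
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_url(url):
    """Drop utm_* tracking parameters and any fragment from a URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith("utm_")
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path or "/", urlencode(kept), "")
    )


def canonical_link_tag(url):
    """Build the <link rel="canonical"> tag for the page's <head>."""
    href = html.escape(canonical_url(url), quote=True)
    return '<link rel="canonical" href="%s">' % href


shared = "https://www.example.com/?utm_source=facebook&utm_medium=social"
print(canonical_link_tag(shared))
```

A tagged link shared on Facebook or Twitter then points search engines back at the one clean homepage URL.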
Regards
Nigel