Why is Google not deindexing pages with the meta noindex tag?
-
On our website www.keystonepetplace.com we added the meta noindex tag to category pages that were created by the sorting function.
Google no longer seems to be adding more of these pages to the index, but the pages that were already added are still in the index when I check with a site:keystonepetplace.com search.
Here is an example page: http://www.keystonepetplace.com/dog/dog-food?limit=50
How long should it take for these pages to disappear from the index?
-
Google might have already crawled the pages but not indexed them yet. Be patient: if you have enough links coming in and the pages are less than 3 levels deep, they will all be crawled and indexed in no time.
-
I guess it depends on the urgency of your situation. If you are just trying to clean things up, then it's okay to wait for Google to re-crawl and solve the problem. But if you have been affected by Panda and your site is not ranking, then I personally would consider that an urgent enough need to use the URL removal tool.
-
This link almost makes it seem like I shouldn't use the Webmaster Tools removal tool.
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=1269119
-
The crawlers have so many billions of webpages to get to. We have more than 50,000 on our site, and there are about 8,000 that they check more regularly than the others. Some are just really deep on the site and hard to get to.
-
You can remove entire category directories from the index in one command using the tool. But the URLs won't be removed from the cache, just the index. To remove them from the cache you'll need to enter each URL individually. I think that if you are trying to clear things up for Panda reasons, just removing from the index is enough. However, I'm currently trying to decide if it will speed things up to remove from the cache as well.
-
Ok. That makes sense. I wonder why it takes so long? I'll start the long process of the manual removal.
-
Streamline Metrics has got it right.
I've seen pages take MONTHS to drop out of the index after being noindexed. It's best to use the URL removal tool in WMT (not to be confused with the disavow tool) to tell Google to not only deindex the pages but to remove them from the cache as well. I have found that when you do this the pages are gone within 12 hours.
-
In your experience how long does this normally take?
-
Yes, it was around December 2nd or 3rd that we added the noindex tags. It just seemed like Google wasn't removing any pages from the index yet. It did stop Google from adding more of these pages, though.
-
It all depends on how long it takes Google to re-crawl those pages with the noindex tag on them.
If you are in a hurry, I would do this along with the steps you have already taken to help speed the process up:
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=1663419
-
Do you know when you added the noindex tags? Google will need to recrawl the pages to see the noindex tags before removing them. I just looked at one of your category pages and it looks like it was cached by Google on December 1st, and there was no noindex tag on that page. How big your site is and how often it is crawled will determine when the pages are removed from the index. Here's Google's official explanation:
"When we see the noindex meta tag on a page, Google will completely drop the page from our search results, even if other pages link to it. Other search engines, however, may interpret this directive differently. As a result, a link to the page can still appear in their search results.
Note that because we have to crawl your page in order to see the noindex meta tag, there's a small chance that Googlebot won't see and respect the noindex meta tag. If your page is still appearing in results, it's probably because we haven't crawled your site since you added the tag. (Also, if you've used your robots.txt file to block this page, we won't be able to see the tag either.)
If the content is currently in our index, we will remove it after the next time we crawl it. To expedite removal, use the URL removal request tool in Google Webmaster Tools."
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=93710
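Since removal depends on Googlebot actually seeing the tag, it can help to confirm the tag is really present in the HTML your server returns. Here is a minimal sketch in Python that only inspects an HTML string you have already fetched. The class and function names (`RobotsMetaParser`, `has_noindex`) and the sample markup are made up for illustration, and it does not look at the `X-Robots-Tag` HTTP header.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> / <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            # content is a comma-separated list, e.g. "noindex, follow"
            for token in attr.get("content", "").split(","):
                self.directives.add(token.strip().lower())


def has_noindex(html):
    """Return True if the HTML carries a robots meta tag containing noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical markup standing in for one of the sorted category pages.
sample = '<html><head><meta name="robots" content="NOINDEX, follow"></head></html>'
print(has_noindex(sample))
```

If this returns False for a page you expected to be noindexed, the tag is probably not reaching the crawler either.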
-
Or use a canonical tag, or block the pages via robots.txt.
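One caution on the robots.txt route: per the Google explanation quoted above, a page blocked in robots.txt can't be crawled, so the noindex tag on it won't be seen. A rough sketch for checking that the two aren't combined by accident, using Python's standard `urllib.robotparser`. The `Disallow` rule below is hypothetical, not the site's real robots.txt, and the helper name `crawlable` is made up. As far as I know the standard-library parser does plain prefix matching and does not implement Google's wildcard extensions, so treat the result as an approximation.

```python
from urllib.robotparser import RobotFileParser


def crawlable(robots_txt, url, agent="Googlebot"):
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical rules, for illustration only.
rules = """User-agent: *
Disallow: /dog/
"""

url = "http://www.keystonepetplace.com/dog/dog-food?limit=50"

if not crawlable(rules, url):
    # Blocked pages are never fetched, so a noindex tag on them goes unseen.
    print("Blocked by robots.txt: the noindex tag on this page cannot be read")
else:
    print("Crawlable: the noindex tag can be seen on the next crawl")
```

With the sample rules above, the URL falls under the disallowed prefix, so the first message is printed.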