Should I use noindex or robots to remove pages from the Google index?
-
I have a Magento site and just realized we have about 800 review pages indexed. The /review directory is disallowed in robots.txt but the pages are still indexed.
From my understanding robots means it will not crawl the pages BUT if the pages are still indexed if they are linked from somewhere else.
I can add the noindex tag to the review pages but they wont be crawled.
https://www.seroundtable.com/google-do-not-use-noindex-in-robots-txt-20873.html
Should I remove the robots.txt and add the noindex? Or just add the noindex to what I already have?
-
Thanks, Logan!
-
Rhys,
Your web dev team is confused. You cannot de-index by simply disallowing them in your robots.txt file. Google will still index anything they find (that doesn't have a noindex tag) from a link, this is the reason you often see search results that say "A description for this result is not available because of this site's robots.txt" as the description.
Here's a quote from Google regarding the subject: "You should not use robots.txt as a means to hide your web pages from Google Search results." - https://support.google.com/webmasters/answer/6062608?hl=en
-
Hi all,
Sorry to jump in here but I've been told the opposite by our web dev team. We're removing indexed 404s at the moment, and our web dev team said we simply need to add robots.txt to the pages and they'll be de-indexed. If this incorrect? I thought I'd need to add a noindex tag but was argued down...
Cheers,
Rhys
-
Hi there. Good question and one that comes up a lot.
You need to do the following:
- Put the noindex on those pages
- Remove the block in robots.txt
- Monitor these pages falling out of the index
- Once they are all out, then put the block back in place
You both want them to a) drop out and b) then not be crawled, so the above will take care of that for you.
Hope that helps!
John
-
Thanks.
That is what I figured just wanted to double check.
-
Hi Tyler,
Yes, remove the robots.txt disallow for that section and add a noindex tag. Noindex is the only sure-fire way to de-index URLs, but the crawlers need to be allowed to crawl those pages to see the tag.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Google Indexed Old Backups Help!
I have the bad habit of renaming a html page sitting on my server, before uploading a new version. I usually do this after a major change. So after the upload, on my server would be "product.html" as well as "product050714".html. I just stumbled on the fact G has been indexing these backups. Can I just delete them and produce a 404?
Intermediate & Advanced SEO | | alrockn0 -
Wordpress Tag Pages - NoIndex?
Hi there. I am using Yoast Wordpress Plugin. I just wonder if any test have been done around the effects of Index vs Noindex for Tag Pages? ( like when tagging a word relevant to an article ) Thanks 🙂 Martin
Intermediate & Advanced SEO | | s_EOgi_Bear0 -
Huge Google index on E-commerce site
Hi Guys, I got a question which i can't understand. I'm working on a e-commerce site which recently got a CMS update including URL updates.
Intermediate & Advanced SEO | | ssiebn7
We did a lot of 301's on the old url's (around 3000 /4000 i guess) and submitted a new sitemap (around 12.000 urls, of which 10.500 are indexed). The strange thing is.. When i check the indexing status in webmaster tools Google tells me there are over 98.000 url's indexed.
Doing the site:domainx.com Google tells me there are 111.000 url's indexed. Another strange thing which another forum member describes here : Cache date has been reverted And next to that old url's (which have a 301 for about a month now) keep showing up in the index. Does anyone know what i could do to solve the problem?0 -
No index.no follow certain pages
Hi, I want to stop Google et al from finding a some pages within my website. the url is www.mywebsite.com/call_backrequest.php?rid=14 As these pages are creating a lot of duplicate content issues. Would the easiest solution be to place a 'Nofollow/Noindex' META tag in page www.mywebsite.com/call_backrequest.php many thanks in advance
Intermediate & Advanced SEO | | wood1e19680 -
How to remove an entire site from Google?
Hi people, I have a site with around 2.000 urls indexed in google, and 10 subdomains indexed too, which I want to remove entirely, to set up a new web. Which is the best way to do it? Regards!
Intermediate & Advanced SEO | | SeoExpertos0 -
To noindex or not to noindex
Our website lets users test whether any given URL or keyword is censored in China. For each URL and keyword that a user looks up, a page is created, such as https://en.greatfire.org/facebook.com and https://zh.greatfire.org/keyword/freenet. From a search engines perspective, all these pages look very similar. For this reason we have implemented a noindex function based on certain rules. Basically, only highly ranked websites are allowed to be indexed - all other URLs are tagged as noindex (for example https://en.greatfire.org/www.imdb.com). However, we are not sure that this is a good strategy and so are asking - what should a website with a lot of similar content do? Don't noindex anything - let Google decide what's worth indexing and not. Noindex most content, but allow some popular pages to be indexed. This is our current approach. If you recommend this one, we would like to know what we can do to improve it. Noindex all the similar content. In our case, only let overview pages, blog posts etc with unique content to be indexed. Another factor in our case is that our website is multilingual. All pages are available (and equally indexed) in Chinese and English. Should that affect our strategy?References:https://zh.greatfire.orghttps://en.greatfire.orghttps://www.google.com/search?q=site%3Agreatfire.org
Intermediate & Advanced SEO | | GreatFire.org0 -
Is 404'ing a page enough to remove it from Google's index?
We set some pages to 404 status about 7 months ago, but they are still showing in Google's index (as 404's). Is there anything else I need to do to remove these?
Intermediate & Advanced SEO | | nicole.healthline0 -
Google replacing subpages in index with home page?
Hi! I run a backlink building company. Recently, we had a customer who had us build targeted backlinks to certain subpages on his site. Then something really bizarre happened...all of a sudden, their subpages that were indexed in Google (the ones we were building links to) disappeared from the index, to be replaced with their home page. They haven't lost their rank, per se--it's just now their home page instead of their subpages. At this point, we are tracking literally thousands of keywords for our link building customers, and we've never run into this issue before. Have you ever run into it? If so, what's the best way to handle it from an SEO company perspective? They have a sitemap.xml and their GWT account reports no crawl errors, so it doesn't seem to be a site issue.
Intermediate & Advanced SEO | | ownlocal0