Robots.txt vs. meta noindex, follow
-
Hi guys,
I wonder what your opinion is concerning exclusion via the robots.txt file.
Do you advise to keep using this? For example:
User-agent: *
Disallow: /sale/*
Disallow: /cart/*
Disallow: /search/
Disallow: /account/
Disallow: /wishlist/*
Or do you prefer using the meta tag 'noindex, follow' instead?
I keep hearing different suggestions.
I'm just curious what your opinion / suggestion is.
Regards,
Tom Vledder
-
Hi Tom
I agree with Martijn that it depends. For example, robots.txt is generally the first port of call for bots, as it tells them where you want them to spend their finite time crawling your site. You can also give directions to all bots at once or to a specific subset. It is generally the best option for blocking pages such as your /cart/ etc., where they don't need crawling.
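To illustrate the "all bots at once or a subset" point, here is a minimal sketch using Python's standard-library urllib.robotparser. The rules and the crawler name "ExampleBot" are made up for the example and are not taken from Tom's site; the trailing * from the original rules is left out because Disallow rules already match by path prefix.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: one group for every bot, plus a stricter group for a
# single made-up crawler called "ExampleBot".
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
Disallow: /account/

User-agent: ExampleBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if user_agent may crawl path under the given robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for agent in ("SomeOtherBot", "ExampleBot"):
        for path in ("/products/shoes", "/cart/checkout"):
            print(agent, path, is_allowed(ROBOTS_TXT, agent, path))
```

A bot with its own group follows that group; every other bot falls back to the User-agent: * group.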
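The problem with robots.txt is that it doesn't always keep pages from being indexed, especially if there are external sources linking to the pages in question. The rule only stops crawling, so a linked URL can still end up in the index without the bot ever reading the page.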
The meta tag noindex, on the other hand, can be applied to individual pages, and you are actually commanding the robots NOT to index the relevant page in the SERPs. Use this option if you have pages you don't want appearing in Google (or other search engines) but that may still be relevant for authority or able to acquire links. In that case make sure to use "noindex, follow", as you still want the robots to crawl the page and follow its links. Otherwise use "noindex, nofollow". Hope this helps.
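For anyone who wants to check which directives a page actually carries, here is a rough sketch that reads the robots meta tag with Python's built-in html.parser. The class and function names are my own, and it only looks at the generic name="robots" tag, not bot-specific tags or the X-Robots-Tag HTTP header.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        for part in attr_map.get("content", "").split(","):
            directive = part.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html: str) -> set[str]:
    """Return the set of robots meta directives found in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
    print(robots_directives(page))
```

Keep in mind that a bot can only obey the tag if it is allowed to fetch the page, so a URL that is also disallowed in robots.txt will never have its noindex read.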
-
Hi Tom,
It depends. For /sale/ I would make an exception, as those could be sales pages. But for the other pages I wouldn't want a search engine to waste any crawl budget by looking at them to begin with. That's why I would go with a robots.txt implementation there instead of META robots, because with META robots they'll still visit the page just to figure out that they shouldn't index it.
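A quick way to sanity-check that kind of setup before deploying it is sketched below, again with urllib.robotparser. The robots.txt content and the sample paths are assumptions for illustration: /sale/ is left crawlable and the other sections from Tom's list are disallowed. The rules are written without the trailing *, since Python's parser does plain prefix matching and, as far as I know, does not expand wildcards inside paths.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: /sale/ stays crawlable, the utility sections from
# the original list are blocked by path prefix.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
Disallow: /search/
Disallow: /account/
Disallow: /wishlist/
"""


def crawlable_paths(robots_txt: str, paths: list[str], user_agent: str = "*") -> dict[str, bool]:
    """Map each path to True if user_agent is allowed to crawl it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {path: parser.can_fetch(user_agent, path) for path in paths}


if __name__ == "__main__":
    sample_paths = ["/sale/summer-shoes", "/cart/checkout", "/account/orders"]
    for path, allowed in crawlable_paths(ROBOTS_TXT, sample_paths).items():
        print(f"{path}: {'crawl' if allowed else 'blocked'}")
```

With these rules the /sale/ pages remain open to crawling, so a meta robots tag on them would still be seen if you decide to add one later.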