Robots.txt vs. meta noindex, follow
-
Hi guys,
I wonder what your opinion is on exclusion via the robots.txt file.
Do you advise to keep using this? For example:
User-agent: *
Disallow: /sale/*
Disallow: /cart/*
Disallow: /search/
Disallow: /account/
Disallow: /wishlist/*
Or do you prefer using the meta tag 'noindex, follow' instead?
I keep hearing different suggestions.
I'm just curious what your opinion / suggestion is.
Regards,
Tom Vledder
-
Hi Tom
Agree with Martijn that it depends. For example, robots.txt is generally the first port of call for bots, as it tells them where you want them to spend their finite time crawling your site. You can also give direction to all bots at once or specify a subset. It is generally the best option for blocking pages such as your /cart/ etc. where they don't need crawling.
The problem with robots.txt is that it doesn't always keep pages from being indexed, especially if there are external sources linking to the pages in question.
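To make the matching concrete, here is a minimal Python sketch of how the Disallow rules from Tom's question could be evaluated against a URL path. It assumes the common wildcard convention ('*' matches any run of characters, a trailing '$' pins the end of the URL); the helper names rule_to_regex and is_disallowed are made up for illustration, and real crawlers also handle Allow rules and rule precedence, which this leaves out.

```python
import re

# Disallow rules copied from the question above.
DISALLOW = ["/sale/*", "/cart/*", "/search/", "/account/", "/wishlist/*"]


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate one robots.txt path rule into an anchored regex.

    '*' matches any run of characters and a trailing '$' pins the end of
    the URL; everything else is matched literally from the start of the path.
    """
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile("^" + pattern + ("$" if anchored else ""))


def is_disallowed(path: str, rules=DISALLOW) -> bool:
    """Return True if any Disallow rule matches the start of the path."""
    return any(rule_to_regex(rule).match(path) for rule in rules)


if __name__ == "__main__":
    for path in ["/cart/checkout", "/search/?q=shoes", "/products/shoes"]:
        print(path, "blocked" if is_disallowed(path) else "crawlable")
```

Since rules already match as prefixes, the trailing '*' in /sale/* and /cart/* should not change anything here compared with /sale/ and /cart/. And as noted above, a match only means "don't crawl"; it says nothing about whether the URL can still show up in the index.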
The meta tag noindex, on the other hand, can be applied to individual pages, and with it you are actually instructing the robots NOT to index the relevant page in the SERPs. Use this option if you have pages you don't want appearing in Google (or other search engines) but that may still be relevant for authority or able to acquire links. Make sure to use 'noindex, follow' in that case, as you still want the robots to crawl the page and follow its links. Otherwise use 'noindex, nofollow'. Hope this helps.
-
Hi Tom,
It depends. For /sale/ I would make an exception, since those could be sales pages. But for the other pages I wouldn't want a search engine to waste any crawl budget looking at them in the first place. That's why I would go with a robots.txt implementation there instead of META robots: with META robots, they'll still visit the page just to figure out that they shouldn't index it.
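The ordering described here can be sketched in a few lines of Python. This is only a toy model of a polite crawler, not how any particular search engine is implemented: crawl_outcome and MetaRobotsParser are hypothetical names, and robots.txt matching is reduced to a plain prefix check to keep it short.

```python
from html.parser import HTMLParser


class MetaRobotsParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for token in attr.get("content", "").split(","):
                if token.strip():
                    self.directives.add(token.strip().lower())


def crawl_outcome(path, html, disallowed_prefixes):
    """Hypothetical model of a polite crawler's decision for one URL."""
    # Step 1: robots.txt is consulted before any request for the page.
    if any(path.startswith(prefix) for prefix in disallowed_prefixes):
        return "not fetched (meta robots never seen)"
    # Step 2: only a fetched page can reveal its meta robots tag.
    parser = MetaRobotsParser()
    parser.feed(html)
    if "noindex" in parser.directives:
        follow = "nofollow" not in parser.directives
        return "fetched, not indexed, links " + ("followed" if follow else "not followed")
    return "fetched, eligible for indexing"


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
    print(crawl_outcome("/account/", page, ["/account/"]))
    print(crawl_outcome("/sale/shoes", page, ["/account/"]))
```

The first call never gets as far as reading the tag, which is the crawl budget saving; the second spends a fetch to learn the page should stay out of the index. It also shows why putting both on the same URL is pointless: a blocked page's noindex is never read.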