Does using robots.txt to block pages decrease search traffic?
-
I know you can use robots.txt to tell search engines not to spend their resources crawling certain pages.
So, if you have a section of your website with good content that never changes, and you want the search engines to index new content faster, would it work to block the good, unchanged content with robots.txt? Would this content lose any search traffic if it were blocked by robots.txt? Does anyone have any available case studies?
-
If you block the pages from being crawled, the search engines can no longer see their content, so over time those pages will drop out of the index (at best a bare URL listing remains, with no snippet, since they don't want to rank something they haven't looked at). So yes, the traffic numbers from organic search will change if you block the pages in robots.txt.
-
Agreed, that is a better solution, but I am still wondering: if you block something with robots.txt, will that lead to a decrease in traffic? What if we have some duplicate content that is highly trafficked? If we block it with robots.txt, will the traffic numbers change?
-
You certainly don't want to block this content!
One thing I'd consider is the If-Modified-Since header, or other caching headers. Here are two articles that explain more about the concept of using headers to tell the search engines "this hasn't changed, don't bother crawling it". I haven't personally used this, but I have read about it in many places.
http://www.feedthebot.com/ifmodified.html
http://searchengineland.com/how-to-improve-crawl-efficiency-with-cache-control-headers-88824
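To sketch the idea behind those articles: a server that honors If-Modified-Since compares the date the client (or crawler) sends against the page's actual last-modified date, and answers 304 Not Modified when nothing has changed, so the crawler never re-downloads the body. A minimal illustration; the helper name and dates are my own, not from the articles:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Hypothetical helper (names are mine): decide whether a conditional GET
# can be answered with 304 Not Modified instead of a full 200 response.
def conditional_status(if_modified_since, page_last_modified):
    """Return 304 if the client's cached copy is still current, else 200."""
    if if_modified_since is None:
        return 200  # no conditional header: serve the full page
    try:
        cached = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable header: just serve the full response
    return 304 if page_last_modified <= cached else 200

last_mod = datetime(2012, 1, 1, tzinfo=timezone.utc)
print(conditional_status("Sun, 01 Jan 2012 00:00:00 GMT", last_mod))  # 304
print(conditional_status("Sat, 31 Dec 2011 00:00:00 GMT", last_mod))  # 200
```

The page stays fully crawlable and indexable; the server just stops paying to re-send content the crawler already has.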
Related Questions
-
How to compete against search terms that use geo-modifiers?
I should start by saying we are new to SEO. We are introducing new “cycling tours” in new destinations and are looking for a strategy to combat geo-modified keyword searches. When people search for “cycling tours” they will anchor their search with a geo-modifier such as “cycling tours France” or “cycling tours Italy”. Based in Australia, we are keen to communicate to Australians searching for international cycling tours that there are new Australian options they may wish to consider. The geo-modifiers required to find our tours (“eyre peninsula” and “carnarvon gorge”) are currently not on the cycling community's radar. For example, to find one of our new tours you need to search “cycling tours eyre peninsula” or “cycling tours carnarvon gorge”. Currently the only way we have found to let people know about our new tours is word of mouth. Is there an SEO solution?
Intermediate & Advanced SEO | Chook10
-
301 redirect to search results page?
Hi - we just launched our redesigned website. On the previous site, we had multiple .html pages that contained links to supporting pdf documentation. On this new site, we no longer have those .html landing pages containing the links. The question came up, should we do a search on our site to gather a single link that contains all pdf links from the previous site, and set up a redirect? It's my understanding that you wouldn't want google to index a search results page on your website. Example: old site had the link http://www.oldsite.com/technical-documents.html new site, to see those same links would be like: http://www.newsite.com/resources/search?View+Results=&f[]=categories%3A196
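For what it's worth, redirecting to an internal search results URL is usually avoided precisely because those pages shouldn't be indexed; the more common pattern is a 301 from each retired page to the closest static equivalent on the new site. A sketch in Apache syntax, using the hypothetical paths from the question:

```apache
# Hypothetical .htaccess rule: send the retired landing page to the
# nearest static section of the new site, not to a search results URL.
Redirect 301 /technical-documents.html http://www.newsite.com/resources/
```

If there is no static equivalent, a dedicated plain resources page listing the PDF links would preserve more link equity than a parameterized search URL.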
Intermediate & Advanced SEO | Jenny10
-
Should I disallow my subfolder country sections via robots.txt?
Hello, my website is in English by default, with Spanish as a subfolder. Because of my Joomla platform, Google is listing hundreds of soft-404 links for French, Chinese, German, etc. subfolders. Again, I never created these country subfolder URLs, but Google is crawling them. Is it best to just "Disallow" these subfolders like the example below, then "mark as fixed" in the crawl errors section in Google Webmaster Tools?

User-agent: *
Disallow: /de/
Disallow: /fr/
Disallow: /cn/

Thank you, Shawn
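Before deploying rules like these, you can sanity-check them with Python's standard-library robots.txt parser (the example.com URLs below are placeholders):

```python
import urllib.robotparser

# Check the proposed rules with Python's stdlib parser before deploying.
rules = """\
User-agent: *
Disallow: /de/
Disallow: /fr/
Disallow: /cn/
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# The phantom subfolders are blocked; the real Spanish section is not.
print(rp.can_fetch("*", "http://example.com/de/page.html"))  # False
print(rp.can_fetch("*", "http://example.com/es/page.html"))  # True
```

Note that disallowing crawling stops Googlebot from fetching the soft-404 URLs, but "mark as fixed" only clears the report; the errors can reappear if the URLs are still linked somewhere.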
Intermediate & Advanced SEO | Shawn1240
-
To index or de-index internal search results pages?
Hi there. My client uses a CMS/e-commerce platform that is automatically set up to index every single internal search results page on search engines. This was supposedly built as an "SEO-friendly" feature, in the sense that it creates hundreds of new indexed pages reflecting the various terminology used by existing visitors of the site. In many cases these pages have proven to outperform our optimized static pages, but there are multiple issues with them:
The CMS does not allow us to add any static content to these pages, including titles, headers, metas, or copy on the page.
The query typed in by the site visitor always becomes part of the title tag / meta description on Google. If the customer's internal search query contains any less-than-ideal terminology that we wouldn't want other users to see, their phrasing is out there for the whole world to see, leaving lots and lots of ugly terminology floating around on Google that we can't affect.
I am scared to do a blanket de-indexation of all /search/ results pages because we would lose the majority of our rankings and traffic in the short term while trying to improve the ranks of our optimized static pages. The ideal is to move our static pages up in Google's index and, when their performance is strong enough, to de-index all of the internal search results pages - but for some reason Google keeps choosing the internal search results page as the "better" page to rank for our targeted keywords. Can anyone advise? Has anyone been in a similar situation? Thanks!
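One gradual option, offered as a suggestion rather than something the poster has tried: a robots meta tag on the search result templates instead of a robots.txt disallow. Unlike a crawl block, it lets Google keep crawling the pages and following their links to the static pages while dropping the search pages themselves from the index:

```html
<!-- Emitted only on /search/ result templates: drop the page from the
     index but keep following its links through to the static pages. -->
<meta name="robots" content="noindex, follow">
```

This can be rolled out to the worst-performing search pages first to limit the short-term traffic hit the poster is worried about.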
Intermediate & Advanced SEO | FPD_NYC0
-
Keyword Research: How best to target keywords without using a region as part of the search query.
When doing keyword research and trying to rank for a keyword, I am wondering if we need to localize the query by adding a city to it. For example, "Phoenix web design" vs. just targeting "web design", since Google is localizing search results now. Then, when creating content and optimizing the site, do we just put the keyword in the title and page content, or do we also add the region/city to the keyword phrase? Any insight would be appreciated.
Intermediate & Advanced SEO | hireawizseo0
-
If I disallow an unfriendly URL via robots.txt, will its friendly counterpart still be indexed?
Our not-so-lovely CMS loves to render pages regardless of the URL structure, just as long as the page name itself is correct. For example, it will render the following as the same page:
example.com/123.html
example.com/dumb/123.html
example.com/really/dumb/duplicative/URL/123.html
To help combat this, we are creating mod_rewrite rules with friendly URLs, so all of the above would simply render as example.com/123. I understand robots.txt respects the wildcard (*), so I was considering adding this to our robots.txt:

Disallow: */123.html

If I move forward, will this block all of the potential permutations of the directories preceding 123.html, yet not block our friendly example.com/123? Oh, and yes, we do use the canonical tag religiously - we're just mucking with the robots.txt as an added safety net.
Intermediate & Advanced SEO | mrwestern0
-
Block search bots on staging server
I want to block bots from all of our client sites on our staging server. Since robots.txt files can easily be copied over when moving a site to production, how can I block bots/crawlers from our staging server (at the server level) but still allow our clients to see/preview their sites before launch?
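One common server-level approach, sketched here assuming nginx (the vhost name and file paths are illustrative): password-protect the staging vhost, since crawlers can't authenticate but clients given the credentials can, and send a noindex header as a belt-and-braces safety net. Nothing here lives in the site files, so nothing gets copied to production.

```nginx
# Staging vhost only - not part of any client site's codebase.
server {
    server_name staging.example.com;

    # Crawlers can't log in; clients with the shared password can.
    auth_basic           "Staging";
    auth_basic_user_file /etc/nginx/.htpasswd;

    # Safety net: even if something leaks, ask engines not to index it.
    add_header X-Robots-Tag "noindex, nofollow" always;
}
```

The equivalent in Apache would be Require valid-user plus a Header set X-Robots-Tag directive in the staging virtual host.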
Intermediate & Advanced SEO | BlueView13010
-
Different pages ranking for search terms, often irrelevant.
Website: www.templatemonster.com
Intermediate & Advanced SEO | templatemonster
Problem: positions dropped, pages which were ranking previously disappeared from the top 100, and now different - often completely irrelevant - pages are ranking. Examples:
Search term: Joomla Templates
Previous Position: 8
Current Position: 35
Previously Ranked Page: http://www.templatemonster.com/joomla-templates.php
Currently Ranked Page: http://www.templatemonster.com/logo-templates.php
Similar situation with the following search terms: virtuemart templates, virtuemart themes, prestashop templates, prestashop themes, magento themes, zencart templates, zencart themes, zen cart templates, zen cart themes
When: according to Google Analytics (a drop in visitor stats), this happened on July 2nd
Preconditions: we had 45 minutes of downtime on July 2nd - but could those 45 minutes have had such disastrous results?
No redirects or canonical URL were used which could lead to such change of ranking page.
No changes in the site's informational structure and design.
In Webmaster Tools (inbound links report) we saw a website yesterday which had over 800,000 links pointing to our domain - http://moviebestwatch.com/ - and today this site is NOT found in the Webmaster Tools report! Also, the site is down, the domain is quite new (how could it possibly have developed 800,000 pages in such a short time?), and its whois is privacy protected. Is this some dirty trick from competitors, and could it have influenced our positions?
Still, what I completely fail to understand is how a page like http://www.templatemonster.com/logo-templates.php could be the top-ranking page for 'Joomla templates' when there is:
not a single mention of the word 'Joomla' on the page (or in the source code), i.e. the page is completely irrelevant to the search term
not a single link with 'Joomla templates' anchor text pointing to that page, neither external nor internal
PS. No similar changes noticed in other search engines. Also, the pages in question were re-spidered July 4th and the cache shows the right pages, i.e. it is not that Googlebot has seen the logo templates page instead of the Joomla templates page. I checked every possible reason I could think of (see "Preconditions") but still have no clue what is going on.