Best practices for robots.txt -- allow one page but not the others?
-
We have a page, domain.com/searchhere, but its search results are being crawled (and shouldn't be). The result URLs look like domain.com/searchhere?query1. If I block /searchhere?, will that also stop crawlers from crawling the single page /searchhere? (I still want that page to be indexed.)
What is the recommended best practice for this?
-
SEOmoz used to use Google Search for the site. I am confident Google has a solid method for keeping their own results clean.
It appears SEOmoz recently changed their search widget. If you examine the URL you shared, notice none of the search results actually appear in the HTML of the page. For example, load the view-source URL and perform a find (CTRL+F) for "testing" which is the subject of the search. There are no results. Since the results are not in the page's HTML, they would not get indexed.
-
If Google is viewing the search result pages as soft 404s, then yes, adding the noindex tag should resolve the problem.
-
Also, because Google can currently crawl these search result pages, a number of soft 404 pages are popping up. Would adding a noindex tag to these pages fix the issue?
-
Thanks for the links and help.
How does SEOmoz keep search results from being indexed? They don't block search results with robots.txt, and it doesn't appear that they add the noindex tag to the search result pages (ex: view-source:http://www.seomoz.org/pages/search_results#stq=testing&stp=1).
-
Yeah, but Ryan's answer is the best one if you can go that route.
-
Hi Michelle,
The concept of crawl efficiency is highly misunderstood. Are all your site's pages being indexed? Is new content or changes indexed in a timely manner? If so, that would indicate your site is being crawled efficiently.
Regarding the link you shared, you are on the right track but need to dig a bit deeper. On the page you shared, find the discussion related to robots.txt. There is a link which will lead you to the following page:
https://developers.google.com/webmasters/control-crawl-index/docs/faq#h01
There you will find a more detailed explanation along with several examples of when not to use robots.txt.
robots.txt: Use it if crawling of your content is causing issues on your server. For example, you may want to disallow crawling of infinite calendar scripts. You should not use the robots.txt to block private content (use server-side authentication instead), or handle canonicalization (see our Help Center). If you must be certain that a URL is not indexed, use the robots meta tag or X-Robots-Tag HTTP header instead.
SEOmoz offers a great guide on this topic as well: http://www.seomoz.org/learn-seo/robotstxt
If you desire to go beyond the basic Google and SEOmoz explanation and learn more about this topic, my favorite article related to robots.txt, written by Lindsay, can be found here: http://www.seomoz.org/blog/serious-robotstxt-misuse-high-impact-solutions
-
Hi Ryan,
Wouldn't that cause issues with crawl efficiency?
Also, webmaster guidelines say "Use robots.txt to prevent crawling of search results pages or other auto-generated pages that don't add much value for users coming from search engines."
-
Thank you. Are you sure about that?
-
What about using the "Canonical URL" tag?
You can put this code:
in
/searchhere?page.
-
The best practice would be to add the noindex tag to the search result pages but not the /searchhere page.
Typically speaking, the best robots.txt file is a blank one. The file should only be used as a last resort with respect to blocking content.
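To make the noindex suggestion concrete, here is a minimal sketch in Python. It assumes the search page lives at /searchhere and that any URL on that path with a query string is a results page; the function name `robots_meta_for` and the sample URLs are made up for illustration, and your templates may decide this differently.

```python
from urllib.parse import urlsplit

# Standard robots meta tag asking search engines not to index the page.
NOINDEX_META = '<meta name="robots" content="noindex">'


def robots_meta_for(url: str) -> str:
    """Return the noindex tag for search result URLs, '' otherwise.

    Assumption: the search page is /searchhere and result pages are the
    same path with a non-empty query string.
    """
    parts = urlsplit(url)
    is_search_page = parts.path.rstrip("/") == "/searchhere"
    has_query = bool(parts.query)
    return NOINDEX_META if is_search_page and has_query else ""


if __name__ == "__main__":
    for url in (
        "http://domain.com/searchhere",
        "http://domain.com/searchhere?query1",
    ):
        print(url, "->", robots_meta_for(url) or "(no robots meta tag)")
```

The returned string would go in the page's head section. Note that crawlers can only see the tag if they are allowed to crawl the page, so it should not be combined with a robots.txt block on the same URLs.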
-
What you outlined sounds to me like it should work. Disallowing /searchhere? shouldn't disallow the top-level search page at /searchhere, but should disallow all the search result pages with queries after the ?.
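For anyone who wants to check the reasoning, here is a rough Python sketch of plain prefix matching, which is how Disallow rules are generally compared against the path plus query string. It is a simplification under stated assumptions: no wildcards, no Allow rules, no user-agent groups. The helper names `crawl_target` and `is_blocked` are hypothetical, and real crawlers may differ in the details.

```python
from urllib.parse import urlsplit

# The single rule discussed above, as it would follow "Disallow:" in robots.txt.
DISALLOW_RULES = ["/searchhere?"]


def crawl_target(url: str) -> str:
    """Return the path plus query string that the rules are compared against."""
    without_fragment = url.split("#", 1)[0]
    parts = urlsplit(without_fragment)
    target = parts.path or "/"
    if "?" in without_fragment:
        target += "?" + parts.query
    return target


def is_blocked(url: str, rules=DISALLOW_RULES) -> bool:
    """Simplified check: a URL is blocked if its target starts with any rule."""
    target = crawl_target(url)
    return any(target.startswith(rule) for rule in rules if rule)


if __name__ == "__main__":
    for url in (
        "http://domain.com/searchhere",
        "http://domain.com/searchhere?query1",
    ):
        print(url, "->", "blocked" if is_blocked(url) else "allowed")
```

Under this simplified model, /searchhere stays crawlable because it does not start with "/searchhere?", while every URL with a query string after it is blocked.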