Noindex search pages?

YairSpolter

Is it best to noindex search results pages, exclude them using robots.txt, or both?

DougRoberts

I think you're possibly trying to solve a problem that you don't have!

As long as you've got a good information architecture and submitting a dynamically updated sitemap then I don't think you need to worry about this. If you're got a blog, then sharing those on Google+ can be a good way to get them quickly indexed.

YairSpolter

Our search results are not appearing in Google's index and we are not having any issues with getting our content discovered, so I really don't mind disallowing search pages and noindexing them. I was just wondering what advantage there is to disallowing and what I would lose if I only noindex. Isn't it better to allow many avenues of content discovery for the bots?

DougRoberts

Don't worry. I'm not saying that in your case it'll be a "spider trap". Where I have seen it cause problems was on a site search result page that included a "related searches" and a bunch of technical issues.

Are your search results appearing in Google's index?

If you have a valid reason for allowing spiders to crawl this content then yes. you'll want to just noindex them. Personally I would challenge why you want to do this - is there a bigger problem trying to get search engines to discover new content on your site?

YairSpolter

Thanks for the response, Doug.

The truth is that it's unlikely that the spiders will find the search results, but if they do why should I consider it a "spider trap"? Even though I don't want the search results pages indexed, I do want the spiders crawling this content. That's why I'm wondering if it's better to just noindex and not disallow in robots.txt?

DougRoberts

Using the noindex directive will (should) prevent search engines from including the content in their search results - which is good but it still means that the search engines are crawling this content. I've seen one (unlikely) instance where trying to crawl search pages created a bit of a spider trap[, wasting "crawl budget".

So the simplest approach is usually to use the robots.txt to disallow access to the search pages.

If you've got search results in the index already, then you'll want to think about continuing to let Google crawl the pages for a while and using the noindex to help get them de-indexed.

Once this has been done, then you can disallow the site search results in your robots.txt.

Another thing to consider is how the search spiders are finding your search results in the first place...

Er_Maqui

I think it's better to use the robots. With that, you doesn't have problem if someone links to your page.

For better security you can add a meta for this question.

But, as always, it's the spider option to relay on robots, links or metas. If your page it's private, make it private really and put it below a validation system. If you doesn't do it, some "bad" spiders can read and cache your content.

GPainter

No index and blocking robots pretty much do the same thing but you shouldn't only do this if you don't want pages to be not indexed, for more secure areas of the site I would block robots too.

If its to avoid duplicate content don't forget you can use the rel=canonical tag.

Welcome to the Q&A Forum

Browse the forum for helpful insights and fresh discussions about all things SEO.

Noindex search pages?

Got a burning SEO question?

Browse Questions

Explore more categories

Related Questions

Which is the best option for these pages?

Will Reducing Number of Low Page Authority Page Increase Domain Authority?

How do we decide which pages to index/de-index? Help for a 250k page site

ECommerce search results to noindex?

Indexing of internal search results: canonicalization or noindex?

Tips for improving this page

Still Going Down In Search

What to do with WordPress generated pages?