Best practice for removing indexed internal search pages from Google?
-
Hi Mozzers
I know that it’s best practice to block Google from indexing internal search pages, but what’s best practice when “the damage is done”?
I have a project where a substantial part of our visitors and income lands on an internal search page, because Google has indexed them (about 3 %).
I would like to block Google from indexing the search pages via the meta noindex,follow tag because:
- Google Guidelines: “Use robots.txt to prevent crawling of search results pages or other auto-generated pages that don't add much value for users coming from search engines.” http://support.google.com/webmasters/bin/answer.py?hl=en&answer=35769
- Bad user experience
- The search pages are (probably) stealing rankings from our real landing pages
- Webmaster Notification: “Googlebot found an extremely high number of URLs on your site” with links to our internal search results
I want to use the meta tag to keep the link juice flowing. Do you recommend using the robots.txt instead? If yes, why?
Should we just go dark on the internal search pages, or how shall we proceed with blocking them?
I’m looking forward to your answer!
Edit: Google have currently indexed several million of our internal search pages.
-
Hello,
Sorry for the late answer, I have the same problem and I think I found the solution. For me works this:
1. Add meta tag robots No Index , Follow for the internal search pages and wait for Google remove it from the index.
Be careful if you do **BOTH (**Adding meta tag robots and Disallow in Robots.txt ) Because of this:
Please note that if you do both: block the search engines in robots.txt and via the meta tags, then the robots.txt command is the primary driver, as they may not crawl the page to see the meta tags, so the URL may still appear in the search results listed URL-only. Souce: http://tools.seobook.com/robots-txt/
I hope this information can help you.
-
I would honestly exclude all your internal search pages from the Google index via robots.txt (noindex) exclusion. This will at least re-distribute crawl-time to other areas of your site.
Just having the noindex,follow in the meta-tag (without the robots.txt exclusion) will let GoogleBot crawl the page and then eventually remove it from the index.
I would also change your search-page canoncial to the search term (i.e. /search/iphone) and then have a noindex,follow on meta-tag.
-
It sounds like the meta noindex,follow tag is what you want.
robots.txt will block googlebot from crawling your search pages, but Google can still keep the search pages in its index.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Google Indexing our site
We have 700 city pages on our site. We submitted to google via a https://www.samhillbands.com/sitemaps/locations.xml but they only indexed 15 so far. Yes the content is similar on all of the pages...thought on getting them to index the remaining pages?
Intermediate & Advanced SEO | | brianvest0 -
Some sitemap xml apprears in google search
some sitemap, i have observed, that google is showing in the result for our website.. wht is wrong? any idea?
Intermediate & Advanced SEO | | Rahim1190 -
What are the best practices for microdata?
Not too long ago, Dublin Core was all the rage. Then Open Graph data exploded, and Schema seems to be highly regarded. In a best-case scenario, on a site that's already got the basics like good content, clean URLs, rich and useful page titles and meta descriptions, well-named and alt-tagged images and document outlines, what are today's best practices for microdata? Should Open Graph information be added? Should the old Dublin Core be resurrected? I'm trying to find a way to keep markup light and minimal, but include enough microdata for crawlers to get a better sense of the content and its relationships to other subdomains and sites.
Intermediate & Advanced SEO | | WebElaine0 -
Google is indexing the wrong pages
I have been having problems with Google indexing my website since mid May. I haven't made any changes to my website which is wordpress. I have a page with the title 'Peterborough Cathedral wedding', I search Google for 'wedding Peteborough Cathedral', this is not a competitive search phrase and I'd expect to find my blog post on page one. Instead, half way down page 4 I find Google has indexed www.weddingphotojournalist.co.uk/blog with the title 'wedding photojournalist | Portfolio', what google has indexed is a link to the blog post and not the blog post itself. I repeated this for several other blog posts and keywords and found similar results, most of which don't make any sense at all - A search for 'Menorca wedding photography' used to bring up one of my posts at the top of page one. Now it brings up a post titled 'La Mare wedding photography Jersey" which happens to have a link to the Menorca post at the bottom of the page. A search for 'Broadoaks country house weddng photography' brings up 'weddingphotojournalist | portfolio' which has a link to the Broadoaks post. a search for 'Blake Hall wedding photography' does exactly the same. In this case Google is linking to www.weddingphotojournalist.blog again, this is a page of recent blog posts. Could this be a problem with my sitemap? Or the Yoast SEO plugin? or a problem with my wordpress theme? Or is Google just a bit confused?
Intermediate & Advanced SEO | | weddingphotojournalist0 -
Indexing of internal search results: canonicalization or noindex?
Hi Mozzers, First time poster here, enjoying the site and the tools very much. I'm doing SEO for a fairly big ecommerce brand and an issue regarding internal search results has come up. www.example.com/electronics/iphone/5s/ gives an overview of the the model-specific listings. For certain models there are also color listings, but these are not incorporated in the URL structure. Here's what Rand has to say in Inbound Marketing & SEO: Insights From The Moz Blog Search filters are used to narrow an internal search—it could be price, color, features, etc.
Intermediate & Advanced SEO | | ClassicDriver
Filters are very common on e-commerce sites that sell a wide variety of products. Search filter
URLs look a lot like search sorts, in many cases:
www.example.com/search.php?category=laptop
www.example.com/search.php?category=laptop?price=1000
The solution here is similar to the preceding one—don’t index the filters. As long as Google
has a clear path to products, indexing every variant usually causes more harm than good. I believe using a noindex tag is meant here. Let's say you want to point users to an overview of listings for black 5s iphones. The URL is an internal search filter which looks as follows: www.example.com/electronics/apple/iphone/5s?search=black Which you wish to link with the anchor text "black iphone 5s". Correct me if I'm wrong, but if you no-index the black 5s search filters, you lose the equity passed through the link. Whereas if you canonicalize /electronics/apple/iphone/5s you would still leverage the link juice and help you rank for "black iphone 5s". Doesn't it then make more sense to use canonicalization?0 -
Why is this page not being delivered for Google search result?
Hey folks, Figured I would try to get an experts insight on this. On google search result for BLACK TITANIUM RINGS + TITANIUM-JEWELRY.COM the page that I "think" should show up is this one: http://www.titanium-jewelry.com/black-titanium-rings.html However, it does not. Imho, this page is highly relevant. I used Rank Tracker here on seomoz.org and the page is not even in top 50 of search engine results for google. Our 'About Black Titanium Rings' page ranks #2 (http://www.titanium-jewelry.com/about-black-titanium.html) but the /black-titanium-rings.html page doesn't even rank. Any suggestions on what I could look at to figure out why this page is being penalized? We are not under a manual penalty (anymore!). Thanks! Ron
Intermediate & Advanced SEO | | yatesandcojewelers0 -
Google Generating its Own Page Titles
Hi There I have a question regarding Google generating its own page titles for some of the pages on my website. I know that Google sometimes takes your H1 tag and uses it as a page title, however, can anyone tell me how I can stop this from happening? Is there a meta tag I can use, for example like the NOODP tag? Or do I have to change my page title? Thanks Sadie
Intermediate & Advanced SEO | | dancape0 -
To index search results or not?
In its webmaster guidelines, Google says not to index search results " that don't add much value for users coming from search engines." I've noticed several big brands index search results, and am wondering if it is generally OK to index search results with high engagement metrics (high PVPV, time on site, etc). We have an database of content, and it seems one of the best ways to get this content in search engines would be to allow indexing of search results (to capture the long tail) rather than build thousands of static URLs. Have any smaller brands had success with allowing indexing of search results? Any best practices or recommendations?
Intermediate & Advanced SEO | | nicole.healthline0