Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Best practice for removing indexed internal search pages from Google?
-
Hi Mozzers
I know that it’s best practice to block Google from indexing internal search pages, but what’s best practice when “the damage is done”?
I have a project where a substantial part of our visitors and income lands on an internal search page, because Google has indexed them (about 3 %).
I would like to block Google from indexing the search pages via the meta noindex,follow tag because:
- Google Guidelines: “Use robots.txt to prevent crawling of search results pages or other auto-generated pages that don't add much value for users coming from search engines.” http://support.google.com/webmasters/bin/answer.py?hl=en&answer=35769
- Bad user experience
- The search pages are (probably) stealing rankings from our real landing pages
- Webmaster Notification: “Googlebot found an extremely high number of URLs on your site” with links to our internal search results
I want to use the meta tag to keep the link juice flowing. Do you recommend using the robots.txt instead? If yes, why?
Should we just go dark on the internal search pages, or how shall we proceed with blocking them?
I’m looking forward to your answer!
Edit: Google have currently indexed several million of our internal search pages.
-
Hello,
Sorry for the late answer, I have the same problem and I think I found the solution. For me works this:
1. Add meta tag robots No Index , Follow for the internal search pages and wait for Google remove it from the index.
Be careful if you do **BOTH (**Adding meta tag robots and Disallow in Robots.txt ) Because of this:
Please note that if you do both: block the search engines in robots.txt and via the meta tags, then the robots.txt command is the primary driver, as they may not crawl the page to see the meta tags, so the URL may still appear in the search results listed URL-only. Souce: http://tools.seobook.com/robots-txt/
I hope this information can help you.
-
I would honestly exclude all your internal search pages from the Google index via robots.txt (noindex) exclusion. This will at least re-distribute crawl-time to other areas of your site.
Just having the noindex,follow in the meta-tag (without the robots.txt exclusion) will let GoogleBot crawl the page and then eventually remove it from the index.
I would also change your search-page canoncial to the search term (i.e. /search/iphone) and then have a noindex,follow on meta-tag.
-
It sounds like the meta noindex,follow tag is what you want.
robots.txt will block googlebot from crawling your search pages, but Google can still keep the search pages in its index.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Best Practices for Title Tags for Product Listing Page
My industry is commercial real estate in New York City. Our site has 300 real estate listings. The format we have been using for Title Tags are below. This probably disastrous from an SEO perspective. Using number is a total waste space. A few questions:
Intermediate & Advanced SEO | | Kingalan1
-Should we set listing not no index if they are not content rich?
-If we do choose to index them, should we avoid titles listing Square Footage and dollar amounts?
-Since local SEO is critical, should the titles always list New York, NY or Manhattan, NY?
-I have red that titles should contain some form of branding. But our company name is Metro Manhattan Office Space. That would take up way too much space. Even "Metro Manhattan" is long. DO we need to use the title tag for branding or can we just focus on a brief description of page content incorporating one important phrase? Our site is: w w w . m e t r o - m a n h a t t a n . c o m <colgroup><col width="405"></colgroup>
| Turnkey Flatiron Tech Space | 2,850 SF $10,687/month | <colgroup><col width="405"></colgroup>
| Gallery, Office Rental | Midtown, W. 57 St | 4441SF $24055/month | <colgroup><col width="405"></colgroup>
| Open Plan Loft |Flatiron, Chelsea | 2414SF $12,874/month | <colgroup><col width="405"></colgroup>
| Tribeca Corner Loft | Varick Street | 2267SF $11,712/month | <colgroup><col width="405"></colgroup>
| 275 Madison, LAW, P7, 3,252SF, $65 - Manhattan, New York |0 -
Should I use noindex or robots to remove pages from the Google index?
I have a Magento site and just realized we have about 800 review pages indexed. The /review directory is disallowed in robots.txt but the pages are still indexed. From my understanding robots means it will not crawl the pages BUT if the pages are still indexed if they are linked from somewhere else. I can add the noindex tag to the review pages but they wont be crawled. https://www.seroundtable.com/google-do-not-use-noindex-in-robots-txt-20873.html Should I remove the robots.txt and add the noindex? Or just add the noindex to what I already have?
Intermediate & Advanced SEO | | Tylerj0 -
URL Rewriting Best Practices
Hey Moz! I’m getting ready to implement URL rewrites on my website to improve site structure/URL readability. More specifically I want to: Improve our website structure by removing redundant directories. Replace underscores with dashes and remove file extensions for our URLs. Please see my example below: Old structure: http://www.widgets.com/widgets/commercial-widgets/small_blue_widget.htm New structure: https://www.widgets.com/commercial-widgets/small-blue-widget I've read several URL rewriting guides online, all of which seem to provide similar but overall different methods to do this. I'm looking for what's considered best practices to implement these rewrites. From what I understand, the most common method is to implement rewrites in our .htaccess file using mod_rewrite (which will find the old URLs and rewrite them according to the rewrites I implement). One question I can't seem to find a definitive answer to is when I implement the rewrite to remove file extensions/replace underscores with dashes in our URLs, do the webpage file names need to be edited to the new format? From what I understand the webpage file names must remain the same for the rewrites in the .htaccess to work. However, our internal links (including canonical links) must be changed to the new URL format. Can anyone shed light on this? Also, I'm aware that implementing URL rewriting improperly could negatively affect our SERP rankings. If I redirect our old website directory structure to our new structure using this rewrite, are my bases covered in regards to having the proper 301 redirects in place to not affect our rankings negatively? Please offer any advice/reliable guides to handle this properly. Thanks in advance!
Intermediate & Advanced SEO | | TheDude0 -
No-index pages with duplicate content?
Hello, I have an e-commerce website selling about 20 000 different products. For the most used of those products, I created unique high quality content. The content has been written by a professional player that describes how and why those are useful which is of huge interest to buyers. It would cost too much to write that high quality content for 20 000 different products, but we still have to sell them. Therefore, our idea was to no-index the products that only have the same copy-paste descriptions all other websites have. Do you think it's better to do that or to just let everything indexed normally since we might get search traffic from those pages? Thanks a lot for your help!
Intermediate & Advanced SEO | | EndeR-0 -
Slug best practices?
Hello, my team is trying to understand how to best construct slugs. We understand they need to be concise and easily understandable, but there seem to be vast differences between the three examples below. Are there reasons why one might be better than the others? http://www.washingtonpost.com/news/morning-mix/wp/2014/06/20/bad-boys-yum-yum-violent-criminal-or-not-this-mans-mugshot-is-heating-up-the-web/ http://hollywoodlife.com/2014/06/20/jeremy-meeks-sexy-mug-shot-felon-viral/ http://www.tmz.com/2014/06/19/mugshot-eyes-felon-sexy/
Intermediate & Advanced SEO | | TheaterMania0 -
Limit on Google Removal Tool?
I'm dealing with thousands of duplicate URL's caused by the CMS... So I am using some automation to get through them - What is the daily limit? weekly? monthly? Any ideas?? thanks, Ben
Intermediate & Advanced SEO | | bjs20100 -
Is 404'ing a page enough to remove it from Google's index?
We set some pages to 404 status about 7 months ago, but they are still showing in Google's index (as 404's). Is there anything else I need to do to remove these?
Intermediate & Advanced SEO | | nicole.healthline0 -
Should I Allow Blog Tag Pages to be Indexed?
I have a wordpress blog with settings currently set so that Google does not index tag pages. Is this a best practice that avoids duplicate content or am I hurting the site by taking eligible pages out of the index?
Intermediate & Advanced SEO | | JSOC0