Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Best practice for removing indexed internal search pages from Google?
-
Hi Mozzers
I know that it’s best practice to block Google from indexing internal search pages, but what’s best practice when “the damage is done”?
I have a project where a substantial share of our visitors and income (about 3 %) lands on internal search pages, because Google has indexed them.
I would like to block Google from indexing the search pages via the meta noindex,follow tag because:
- Google Guidelines: “Use robots.txt to prevent crawling of search results pages or other auto-generated pages that don't add much value for users coming from search engines.” http://support.google.com/webmasters/bin/answer.py?hl=en&answer=35769
- Bad user experience
- The search pages are (probably) stealing rankings from our real landing pages
- Webmaster Notification: “Googlebot found an extremely high number of URLs on your site” with links to our internal search results
I want to use the meta tag to keep the link juice flowing. Do you recommend using the robots.txt instead? If yes, why?
Should we just go dark on the internal search pages, or how shall we proceed with blocking them?
I’m looking forward to your answer!
Edit: Google have currently indexed several million of our internal search pages.
-
Hello,
Sorry for the late answer. I have the same problem and I think I found the solution. This is what works for me:
1. Add a meta robots tag with noindex,follow to the internal search pages and wait for Google to remove them from the index.
Be careful if you do **both** (adding the meta robots tag and a Disallow in robots.txt), because of this:
Please note that if you do both: block the search engines in robots.txt and via the meta tags, then the robots.txt command is the primary driver, as they may not crawl the page to see the meta tags, so the URL may still appear in the search results listed URL-only. Source: http://tools.seobook.com/robots-txt/
I hope this information can help you.
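To illustrate the warning above, here is a minimal Python sketch using the standard library's `urllib.robotparser`. The robots.txt content, the example.com URLs, and the `can_crawl` helper are all assumptions made up for illustration, not taken from the site in question. It only shows the crawl decision: if the search path is disallowed, a well-behaved crawler never fetches the page, so it never gets to read a noindex tag in the HTML.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks the internal search path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""


def can_crawl(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical URLs, for illustration only.
search_url = "https://www.example.com/search?q=iphone"
product_url = "https://www.example.com/products/iphone"

# A disallowed URL is not fetched, so any meta robots tag on it stays unseen.
print(can_crawl(ROBOTS_TXT, search_url))
print(can_crawl(ROBOTS_TXT, product_url))
```

In other words, if you rely on the meta tag to get pages removed, the search path has to stay crawlable until that has happened.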
-
I would honestly exclude all your internal search pages from the Google index via robots.txt (noindex) exclusion. This will at least re-distribute crawl-time to other areas of your site.
Just having the noindex,follow in the meta-tag (without the robots.txt exclusion) will let GoogleBot crawl the page and then eventually remove it from the index.
I would also change your search-page canonical to the search term (e.g. /search/iphone) and then put a noindex,follow meta tag on the page.
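As a rough sketch of that last suggestion, the head tags for a search page could be generated along these lines in Python. The `search_page_head` helper, the example.com base URL, and the choice to lowercase and trim the term are assumptions for illustration; only the /search/<term> URL shape and the noindex,follow value come from the suggestion above.

```python
from html import escape
from urllib.parse import quote


def search_page_head(term: str, base_url: str = "https://www.example.com") -> str:
    """Build the canonical and robots tags for an internal search page.

    Hypothetical helper: normalising the term (trim + lowercase) is an
    assumption, not something the thread prescribes.
    """
    slug = quote(term.strip().lower(), safe="")
    canonical = f"{base_url}/search/{slug}"
    return "\n".join([
        f'<link rel="canonical" href="{escape(canonical, quote=True)}" />',
        '<meta name="robots" content="noindex,follow" />',
    ])


print(search_page_head("iPhone"))
```

Every variant of the same query (different casing, extra parameters) would then point at one canonical search URL while still carrying the noindex,follow tag.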
-
It sounds like the meta noindex,follow tag is what you want.
robots.txt will block Googlebot from crawling your search pages, but Google can still keep the search pages in its index.
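If you go the meta tag route, it may be worth checking that the tag really is in the HTML your search pages serve. Below is a small Python sketch using the standard library's `html.parser`; the `RobotsMetaParser` class, the `is_noindex` helper, and the sample markup are hypothetical names and data for illustration, and it only inspects the meta robots tag (not HTTP headers or robots.txt).

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for part in attr.get("content", "").split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def is_noindex(html: str) -> bool:
    """Return True if the page's meta robots tag contains noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical search-page markup, for illustration only.
sample = '<html><head><meta name="robots" content="noindex,follow" /></head></html>'
print(is_noindex(sample))
```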