Skip indexing the search pages
-
Hi,
I want all such search pages skipped from indexing
So i have this in robots.txt (Disallow: /search/)
Now any posts that start with search are being blocked and in Google i see this message
A description for this result is not available because of this site's robots.txt – learn more.
How can i handle this and also how can i find all URL's that Google is blocking from showing
Thanks
-
Sure - you have urls that are being blocked by robots - you have this line in your robots.txt -
Disallow: /questions/search
It is thus preventing urls from within that folder, questions, which start with the word search from being crawled. What are you trying to accomplish with this block? If it's the folder search, within questions, it should be /questions/search/.
And the other warning is telling you these pages take a long time to load - check your server or these individual pages and see why that is taking so long.
-
-
As Saijo said above, the meta robots noindex tag is the way to go. When you block a folder via robots.txt, you prevent Google from visiting and crawling that folder and any content within it. If Google has already crawled the content, they won't remove the content from their index just if you block it with robots.txt. The old version they have of the page will be stored and saved in their index, and they just won't be able to show you an updated snippet of the page due to the robots.txt block.
To remove the pages from the index completely, you can do one of 2 things -
- in webmaster tools, go to the url removal section, and remove that folder from the index - this will only work when it's blocked via robots.txt
- you can add a meta robots noindex tag to the pages/page template, and remove the robots.txt block - you need to remove the robots.txt block so the search engines can recrawl the pages, see the meta robots directive, and follow the noindex guide to remove the page.
In general, I would recommend using the meta robots noindex directive over the robots.txt, because it should work for all search engines, and you won't have to go into webmaster tools for each one. You also will ensure that you don't accidentally block other urls.
From your example above, if you just blocked the folder /search/, a page that includes the word search in the url but isn't in the blocked folder shouldn't be blocked from the search engines because of that line - I would check in webmaster tools the robots.txt section, because it doesn't look to me, based on your robots.txt file, that any url with search in it should be blocked.
Good luck,
Mark
-
I guess i was not clear with my question.
So i have this in robots.txt (Disallow: /search/)
My intension yo place /search/ is to stop Google indexing any of my search posts
Now whats happened is
www.somesite.com/questions/search-the-internet
Posts like above are also being blocked
-
To Block search pages from the index you can try adding the META NOINDEX tag in the head section of the search pages
-
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Removing a site from Google index with no index met tags
Hi there! I wanted to remove a duplicated site from the google index. I've read that you can do this by removing the URL from Google Search console and, although I can't find it in Google Search console, Google keeps on showing the site on SERPs. So I wanted to add a "no index" meta tag to the code of the site however I've only found out how to do this for individual pages, can you do the same for a entire site? How can I do it? Thank you for your help in advance! L
Technical SEO | | Chris_Wright1 -
Blog page won't get indexed
Hi Guys, I'm currently asked to work on a website. I noticed that the blog posts won't get indexed in Google. www.domain.com/blog does get indexed but the blogposts itself won't. They have been online for over 2 months now. I found this in the robots.txt file: Allow: / Disallow: /kitchenhandle/ Disallow: /blog/comments/ Disallow: /blog/author/ Disallow: /blog/homepage/feed/ I'm guessing that the last line causes this issue. Does anyone have an idea if this is the case and why they would include this in the robots.txt? Cheers!
Technical SEO | | Happy-SEO2 -
How do I get my pages to go from "Submitted" to "Indexed" in Google Webmaster Tools?
Background: I recently launched a new site and it's performing much better than the old site in terms of bounce rate, page view, pages per session, session duration, and conversions. As suspected, sessions, users, and % new sessions are all down. Which I'm okay with because the the old site had a lot of low quality traffic going to it. The traffic we have now is much more engaged and targeted. Lastly, the site was built using Squarespace and was launched the middle of August. **Question: **When reviewing Google Webmaster Tools' Sitemaps section, I noticed it says 57 web pages Submitted, but only 5 Indexed! The sitemap that's submitted seems to be all there. I'm not sure if this is a Squarespace thing or what. Anyone have any ideas? Thanks!!
Technical SEO | | Nate_D0 -
Google showing https:// page in search results but directing to http:// page
We're a bit confused as to why Google shows a secure page https:// URL in the results for some of our pages. This includes our homepage. But when you click through it isn't taking you to the https:// page, just the normal unsecured page. This isn't happening for all of our results, most of our deeper content results are not showing as https://. I thought this might have something to do with Google conducting searches behind secure pages now, but this problem doesn't seem to affect other sites and our competitors. Any ideas as to why this is happening and how we get around it?
Technical SEO | | amiraicaew0 -
Only my website homepage is appearing in search and the other indvidual pages are not coming up?This happened after the website revamp
We have revamped our website http://www.wsinetpower.com/ after te revamp the SEO rankings went down and the inner pages are not appearing in serach. What could be the reason
Technical SEO | | Muna0 -
Why are pages linked with URL parameters showing up as separate pages with duplicate content?
Only one page exists . . . Yet I link to the page with different URL parameters for tracking purposes and for some reason it is showing up as a separate page with duplicate content . . . Help? rpcIZ.png
Technical SEO | | BlueLinkERP0 -
Should I delete a page or remove links on a penalized page?
Hello All, If I have a internal page that has low quality links point to it or a penality. Can I just remove the page, and start over versus trying to remove the links? Over time wouldn't this page disapear along with the penalty on that page? Kinda like pruning a tree? Cutting off the junk limbs so other could grow stronger, or to start new fresh ones. Example: www.domain.com Penalized Internal Page: (Say this page is penalized due to keyword stuffing, and has low quality links pointing to it like blog comments, or profiles) www.domain.com/penalized-internal-page.com Would it be effective to just delete this page (www.domain.com/penalized-internal-page.com) and start over with a new page. New Internal Page: www.domain.com/new-internal-page.com I would of course lose any good links point to that page, but it might be easier then trying to remove old back links. Thoughts? Thanks! Pete
Technical SEO | | Juratovic0 -
Secondary Pages Indexed over Primary Page
I have 4 pages for a single product Each of the pages link to the Main page for that product Google is indexing the secondary pages above my preferred landing page How do I fix this?
Technical SEO | | Bucky0