When you add a robots.txt file to a website to block certain URLs, do they disappear from Google's index?
-
I have seen several websites recently that have far too many webpages indexed by Google, because for each blog post they publish, Google might index the following:
- www.mywebsite.com/blog/title-of-post
- www.mywebsite.com/blog/tag/tag1
- www.mywebsite.com/blog/tag/tag2
- www.mywebsite.com/blog/category/categoryA
- etc
My question is: if you add a robots.txt file that tells Google NOT to index pages in the "tag" and "category" folders, does that mean that the previously indexed pages will eventually disappear from Google's index? Or does it just mean that newly created pages won't get added to the index? Or does it mean nothing at all? Thanks for any insight!
-
Hi William,
If the pages in question are
1) already indexed by Google, then if you block them via robots.txt they will still show up in search results, but the description will say something along the lines of:
A description for this result is not available because of this site's robots.txt – learn more.
2) not indexed by Google (for example, on a new site), then Google will not crawl them and the pages will not come up in search directly, BUT if external sites link to the pages they can still come up in the SERPs some time down the track.
Your best bet to keep a page out of the public SERP index is the meta robots tag: http://www.robotstxt.org/meta.html
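To see what a robots.txt rule actually controls, here is a minimal sketch using Python's standard `urllib.robotparser`. The `Disallow` rules and the `http://` scheme on the URLs are assumptions made for illustration, based on the blog layout in the question. It only reports whether a crawler may fetch a URL; it says nothing about whether the URL is in the index.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for the blog layout described in the question.
ROBOTS_TXT = """\
User-agent: *
Disallow: /blog/tag/
Disallow: /blog/category/
"""


def crawl_allowed(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the rules above let the given crawler fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The http:// scheme is assumed; the question lists the URLs without one.
    for url in (
        "http://www.mywebsite.com/blog/title-of-post",
        "http://www.mywebsite.com/blog/tag/tag1",
        "http://www.mywebsite.com/blog/category/categoryA",
    ):
        print(url, "crawlable" if crawl_allowed(url) else "blocked from crawling")
```

Keep in mind that a URL blocked this way cannot be crawled, so a meta robots tag on that page would not be seen by the crawler either.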
-
William, if the pages in question are linked to from external resources, the robots.txt file will not prevent the pages from appearing in the index. Per Moz's Robots.txt and Meta Robots best practices, "the robots.txt tells the engines not to crawl the given URL, but that they may keep the page in the index and display it in results."
To prevent all robots from indexing a page on your site, place the following meta tag into the <head> section of your page: <meta name="robots" content="noindex">
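To confirm that a tag or category page really carries a noindex directive, a check along these lines may help. It is a rough sketch using only Python's standard `html.parser`; the class and function names are hypothetical, and it inspects only meta robots tags in the HTML, not HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None when an attribute has no value.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a meta robots tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex"></head></html>'
    print(has_noindex(page))
```

The page has to remain crawlable for this tag to take effect, so it should not also be disallowed in robots.txt.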