Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content (many posts will still be viewable), we have locked both new posts and new replies.
How do I block an entire category/directory with robots.txt?
-
Does anyone have any idea how to block an entire product category, including all the products in that category, using the robots.txt file? I'm using WooCommerce on WordPress and I'd like to prevent bots from crawling every one of my product URLs for now.
The confusing part right now is that I have several different URL structures linking to each of my products, for example www.mystore.com/all-products, www.mystore.com/product-category, etc.
I'm not really sure what I'd type into the robots.txt file, or where to place the file.
Any help would be appreciated, thanks.
-
Thanks for the detailed answer, I will give it a try!
-
Hi
This should do it. Place the robots.txt file in the root directory of your site, with these two lines in it:
User-agent: *
Disallow: /product-category/
You can check out some more examples here: http://www.seomoz.org/learn-seo/robotstxt
As for the multiple URLs linking to the same pages, you will just need to check all possible variants and make sure each one is covered by its own Disallow line in the robots.txt file.
Google Webmaster Tools has a page you can use to check whether the robots.txt file is doing what you expect (under Health -> Blocked URLs).
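If you want a quick local sanity check before uploading the file, something like the sketch below might help. It uses Python's standard-library robots.txt parser, which does plain prefix matching and may not behave exactly like every crawler. The two Disallow paths are taken from the example URLs in the question; the product and page slugs in the test URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed rules: one Disallow line per URL structure mentioned in the question.
ROBOTS_TXT = """\
User-agent: *
Disallow: /product-category/
Disallow: /all-products/
"""


def is_blocked(url: str, user_agent: str = "*") -> bool:
    """Return True if the rules above disallow crawling of `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical URLs; swap in real ones from your own store.
    for url in (
        "http://www.mystore.com/product-category/shoes/",
        "http://www.mystore.com/all-products/red-shoe/",
        "http://www.mystore.com/about/",
    ):
        print(url, "blocked" if is_blocked(url) else "allowed")
```

Any product URL that comes back as "allowed" points to a URL structure that still needs its own Disallow line.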
If you are running a plugin that allows it, it might be easier to block the pages with a meta tag, as described in the link above. That should also take care of all the different URL structures.
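To confirm that a plugin is really adding the tag, you could fetch a product page and look for a robots meta tag in its HTML. Here is a minimal sketch using only the Python standard library. The sample page and the helper names (`RobotsMetaParser`, `has_noindex`) are made up for illustration, and the exact tag your plugin outputs may differ. Keep in mind that a crawler can only see this tag on pages it is allowed to fetch, so it won't be read on URLs that robots.txt already blocks.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma separated, e.g. "noindex, follow".
            self.directives += [
                d.strip().lower() for d in content.split(",") if d.strip()
            ]


def has_noindex(html: str) -> bool:
    """Return True if the page carries a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical page source, standing in for a real product page.
SAMPLE_PAGE = """<html><head>
<meta name="robots" content="noindex, follow">
</head><body>Product page</body></html>"""

if __name__ == "__main__":
    print(has_noindex(SAMPLE_PAGE))
```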
Hope that helps!