Blocking out specific URLs with robots.txt
-
I've been trying to block a few URLs using robots.txt, but I can't seem to block only the specific one I want. Here is an example.
I'm trying to block
but not block
It seems that if I set up my robots.txt like so:
Disallow: /cats
it blocks both URLs. When I crawl the site with Screaming Frog, that Disallow rule causes both URLs to be blocked. How can I set up my robots.txt to block /cats specifically? I thought the rule above would do it, but it doesn't.
Any help is much appreciated. Thanks in advance.
-
Be careful when experimenting with robots.txt, as an overly broad rule can knock a whole series of pages and folders out of the index.
The correct rule, as stated by Lesley, is Disallow: /cats/. Refer to the official documentation:
https://developers.google.com/webmasters/control-crawl-index/docs/robots_txt
-
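You can use either Disallow: /cats/ or Disallow: /cats/*. A Disallow rule matches every URL whose path starts with the given value, so /cats on its own also catches any other path that merely begins with /cats. Adding the trailing slash should block just the cats folder and not the other folder. Note that the first form is the preferred one.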
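If you want to check a rule before deploying it, here is a minimal sketch using Python's standard-library urllib.robotparser. The example.com URLs and the is_blocked helper are made up for illustration, since the actual URLs from the question were not shown. As far as I know this parser does plain prefix matching and does not implement Google's * and $ extensions, so it is only used here to compare /cats with /cats/.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(disallow_path: str, url: str, agent: str = "*") -> bool:
    """Return True if a robots.txt with a single Disallow rule blocks the URL.

    Hypothetical helper for illustration; not part of any SEO tool.
    """
    parser = RobotFileParser()
    # Build a minimal robots.txt in memory instead of fetching one.
    parser.parse([
        "User-agent: *",
        f"Disallow: {disallow_path}",
    ])
    return not parser.can_fetch(agent, url)


# Hypothetical URLs; the real ones were not included in the question.
urls = [
    "https://example.com/cats",
    "https://example.com/cats/persian",
    "https://example.com/cats-and-dogs",
]

for rule in ("/cats", "/cats/"):
    for url in urls:
        print(f"Disallow: {rule:<7} {url:<36} blocked={is_blocked(rule, url)}")
```

One thing this makes visible: with Disallow: /cats/ the bare /cats URL (no trailing slash) is no longer matched, so if that exact URL is the one you need blocked, the trailing-slash rule alone will not cover it.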