Blocking out specific URLs with robots.txt
-
I've been trying to block a few URLs using robots.txt, but I can't seem to block only the specific one I want. Here is an example.
I'm trying to block
but not block
It seems that if I set up my robots.txt like so:
Disallow: /cats
it blocks both URLs. When I crawl the site with Screaming Frog, that Disallow causes both URLs to be blocked. How can I set up my robots.txt to specifically block /cats? I thought the way I was doing it was correct, but that doesn't seem to solve it.
Any help is much appreciated, thanks in advance.
-
Be careful when experimenting with robots.txt, as a wrong rule can keep a whole series of pages and folders out of the index.
The correct rule, as stated by Lesley, is /cats/ . Refer to the official documentation:
https://developers.google.com/webmasters/control-crawl-index/docs/robots_txt
-
You can use either /cats/ or /cats/* . Either one should block just the cats folder and not the other folder. Note that the first form is the preferred one.
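To see why the trailing slash matters, here is a minimal sketch using Python's standard-library urllib.robotparser, which matches Disallow rules as plain path prefixes. The paths /cats/persian and /cats-and-dogs are hypothetical stand-ins, since the actual URLs aren't shown in the question, and is_allowed is just an illustrative helper. This parser does not interpret * wildcards, so only the /cats and /cats/ forms are compared; other crawlers may behave differently.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(disallow_rule: str, path: str) -> bool:
    """Return True if `path` may be crawled under a single Disallow rule.

    Illustrative helper: builds a two-line robots.txt in memory and
    asks the standard-library parser whether any user agent may fetch `path`.
    """
    parser = RobotFileParser()
    parser.parse(["User-agent: *", f"Disallow: {disallow_rule}"])
    return parser.can_fetch("*", path)


if __name__ == "__main__":
    # Hypothetical paths; the real URLs are not given in the thread.
    paths = ("/cats/persian", "/cats-and-dogs")
    for rule in ("/cats", "/cats/"):
        for path in paths:
            print(f"Disallow: {rule:<6} {path:<15} allowed={is_allowed(rule, path)}")
```

Under these assumptions, Disallow: /cats matches anything that starts with /cats, including /cats-and-dogs, while Disallow: /cats/ matches only paths inside the folder. One side effect of the second form is that the bare /cats path (no trailing slash) is no longer matched by the rule.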