Robots.txt, does it need preceding directory structure?
-
Do you need the entire preceding path in robots.txt for it to match?
For example, I know that if I add Disallow: /fish to robots.txt, it will block:
/fish
/fish.html
/fish/salmon.html
/fishheads
/fishheads/yummy.html
/fish.php?id=anything
But would it also block the following?
en/fish
en/fish.html
en/fish/salmon.html
en/fishheads
en/fishheads/yummy.html
en/fish.php?id=anything
(Examples taken from the Robots.txt Specifications.)
I'm hoping it actually won't match, as that would make writing this particular robots.txt much easier!
Basically, I want to block many URLs that contain BTS-, such as:
http://www.example.com/BTS-something
http://www.example.com/BTS-somethingelse
http://www.example.com/BTS-thingybob
But I have other pages that I do not want blocked, in subfolders whose URLs also contain BTS-, such as:
http://www.example.com/somesubfolder/BTS-thingy
http://www.example.com/anothersubfolder/BTS-otherthingy
Thanks for listening.
-
Yes, this is what I thought, but I wanted a second opinion.
I wouldn't actually need a wildcard after BTS-, though, as leaving the rule open-ended is the same as using a trailing wildcard. From the specification:
/fish* : Equivalent to "/fish" -- the trailing wildcard is ignored.
https://developers.google.com/webmasters/control-crawl-index/docs/robots_txt
Thanks for the link, I'll take a look.
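To make the equivalence concrete, here is a minimal Python sketch of how such patterns could be evaluated. It assumes the Google-style rules described in the specification linked above ('*' matches any run of characters, a trailing '$' anchors the end of the URL, and matching starts at the beginning of the path). The function names are made up for illustration and this is not how any particular crawler is implemented.

```python
import re

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    Assumption: '*' matches any run of characters and a trailing '$'
    anchors the end of the URL (Google-style matching).
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))

def is_match(pattern: str, path: str) -> bool:
    # re.match only matches at the start of the string, i.e. a prefix match.
    return pattern_to_regex(pattern).match(path) is not None

paths = ["/BTS-something", "/BTS-somethingelse", "/somesubfolder/BTS-thingy"]
for path in paths:
    # The rule with and without the trailing wildcard gives the same verdict.
    assert is_match("/BTS-", path) == is_match("/BTS-*", path)
    print(path, is_match("/BTS-", path))
```

Under these assumptions the two root-level URLs match and the subfolder URL does not, with or without the trailing '*'.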
-
You're right that **Disallow: /fish** in the robots.txt file blocks all of those initial links. Rules are matched from the start of the URL path, so it would not block the same paths under /en/. To block those as well, you would need to add Disallow: /en/fish
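As a rough illustration of the point above, the check a crawler makes for a wildcard-free rule can be thought of as a plain "starts with" test on the URL path. The helper name below is hypothetical, and the /en paths are written with a leading slash, which is an assumption about how they appear on the site.

```python
def blocked_by(rule_path: str, url_path: str) -> bool:
    """Plain prefix match: the rule must match from the first character of the path."""
    return url_path.startswith(rule_path)

root_paths = [
    "/fish",
    "/fish.html",
    "/fish/salmon.html",
    "/fishheads",
    "/fishheads/yummy.html",
    "/fish.php?id=anything",
]
# The same pages inside the /en/ folder (leading slash assumed).
en_paths = ["/en" + path for path in root_paths]

print(all(blocked_by("/fish", p) for p in root_paths))   # True
print(any(blocked_by("/fish", p) for p in en_paths))     # False
print(all(blocked_by("/en/fish", p) for p in en_paths))  # True
```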
You could use a wildcard in the robots.txt file to do something along the lines of Disallow: /BTS-*
This _should_ work, but it's always worth checking with a tool to make sure it's all implemented correctly. Distilled published a post a while back about a JS bookmarklet that lets you test whether a page is blocked by robots.txt, which can be found here: http://www.distilled.net/blog/seo/js-bookmarklet-for-checking-if-a-page-is-blocked-by-robots-txt/
You could also use the 'Blocked URLs' tool in Google Webmaster Tools (GWT) to check that the pages are successfully blocked once you've implemented the rule.
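If you would rather check locally before uploading anything, one option is Python's standard-library urllib.robotparser. The sketch below is only a quick sanity check, not a replacement for the tools above: as far as I know this parser does simple prefix matching and does not implement the '*' and '$' extensions, so the rule is written without a trailing wildcard. The robots.txt content is an assumed example built from the URLs in the question.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content for the BTS- case in the question.
ROBOTS_TXT = """\
User-agent: *
Disallow: /BTS-
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def is_blocked(url: str) -> bool:
    # can_fetch returns True when the given user agent may crawl the URL.
    return not parser.can_fetch("*", url)

urls = [
    "http://www.example.com/BTS-something",
    "http://www.example.com/BTS-somethingelse",
    "http://www.example.com/somesubfolder/BTS-thingy",
    "http://www.example.com/anothersubfolder/BTS-otherthingy",
]
for url in urls:
    print(url, "blocked" if is_blocked(url) else "allowed")
```

With this rule the two root-level BTS- URLs should be reported as blocked and the two subfolder URLs as allowed.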
Hope this helps!