Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Block an entire subdomain with robots.txt?
-
Is it possible to block an entire subdomain with robots.txt?
I write for a blog that has its root domain as well as a subdomain pointing to the exact same IP. Getting rid of the subdomain is not an option, so I'd like to explore other ways to avoid duplicate content. Any ideas?
-
Awesome! That did the trick -- thanks for your help. The site is no longer listed

-
Fact is, the robots file alone will never work (the link has a good explanation why - short form: all it does is stop the bots from crawling the pages again; it does not remove pages that are already in the index).
Best to request removal, then wait a few days.
-
Yeah. As of yet, the site has not been de-indexed. We placed the conditional rule in htaccess and are getting different robots.txt files for the domain and subdomain -- so that works. But I've never done this before so I don't know how long it's supposed to take?
I'll try to verify via Webmaster Tools to speed up the process. Thanks
-
You should do a remove request in Google Webmaster Tools. You have to first verify the sub-domain then request the removal.
See this post on why the robots file alone won't work...
http://www.seomoz.org/blog/robot-access-indexation-restriction-techniques-avoiding-conflicts
-
Awesome. We used your second idea and so far it looks like it is working exactly how we want. Thanks for the idea.
Will report back to confirm that the subdomain has been de-indexed.
-
Option 1 could come with a small performance hit if you have a lot of txt files being used on the server.
There shouldn't be any negative side effects to option 2 if the rewrite is clean (i.e. not accidentally a redirect) and the contents of the two files are robots.txt compliant.
Good luck
-
Thanks for the suggestion. I'll definitely have to do a bit more research into this one to make sure that it doesn't have any negative side effects before implementation
-
We have a plugin right now that places canonical tags, but unfortunately, the canonical for the subdomain points to the subdomain. I'll look around to see if I can tweak the settings
-
Sounds like (from other discussions) you may be stuck requiring a dynamic robots.txt file which detects what domain the bot is on and changes the content accordingly. This means the server has to run all .txt files as (I presume) PHP.
Or, you could conditionally rewrite the /robots.txt URL to a new file according to sub-domain:
RewriteEngine on
# Only when the request arrives on the subdomain...
RewriteCond %{HTTP_HOST} ^subdomain\.website\.com$ [NC]
# ...serve the subdomain-specific file in place of robots.txt (internal rewrite, not a redirect)
RewriteRule ^robots\.txt$ robots-subdomain.txt [L]
Then add:
User-agent: *
Disallow: /
to the robots-subdomain.txt file.
(untested)
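For the first option (a dynamic robots.txt), here is a minimal sketch of the host check. It is written in Python rather than the PHP mentioned above, purely for illustration. The function name `robots_for_host`, the `BLOCKED_HOSTS` set, and the allow-all rules returned for the root domain are assumptions; the thread does not show the site's real robots.txt.

```python
# Hypothetical host name, reused from the rewrite example above.
BLOCKED_HOSTS = {"subdomain.website.com"}

# Rules served on the subdomain: keep every crawler out.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
# Assumed rules for the root domain: an empty Disallow permits everything.
ALLOW_ALL = "User-agent: *\nDisallow:\n"


def robots_for_host(host_header: str) -> str:
    """Pick the robots.txt body based on the request's Host header."""
    # Drop an optional port and normalise case before comparing.
    host = host_header.split(":", 1)[0].strip().lower()
    return BLOCK_ALL if host in BLOCKED_HOSTS else ALLOW_ALL


if __name__ == "__main__":
    print(robots_for_host("subdomain.website.com"))
```

Whatever serves /robots.txt would call something like this with the incoming Host header. The rewrite approach achieves the same result without running code on each request.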
-
Placing canonical tags isn't an option? Detect that the page is being viewed through the subdomain, and if so, write the canonical tag on the page back to the root domain?
Or, just place a canonical tag on every page pointing back to the root domain (so the subdomain and root domain pages would both have them). Apparently, it's ok to have a canonical tag on a page pointing to itself. I haven't tried this, but if Matt Cutts says it's ok...
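A rough sketch of the second idea (every page, on either host, carries a canonical tag pointing at the root domain). `ROOT_HOST` and `canonical_tag` are hypothetical names, and "website.com" is only a placeholder since the real domain is never given in this thread.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

# Assumed root host; replace with the real root domain.
ROOT_HOST = "website.com"


def canonical_tag(request_url: str) -> str:
    """Build a canonical link tag that always points at the root domain."""
    parts = urlsplit(request_url)
    # Keep scheme, path and query; swap only the host, and drop any fragment.
    canonical = urlunsplit((parts.scheme, ROOT_HOST, parts.path, parts.query, ""))
    return f'<link rel="canonical" href="{escape(canonical, quote=True)}">'


if __name__ == "__main__":
    print(canonical_tag("http://subdomain.website.com/some-post/"))
```

Because the host is replaced unconditionally, a page viewed on the root domain simply gets a canonical tag pointing to itself.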
-
Hey Ryan,
I wasn't directly involved with the decision to create the subdomain, but I'm told that it is necessary to create in order to bypass certain elements that were affecting the root domain.
Nevertheless, it is a blog and the users now need to log in to the subdomain in order to access the WordPress backend to bypass those elements. Traffic for the site still goes to the root domain.
-
They both point to the same location on the server? So there's not a different folder for the subdomain?
If that's the case, then I suggest adding a rule to your .htaccess file to 301 the subdomain back to the main domain, in exactly the same way people redirect from non-www to www or vice versa. However, you should ask why the server is configured to have a duplicate subdomain. You might just edit your Apache settings to get rid of that subdomain (usually done through a cPanel interface).
Here is what your htaccess might look like:
<IfModule mod_rewrite.c>
RewriteEngine on
# Redirect non-www to www
RewriteCond %{HTTP_HOST} !^www\.mydomain\.org$ [NC]
RewriteRule ^(.*)$ http://www.mydomain.org/$1 [R=301,L]
</IfModule>
-
Not to me LOL
I think you'll need someone with a bit more expertise in this area than me to assist in this case. Kyle, I'm sorry I couldn't offer more assistance... but I don't want to tell you something if I'm not 100% sure. I suspect one of the many bright SEOmozers will quickly come to the rescue on this one.
Andy

-
Hey Andy,
Herein lies the problem. Since the domain and subdomain point to the exact same place, they both utilize the same robots.txt file.
Does that make sense?
-
Hi Kyle
Yes, you can block an entire subdomain via robots.txt; however, you'll need to create a robots.txt file and place it in the root of the subdomain, then add the code to direct the bots to stay away from the entire subdomain's content:
User-agent: *
Disallow: /
Hope this helps.
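One way to sanity-check those two lines before deploying them is Python's standard `urllib.robotparser` module. This is only a sketch: `is_blocked` is a hypothetical helper, the URL is a placeholder borrowed from elsewhere in the thread, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Rules from the answer above: block every crawler from every path.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""


def is_blocked(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text forbids agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


if __name__ == "__main__":
    print(is_blocked(BLOCK_ALL, "http://subdomain.website.com/any/page"))
```

Keep in mind that this only confirms crawling is disallowed; as noted elsewhere in the thread, pages that are already indexed still need a removal request.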
