Block an entire subdomain with robots.txt?
-
Is it possible to block an entire subdomain with robots.txt?
I write for a blog that has its root domain as well as a subdomain pointing to the exact same IP. Getting rid of the subdomain is not an option, so I'd like to explore other ways to avoid duplicate content. Any ideas?
-
Awesome! That did the trick -- thanks for your help. The site is no longer listed
-
Fact is, the robots file alone will never work (the link has a good explanation why; short form: all it does is stop the bots from crawling the pages again, it does not remove what is already in the index).
Best to request removal then wait a few days.
-
Yeah. As of yet, the site has not been de-indexed. We placed the conditional rule in htaccess and are getting different robots.txt files for the domain and subdomain -- so that works. But I've never done this before so I don't know how long it's supposed to take?
I'll try to verify via Webmaster Tools to speed up the process. Thanks
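For anyone wanting to sanity-check a similar setup, here is a rough Python sketch that parses two robots.txt bodies with the standard library and reports whether a URL would be blocked. The file contents and host names below are placeholders I made up for illustration, not what our server actually returns, and this only checks the rules themselves; it says nothing about how fast a search engine de-indexes anything.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt body disallows `url` for `user_agent`."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Hypothetical bodies: what the subdomain and the root domain might each serve.
SUBDOMAIN_ROBOTS = "User-agent: *\nDisallow: /\n"
ROOT_ROBOTS = "User-agent: *\nDisallow:\n"

if __name__ == "__main__":
    print(is_blocked(SUBDOMAIN_ROBOTS, "http://subdomain.website.com/some-post/"))
    print(is_blocked(ROOT_ROBOTS, "http://www.website.com/some-post/"))
```

To check the live site you would fetch /robots.txt from each host yourself and pass the text in.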
-
You should do a remove request in Google Webmaster Tools. You have to first verify the sub-domain then request the removal.
See this post on why the robots file alone won't work...
http://www.seomoz.org/blog/robot-access-indexation-restriction-techniques-avoiding-conflicts
-
Awesome. We used your second idea and so far it looks like it is working exactly how we want. Thanks for the idea.
Will report back to confirm that the subdomain has been de-indexed.
-
Option 1 could come with a small performance hit if you have a lot of txt files being used on the server.
There shouldn't be any negative side effects to option 2 if the rewrite is clean (i.e. not accidentally a redirect) and the content of the two files is robots compliant.
Good luck
-
Thanks for the suggestion. I'll definitely have to do a bit more research into this one to make sure that it doesn't have any negative side effects before implementation
-
We have a plugin right now that places canonical tags, but unfortunately, the canonical for the subdomain points to the subdomain. I'll look around to see if I can tweak the settings
-
Sounds like (from other discussions) you may be stuck requiring a dynamic robots.txt file which detects what domain the bot is on and changes the content accordingly. This means the server has to run all .txt files as (I presume) PHP.
Or, you could conditionally rewrite the /robots.txt URL to a new file according to the subdomain:
RewriteEngine on
# Only match requests that arrive on the subdomain
RewriteCond %{HTTP_HOST} ^subdomain\.website\.com$ [NC]
# Serve robots-subdomain.txt internally (no R flag, so this is not a redirect)
RewriteRule ^robots\.txt$ robots-subdomain.txt [L]
Then add:
User-agent: *
Disallow: /
to the robots-subdomain.txt file.
(untested)
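For what it's worth, the first option (a robots.txt generated per host) boils down to a lookup on the Host header. Here is a minimal Python sketch of just that decision; the host name is the same placeholder as above, the function name is made up, and you would still need to wire it into whatever actually serves /robots.txt (PHP on a WordPress box, most likely). Also untested against a real server.

```python
# Robots bodies for the two cases.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
ALLOW_ALL = "User-agent: *\nDisallow:\n"

# Hypothetical host name; replace with the real subdomain.
BLOCKED_HOSTS = {"subdomain.website.com"}


def robots_txt_for_host(host_header: str) -> str:
    """Pick the robots.txt body to serve based on the request's Host header."""
    host = host_header.strip().lower()
    # Drop an optional port such as ":80" and any trailing dot.
    host = host.split(":", 1)[0].rstrip(".")
    return BLOCK_ALL if host in BLOCKED_HOSTS else ALLOW_ALL


if __name__ == "__main__":
    print(robots_txt_for_host("subdomain.website.com"))
    print(robots_txt_for_host("www.website.com"))
```

The rewrite approach does the same thing at the Apache level without running any code per request, which is why I'd lean that way.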
-
Placing canonical tags isn't an option? Detect that the page is being viewed through the subdomain, and if so, write the canonical tag on the page back to the root domain?
Or, just place a canonical tag on every page pointing back to the root domain (so the subdomain and root domain pages would both have them). Apparently, it's ok to have a canonical tag on a page pointing to itself. I haven't tried this, but if Matt Cutts says it's ok...
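Roughly what I mean, as a Python sketch rather than an actual WordPress plugin: take whatever URL the page was requested on and emit a canonical tag that always uses the root domain's host. The host name and function name are made up for illustration, and I'm assuming the paths are identical on both hosts since they share the same files.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

# Hypothetical canonical host; replace with the real root domain.
CANONICAL_HOST = "www.website.com"


def canonical_tag(request_url: str) -> str:
    """Build a canonical <link> tag that points the same path at the root domain."""
    parts = urlsplit(request_url)
    canonical = urlunsplit((
        parts.scheme or "http",  # keep the scheme the page was requested with
        CANONICAL_HOST,          # always swap in the root domain's host
        parts.path or "/",
        parts.query,             # query string kept as-is; fragment dropped
        "",
    ))
    return f'<link rel="canonical" href="{escape(canonical, quote=True)}" />'


if __name__ == "__main__":
    print(canonical_tag("http://subdomain.website.com/my-post/"))
    print(canonical_tag("http://www.website.com/my-post/"))
```

Both calls produce the same tag, so the subdomain copy points at the root domain and the root domain page points at itself.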
-
Hey Ryan,
I wasn't directly involved with the decision to create the subdomain, but I'm told that it was necessary in order to bypass certain elements that were affecting the root domain.
Nevertheless, it is a blog, and the users now need to log in to the subdomain in order to access the WordPress backend and bypass those elements. Traffic for the site still goes to the root domain.
-
They both point to the same location on the server? So there's not a different folder for the subdomain?
If that's the case then I suggest adding a rule to your htaccess file to 301 the subdomain back to the main domain in exactly the same way people redirect from non-www to www or vice-versa. However, you should ask why the server is configured to have a duplicate subdomain. You might just edit your Apache settings to get rid of that subdomain (usually done through a cPanel interface).
Here is what your htaccess might look like:
<IfModule mod_rewrite.c>
RewriteEngine on
# Redirect non-www to www
RewriteCond %{HTTP_HOST} !^www\.mydomain\.org [NC]
RewriteRule ^(.*)$ http://www.mydomain.org/$1 [R=301,L]
</IfModule>
-
Not to me LOL. I think you'll need someone with a bit more expertise in this area than I to assist in this case. Kyle, I'm sorry I couldn't offer more assistance... but I don't want to tell you something if I'm not 100% sure. I suspect one of the many bright SEOmozers will quickly come to the rescue on this one.
Andy
-
Hey Andy,
Herein lies the problem. Since the domain and subdomain point to the exact same place, they both utilize the same robots.txt file.
Does that make sense?
-
Hi Kyle,
Yes, you can block an entire subdomain via robots.txt. However, you'll need to create a robots.txt file and place it in the root of the subdomain, then add the code to direct the bots to stay away from the entire subdomain's content:
User-agent: *
Disallow: /
Hope this helps.