Robots.txt for subdomain
-
Hi there Mozzers!
I have a subdomain with duplicate content and I'd like to remove these pages from the mighty Google index. The problem is: the website is built in Drupal and this subdomain does not have its own robots.txt.
So I want to ask you how to disallow and noindex this subdomain. Is it possible to add this to the root robots.txt:
User-agent: *
Disallow: /subdomain.root.nl/
User-agent: Googlebot
Noindex: /subdomain.root.nl/
Thank you in advance!
Partouter
-
A robots.txt file applies only to the subdomain it is served from.
You need to create a separate robots.txt for each subdomain; Drupal allows this.
It must be located in the root directory of your subdomain (e.g. /public_html/subdomain/) so that it can be accessed at http://subdomain.root.nl/robots.txt.
Add the following lines to the robots.txt file:
User-agent: *
Disallow: /
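If you want to sanity-check the two lines above before uploading them, here is a minimal Python sketch. It uses the standard library's urllib.robotparser, which only approximates how real crawlers read the file, and the page URL and the helper name robots_url_for are made up for illustration:

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


def robots_url_for(page_url: str) -> str:
    """Return the robots.txt URL that governs page_url (same scheme and host)."""
    parts = urlsplit(page_url)
    return f"{parts.scheme}://{parts.netloc}/robots.txt"


# The rules suggested above, as they would appear in the subdomain's own file.
SUBDOMAIN_RULES = ["User-agent: *", "Disallow: /"]

parser = RobotFileParser()
parser.parse(SUBDOMAIN_RULES)  # parsed in memory, no network request

page = "http://subdomain.root.nl/node/1"  # hypothetical page on the subdomain
print(robots_url_for(page))                 # http://subdomain.root.nl/robots.txt
print(parser.can_fetch("Googlebot", page))  # False: everything is disallowed
```

The helper shows why the root file cannot do the job: a crawler looks for robots.txt on the same host as the page it wants to fetch.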
As an alternative, you can use a robots <META> tag on each page, or redirect the subdomain to the directory root.nl/subdomain and disallow that directory in the main robots.txt. Personally, I don't recommend it.
-
Not sure how your server is configured, but mine is set up so that subdomain.mydomain.com is a subdirectory like this:
http://www.mydomain.com/subdomain/
In robots.txt you would simply need to put
User-agent: *
Disallow: /subdomain/
Others may have a better way though.
HTH
Steve
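A rough way to see what the /subdomain/ rule does and does not cover, again using Python's urllib.robotparser as a stand-in for a real crawler (an approximation, not Google's actual parser). Both URLs are hypothetical, and the sketch assumes the same rules were somehow applied to requests on both hosts:

```python
from urllib.robotparser import RobotFileParser

# The rule from the post above, as placed in the main site's robots.txt.
ROOT_RULES = ["User-agent: *", "Disallow: /subdomain/"]

parser = RobotFileParser()
parser.parse(ROOT_RULES)  # parsed in memory, no network request

# The same content reached two ways (hypothetical URLs).
via_directory = "http://www.mydomain.com/subdomain/page.html"
via_subdomain = "http://subdomain.mydomain.com/page.html"

# Rules are matched against the URL path only, never the host name.
blocked_via_directory = not parser.can_fetch("*", via_directory)
blocked_via_subdomain = not parser.can_fetch("*", via_subdomain)

print(blocked_via_directory)  # True: the path starts with /subdomain/
print(blocked_via_subdomain)  # False: the path is just /page.html
```

So the rule helps for URLs under www.mydomain.com/subdomain/, but pages requested through the subdomain's own host name have a different path and are not matched by it.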