Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we're not completely removing the content (many posts will still be viewable), we have locked both new posts and new replies.
Robots.txt on subdomains
-
Hi guys!
I keep reading conflicting information on this and it's left me a little unsure. Am I right in thinking that a website with a subdomain of shop.sitetitle.com will share the same robots.txt file as the root domain?
-
That's about as comprehensive an answer as I could have hoped for. Thanks Ryan, really appreciated.
-
Mostly no. I say 'mostly' because when a site works on both www and non-www, the two hostnames are almost always pulling files from the same location (hence the warnings around duplicate content), so both www.domain.com/robots.txt and domain.com/robots.txt are going to work. This is the most common example of a subdomain sharing a robots.txt file.
However, subdomains that are set up as their own sites have their own robots.txt. Take a look at the many differences between subdomain1-1000.wordpress.com/robots.txt and wordpress.com/robots.txt. If you set up a subdomain that isn't just a reflection of your root domain, then you'll need to create a robots.txt file for it as well. Cheers!
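To make the "mostly no" concrete: crawlers look for robots.txt at the root of each hostname, so the file that applies to a page follows from that page's host. Here is a minimal Python sketch using only the standard library. The function name `robots_txt_url` and the example page paths are made up for illustration; shop.sitetitle.com is the hostname from the question.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt location that applies to page_url.

    The location is built from the page's scheme and host only,
    so a different subdomain always means a different robots.txt URL.
    """
    parts = urlsplit(page_url)
    # Keep scheme and host (netloc); replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    # Hypothetical page URLs on the root domain and on the shop subdomain.
    print(robots_txt_url("https://www.sitetitle.com/about"))
    # https://www.sitetitle.com/robots.txt
    print(robots_txt_url("https://shop.sitetitle.com/products/widget?ref=home"))
    # https://shop.sitetitle.com/robots.txt
```

Whether those two URLs return the same file depends entirely on how the server is set up. If both hostnames are served from the same document root they will match; otherwise the subdomain needs a file of its own.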
Related Questions
-
Robots.txt allows wp-admin/admin-ajax.php
Hello, Mozzers!
I noticed something peculiar in the robots.txt used by one of my clients: Allow: /wp-admin/admin-ajax.php
What would be the purpose of allowing a search engine to crawl this file?
Is it OK? Should I do something about it?
Everything else on /wp-admin/ is disallowed.
Thanks in advance for your help.
-AK
Technical SEO | AndyKubrin
-
Multiple robots.txt files on server
Hi! I previously hired a developer to put up my site and noticed afterwards that he did not know much about SEO. This led me to start learning myself and applying some changes step by step. One of the things I am currently doing is inserting a sitemap reference in the robots.txt file (which was not there before). But just now, when I wanted to upload the file via FTP to my server, I found multiple ones, in different sizes, and I don't know what to do with them. Can I remove them? I have downloaded and opened them and they seem to be 2 text files and 2 duplicates. Names:
robots.txt (original duplicate)
robots.txt-Original (original)
robots.txt-NEW (other content)
robots.txt-Working (other content duplicate)
Would really appreciate help and expert suggestions. Thanks!
Technical SEO | mjukhud
-
Images, CSS and Javascript on subdomain or external website
Hi guys, I came across webshops that put images, CSS and JavaScript on different websites or subdomains. Does this boost SEO results? On our WordPress webshop all the source files are served from our own domain name:
www.ourdomainname.com/wp-includes/js/jquery/jquery.js?ver=1.11.3
www.ourdomainname.com/wp-content/uploads/2015/09/example.jpg
Examples from other websites:
Website 1: https://www.zalando.nl/heren-home/
Source code:
https://secure-i3.ztat.net//camp/03/d5/1a0168ac81f2ffb010803d108221.jpg
https://secure-media.ztat.net/media/cms/adproduct/ad-product.min.css?_=1447764579000
Website 2: https://www.bol.com/nl/index.html
Source code:
https://s.s-bol.com/nl/static/css/main/webselfservice.1358897755.css
//s.s-bol.com/nl/upload/images/logos/bol-logo-500500.jpg
Website 3: http://www.wehkamp.nl/
Source code:
https://static.wehkamp.nl/assets/styles/themes/wehkamp.color.min.css?v=f47bf1
http://assets.wehkamp.com/i/wehkamp/350-450-layer-SDD-wk51-v3.jpg
Technical SEO | Happy-SEO
-
"Fourth-level" subdomains. Any negative impact compared with regular "third-level" subdomains?
Hey Moz, a new client has a site that uses subdomains ("third-level" stuff like location.business.com) and "fourth-level" subdomains (location.parent.business.com). Are these fourth-level addresses at risk of being treated differently than the other subdomains? Screaming Frog, for example, doesn't return these fourth-level addresses when crawling business.com, except in the External tab. But maybe I'm just configuring the crawls incorrectly. These addresses rank, but I'm worried that we're losing some link juice along the way. Any thoughts would be appreciated!
Technical SEO | jamesm5i
-
Speed benefits from loading images from a subdomain
I have read that loading images from a subdomain of your site instead of the main domain will give you speed benefits on load time. Has anyone actually seen that to be the case? Thanks!
Technical SEO | Gordian
-
Invisible robots.txt?
So here's a weird one... A client comes to me for some simple changes, and it turns out there are some major issues with the site. One of them is that none of the correct content pages are showing up in Google, just ancillary (outdated) ones. It looks like a real problem, because even the main homepage isn't showing up for a "site:domain.com" search. So I add the site to Webmaster Tools and, after an hour or so, I get the red bar of doom: "robots.txt is blocking important pages." I check it out in Webmaster Tools and, sure enough, it's a "User-agent: * Disallow: /". ACK! But wait... there's no robots.txt to be found on the server. I can go to domain.com/robots.txt and see it, but there's nothing via FTP. I upload a new one and, thankfully, that is now showing, but I've never seen that before. Question is: can a robots.txt file be stored in a way that can't be seen? Thanks!
Technical SEO | joshcanhelp
-
Subdomain Removal in Robots.txt with Conditional Logic??
I would like to see if there is a way to add conditional logic to the robots.txt file so that when we push from DEV to PRODUCTION, we don't have to remember to NOT push the robots.txt file OR to edit it when it goes live. My specific situation is this: I have www.website.com, dev.website.com and new.website.com, and somehow Google has indexed dev.website.com and new.website.com. I'd like these to be removed from Google's index as they are causing duplicate content. Should I:
a) add 2 new GWT entries for dev.website.com and new.website.com and VERIFY ownership? If I do this, then when the files are pushed to LIVE, won't the files contain the VERIFY META CODE for the DEV version even though it's now LIVE? (Hope that makes sense.)
b) write a robots.txt file that specifies "DISALLOW: DEV.website.com/"? Is that possible? I have only seen examples of DISALLOW with a "/" at the beginning...
Hope this makes sense, can really use the help! I'm on a Windows Server 2008 box running ColdFusion websites.
Technical SEO | ErnieB
-
How to move my blog from subdomain to subfolder?
Not an unusual situation: I have a blog on blog.domain.com with quite a few blog postings. The platform is old and will be scrapped, but the blog content itself is going to be moved to domain.com/blog. The current process is that we are manually listing all linked-to content pages, and we are going to 301 redirect them to their counterparts on the new blog. This is going to be a tedious process.
A) Is there any way to automate the moving of the blog?
B) What is the best way to do the massive 301 redirect: PHP headers or .htaccess? Should we move the individual pages with redirects, or redirect the domain in the .htaccess (it will be very difficult to match all the titles and file structure)?
Technical SEO | MarloSchneider