Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Robots.txt on subdomains
-
Hi guys!
I keep reading conflicting information on this and it's left me a little unsure. Am I right in thinking that a website with a subdomain of shop.sitetitle.com will share the same robots.txt file as the root domain?
-
That's about as comprehensive an answer as I could have hoped for. Thanks Ryan, really appreciated.
-
Mostly no. I say 'mostly' because a lot of times when you look at a site using www and no-www if both of those work they're almost always pulling files from the same location (hence the warnings around duplicate content), so both www.domain.com/robots.txt and domain.com/robots.txt are going to work. This is the dominant example of a subdomain sharing a robots.txt file. However, on domains that are set up as their own subdomains they have different robots.txt. Take a look at the many differences between subdomain1-1000.wordpress.com/robots.txt vs wordpress.com/robots.txt. If you set up a subdomain that isn't just a reflection of your root domain, then you'll need to create a robots.txt file as well. Cheers!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Robots.txt Tester - syntax not understood
I've looked in the robots.txt Tester and I can see 3 warnings: There is a 'syntax not understood' warning for each of these. XML Sitemaps:
Technical SEO | | JamesHancocks1
https://www.pkeducation.co.uk/post-sitemap.xml
https://www.pkeducation.co.uk/sitemap_index.xml How do I fix or reformat these to remove the warnings? Many thanks in advance.
Jim0 -
Images, CSS and Javascript on subdomain or external website
Hi guy's, I came across webshops that put images, CSS and Javascript on different websites or subdomains. Does this boost SEO results? On our Wordpress webshop all the sourcescodes are placed after our own domainname: www.ourdomainname.com/wp-includes/js/jquery/jquery.js?ver=1.11.3'
Technical SEO | | Happy-SEO
www.ourdomainname.com/wp-content/uploads/2015/09/example.jpg Examples of other website: Website 1:
https://www.zalando.nl/heren-home/ Sourcecode:
https://secure-i3.ztat.net//camp/03/d5/1a0168ac81f2ffb010803d108221.jpg
https://secure-media.ztat.net/media/cms/adproduct/ad-product.min.css?_=1447764579000 Website 2:
https://www.bol.com/nl/index.html Sourcecode:
https://s.s-bol.com/nl/static/css/main/webselfservice.1358897755.css
//s.s-bol.com/nl/upload/images/logos/bol-logo-500500.jpg Website 3:
http://www.wehkamp.nl/ Sourcecode:
https://static.wehkamp.nl/assets/styles/themes/wehkamp.color.min.css?v=f47bf1
http://assets.wehkamp.com/i/wehkamp/350-450-layer-SDD-wk51-v3.jpg0 -
Reverse proxy a successful blog from subdomain to subfolder?
I have an ecommerce site that we'll call confusedseo.com. I created a WordPress blog and CNAME'd it to blog.confusedseo.com. Since then, the blog has earned a PageRank of 3 and a decent amount of organic traffic. I am considering a reverse proxy to forward blog.confusedseo.com to confusedseo.com/blog/. As I understand it, this will greatly help the "link juice" of the root domain. However, I'm concerned about any potential harm done to the existing SEO value of the blog. What, if anything, should I be doing to ensure that the reverse proxy doesn't hurt my "juice" rather than help it?
Technical SEO | | bedbugsupply0 -
"Fourth-level" subdomains. Any negative impact compared with regular "third-level" subdomains?
Hey moz New client has a site that uses: subdomains ("third-level" stuff like location.business.com) and; "fourth-level" subdomains (location.parent.business.com) Are these fourth-level addresses at risk of being treated differently than the other subdomains? Screaming Frog, for example, doesn't return these fourth-level addresses when doing a crawl for business.com except in the External tab. But maybe I'm just configuring the crawls incorrectly. These addresses rank, but I'm worried that we're losing some link juice along the way. Any thoughts would be appreciated!
Technical SEO | | jamesm5i0 -
Block Domain in robots.txt
Hi. We had some URLs that were indexed in Google from a www1-subdomain. We have now disabled the URLs (returning a 404 - for other reasons we cannot do a redirect from www1 to www) and blocked via robots.txt. But the amount of indexed pages keeps increasing (for 2 weeks now). Unfortunately, I cannot install Webmaster Tools for this subdomain to tell Google to back off... Any ideas why this could be and whether it's normal? I can send you more domain infos by personal message if you want to have a look at it.
Technical SEO | | zeepartner0 -
Will an XML sitemap override a robots.txt
I have a client that has a robots.txt file that is blocking an entire subdomain, entirely by accident. Their original solution, not realizing the robots.txt error, was to submit an xml sitemap to get their pages indexed. I did not think this tactic would work, as the robots.txt would take precedent over the xmls sitemap. But it worked... I have no explanation as to how or why. Does anyone have an answer to this? or any experience with a website that has had a clear Disallow: / for months , that somehow has pages in the index?
Technical SEO | | KCBackofen0 -
Blogs are best when hosted on domain, subdomain, or...?
I’ve heard the it is a best practice to host your blog within your site. I’ve also heard it’s best to put it on a subdomain. What do you believe is the best home for your blog and why?
Technical SEO | | vernonmack0 -
Should I set up a disallow in the robots.txt for catalog search results?
When the crawl diagnostics came back for my site its showing around 3,000 pages of duplicate content. Almost all of them are of the catalog search results page. I also did a site search on Google and they have most of the results pages in their index too. I think I should just disallow the bots in the /catalogsearch/ sub folder, but I'm not sure if this will have any negative effect?
Technical SEO | | JordanJudson0