Help with Robots.txt On a Shared Root
-
Hi,
I posted a similar question last week asking about subdomains but a couple of complications have arisen.
Two different websites I am looking after share the same root domain, which means that they will have to share the same robots.txt. Does anybody have suggestions for separating the two within the same file without complications? It's a tricky one.
Thank you in advance.
-
Okay so if you have one root domain you can only have one robots.txt file.
The reason I asked for an example is that there might be something you could put in the robots.txt to differentiate the two.
For example if you have
thisdomain.com and thatdomain.com
However if "thatdomain.com" uses a folder called shop ("thatdomain.com/shop") than you could prefix all your robots.txt file entries with /shop provided that "thisdomain.com" doesn't use the folder shop, Then all the /shop entries would only be applicable to "thatdomain.com". Does this make sense?
Don
-
It's not so much that one is a subdomain; it's that they are as different as Google and Yahoo, yet they share the same root. I wish I could show you, but I can't because of confidentiality.
The 303 wasn't put in place by me; I would have strongly suggested another method. I think it was set up so that both websites could be controlled from the same login, but it's opened a can of worms for SEO.
I don't want the two separate robots files; the developer insists it has to be that way.
-
Can you provide an example of the way the domains look, specifically where the root pages are?
Additionally, if you are 303-redirecting one of the domains to the other, why do you want two different robots.txt files? The one being 303'd will always redirect to the other, won't it?
Depending on the structure, you can create one robots.txt file that deals with two different domains, provided there is something unique about the root folders.
-
Thanks for your help so far.
The two websites are on different domain names but share the same root, as it's been built that way in Typo3. I don't know the developer's justification for the 303; it's something I wish we could change.
I'm not sure if there are specific directives you can put in the single robots.txt to differentiate the two; I've read a few conflicting arguments about how to do it.
-
Okay, so if you're using a 303 then you're saying the content you want for site X is actually located at site Y, which means you do not have two different subdomains. So there is no need for two robots.txt files, and your developer is correct that you can't use two robots.txt files. Since one site would be pointing to the other, you only have one subdomain.
However, a 303 is in general a poor choice of redirect and should likely be a 301, but I would have to understand why the 303 is being used to say that with 100% certainty. See a quick article about 303 here.
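If it were changed to a 301 and your hosting runs Apache (just an assumption, Fasthosts plans vary and Typo3 may handle redirects through the CMS instead), a minimal .htaccess sketch for pointing one placeholder domain at the other might look like this:
RewriteEngine On
# send every request for thatdomain.com permanently to thisdomain.com
RewriteCond %{HTTP_HOST} ^(www\.)?thatdomain\.com$ [NC]
RewriteRule ^(.*)$ https://thisdomain.com/$1 [R=301,L]
Treat this purely as an illustration of what a 301 does, not as the exact change to make on your setup.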
Hope this answers the question,
Don
-
It's Fasthosts. The developer is certain that we can't use two separate robots files. The second website has been set up with a 303.
-
What host are you using?
-
The developer of the website insists that they have to share the same robots.txt, and I am really not sure how he's set it up this way. I am beyond befuddled by this!
-
The subdomain has to be separated from the root in some fashion. I would assume, depending on your host, that there is a separate folder for the subdomain's files. Otherwise it would be chaos: say you installed forums on your forum subdomain and an e-commerce store on your shop subdomain, which index.php page would be served?
There has to be some separation; review your file manager and look for the subdomain folders. Once you find them, simply put a robots.txt into each of those folders.
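As a rough illustration (the folder names here are hypothetical and vary by host, and this only works if each subdomain's document root really is its own folder):
/web/                      document root for yourdomain.com, with its own robots.txt
/web/robots.txt
/web/forum/                document root for forum.yourdomain.com
/web/forum/robots.txt      served as the forum subdomain's robots.txt
/web/shop/                 document root for shop.yourdomain.com
/web/shop/robots.txt       served as the shop subdomain's robots.txt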
Hope this helps,
Don