Help with Robots.txt On a Shared Root
-
Hi,
I posted a similar question last week asking about subdomains, but a couple of complications have arisen.
Two different websites I am looking after share the same root domain, which means that they will have to share the same robots.txt. Does anybody have suggestions for separating the two within the same file without complications? It's a tricky one.
Thank you in advance.
-
Okay, so if you have one root domain you can only have one robots.txt file.
The reason I asked for an example is to see whether there is something you could put in the robots.txt to differentiate the two.
For example, say you have thisdomain.com and thatdomain.com.
If "thatdomain.com" uses a folder called shop ("thatdomain.com/shop"), then you could prefix all of its robots.txt entries with /shop. Provided that "thisdomain.com" doesn't use a folder called shop, all the /shop entries would only be applicable to "thatdomain.com". Does this make sense?
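A minimal sketch of that idea, using Python's standard `urllib.robotparser` to check how a shared file would behave. The `Disallow` paths (`/shop/checkout/` and `/shop/cart/`) are hypothetical placeholders, not taken from either real site. Robots.txt rules match on the URL path only, so the domain name plays no part in the decision:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical shared robots.txt: every rule is prefixed with /shop,
# a folder assumed to exist only on thatdomain.com.
SHARED_ROBOTS_TXT = """\
User-agent: *
Disallow: /shop/checkout/
Disallow: /shop/cart/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, url: str, agent: str = "*") -> bool:
    """Return True if the given user agent may crawl the URL."""
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    parser = build_parser(SHARED_ROBOTS_TXT)
    for url in (
        "http://thatdomain.com/shop/checkout/step1",
        "http://thatdomain.com/shop/products",
        "http://thisdomain.com/about",
    ):
        print(url, is_allowed(parser, url))
```

Note that the same rules would also block thisdomain.com/shop/checkout/ if that path ever existed, which is why the approach only works while the /shop folder is unique to one site.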
Don
-
It's not so much that one is a subdomain; it's that they are as different as Google and Yahoo, yet they share the same root. I wish I could show you, but I can't because of confidentiality.
The 303 wasn't put in place by me; I would have strongly suggested another method. I think it was set up so that both websites could be controlled from the same login, but it's opened a can of worms for SEO.
I don't want the two separate robots files, the developer insists it has to be that way.
-
Can you give me an example of how the domains look, specifically where the root pages are?
Additionally, if you are 303-redirecting one of the domains to the other, why do you want two different robots.txt files? The one being 303-redirected will always redirect to the other...?
Depending on the structures, you can create one robots.txt file that deals with 2 different domains, provided there is something unique about the root folders.
-
Thanks for your help so far.
The two websites have different domain names but share the same root, as they have been built this way on Typo3. I don't know the developer's justification for the 303; it's something I wish we could change.
I'm not sure if there are specific tags you can put in the sole robots.txt to differentiate the two. I have read a few conflicting arguments about how to do it.
-
Okay, so if you're using a 303 then you're saying the content you want for X site is actually located at Y site, which means you do not have 2 different subdomains. So there is no need for 2 robots.txt files, and your developer is correct that you can't use 2 robots.txt files. Since one site would be pointing to the other, you only have one subdomain.
However, a 303 is in general a poor choice of redirect and likely should be a 301, but I would have to understand why the 303 is being used to say that with 100% certainty. See a quick article about 303 here.
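As a rough illustration of the difference, here is a small sketch using Python's standard `http.HTTPStatus`. The helper name `describe_redirect` and the "permanent / not permanent" labels are my own shorthand, not official terminology. In HTTP terms, 301 and 308 say the resource has moved for good, while 303 only points the client at another resource to fetch:

```python
from http import HTTPStatus

# Status codes that the HTTP specification defines as permanent redirects.
PERMANENT_REDIRECTS = {
    HTTPStatus.MOVED_PERMANENTLY,   # 301
    HTTPStatus.PERMANENT_REDIRECT,  # 308
}


def describe_redirect(code: int) -> str:
    """Return a short label for a redirect status code (hypothetical helper)."""
    status = HTTPStatus(code)
    kind = "permanent" if status in PERMANENT_REDIRECTS else "not permanent"
    return f"{status.value} {status.phrase} ({kind})"


if __name__ == "__main__":
    for code in (301, 303):
        print(describe_redirect(code))
```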
Hope this answers the question,
Don
-
It's Fasthosts. The developer is certain that we can't use the two separate robots files. The second website has been set up on a 303.
-
What host are you using?
-
The developer of the website insists that they have to share the same robots.txt. I am really not sure how he's set it up this way. I am beyond befuddled with this!
-
The subdomain has to be separated from the root in some fashion. I would assume, depending on your host, that there is a separate folder for the subdomain stuff. Otherwise it would be chaos. Say you installed forums on your forum subdomain and an e-commerce store on your shop subdomain... which index.php page would be served?
There has to be some separation; review your file manager and look for the subdomain folders. Once found, you simply put a robots.txt into each of those folders.
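To see why one file per subdomain folder works: crawlers request robots.txt from the root of each host, so every subdomain has its own robots.txt URL. A minimal sketch with Python's standard `urllib.parse` (the function name `robots_txt_url` and the example.com hostnames are placeholders for illustration):

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL that governs the given page URL."""
    parts = urlsplit(page_url)
    # Keep scheme and host (including any subdomain and port); drop the rest.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    for url in (
        "http://forum.example.com/index.php",
        "http://shop.example.com/cart/view?id=1",
    ):
        print(url, "->", robots_txt_url(url))
```

Two subdomains therefore resolve to two different robots.txt locations, while two domains served from the very same document root would both end up reading the same physical file.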
Hope this helps,
Don