Using one robots.txt for two websites
-
I have two websites that are hosted in the same CMS. Rather than having two separate robots.txt files (one for each domain), my web agency has created one which lists the sitemaps for both websites, like this:
User-agent: *
Disallow:
Sitemap: https://www.siteA.org/sitemap
Sitemap: https://www.siteB.com/sitemap
Is this ok? I thought you needed one robots.txt per website which provides the URL for the sitemap. Will having both sitemap URLs listed in one robots.txt confuse the search engines?
-
Hi @gpainter,
Thanks for your help. I can't see anything specific in that link that says you can't have two sitemaps in one robots.txt. Where it mentions the sitemap, it does say "You can specify multiple sitemap fields", although I'm not sure whether this means listing several sitemap URLs under a single 'Sitemap' entry or using a separate 'Sitemap' line for each URL?
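For what it's worth, "multiple sitemap fields" is normally read as several separate Sitemap: lines, each carrying one URL. A minimal sketch of how a standard parser reads the file quoted above, using Python's built-in urllib.robotparser (Python 3.8+ for site_maps()). It only shows how the syntax is parsed and says nothing about how a given search engine treats a sitemap that lives on a different host:

```python
from urllib.robotparser import RobotFileParser

# The combined robots.txt from the question, one directive per line
ROBOTS_TXT = """\
User-agent: *
Disallow:
Sitemap: https://www.siteA.org/sitemap
Sitemap: https://www.siteB.com/sitemap
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Each Sitemap: line becomes its own entry in the returned list
sitemaps = parser.site_maps()
print(sitemaps)

# An empty Disallow: blocks nothing, so any path is fetchable
print(parser.can_fetch("*", "https://www.siteA.org/any-page"))
```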
-
@ciehmoz Hey, I've replied to the other thread too.
The best approach here is to use a separate robots.txt file for each website.
A single shared robots.txt file would only have worked if both sites were served from the same host.
Don't forget to include the corresponding sitemap in each new robots.txt file. Hope this works out, cheers.
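Since both sites sit in the same CMS, one way to get a separate file per domain is to build the robots.txt response from the requested host name. This is only a rough sketch, not how any particular CMS does it: the SITEMAPS mapping and the robots_txt_for function are hypothetical names, and the URLs are the ones from the question.

```python
# Hypothetical mapping from (lowercased) host name to that site's own sitemap URL
SITEMAPS = {
    "www.sitea.org": "https://www.siteA.org/sitemap",
    "www.siteb.com": "https://www.siteB.com/sitemap",
}


def robots_txt_for(host: str) -> str:
    """Build the robots.txt body for one host, listing only that host's sitemap."""
    # Drop an optional port and normalise case, since host names are case-insensitive
    key = host.split(":")[0].lower()
    lines = ["User-agent: *", "Disallow:"]
    sitemap = SITEMAPS.get(key)
    if sitemap is not None:
        lines.append(f"Sitemap: {sitemap}")
    return "\n".join(lines) + "\n"


print(robots_txt_for("www.siteA.org"))
print(robots_txt_for("www.siteB.com"))
```

The request handler for /robots.txt would pass in the Host header of the incoming request, so each domain only ever advertises its own sitemap.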
-
Hey @ciehmoz
Just replied to your other thread. You will need one robots.txt per site. Referring to two sitemaps in one robots.txt will confuse Google.
Info here - https://developers.google.com/search/docs/advanced/robots/robots_txt
Good Luck