How does robots.txt affect aliased domains?
-
Several of my sites are aliased (hosted in subdirectories off the root domain on a single hosting account, but visible at www.theSubDirectorySite.com). Not ideal, I know, but that's a different issue.
I want to block bots from viewing those files that are accessible in subdirectories on the main hosting account, www.RootDomain.com/SubDirectorySite/, and force the bots to look at www.SubDirectorySite.com instead.
I used the canonical tag (rel="canonical") to point bots away from the subdirectory version of the site, but I am wondering what will happen if I use robots.txt to block those files from within the root domain.
Will the bots, specifically Googlebot, still index the site at its own URL, www.AnotherSite.com, even if I've blocked that directory on the root domain with Disallow: /AnotherSite/ ?
THANK YOU!!!
-
I'm assuming you can't 301-redirect (and that you still need the sub-directory versions to be reachable by humans)? I'm not sure the cross-domain canonical will work completely, since I don't have a good example of a canonical implementation that points from a sub-folder to a separate root domain. If the "sites" are identical, it should be OK.
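For reference, here is a rough sketch of what the cross-domain canonical would need to emit on each sub-directory page. The `ALIASES` mapping and the `canonical_link` helper are made up for illustration, and I'm assuming the aliased domain serves the same files with the sub-directory prefix stripped:

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

# Hypothetical mapping: sub-directory prefix on the root domain -> aliased host.
ALIASES = {"/SubDirectorySite/": "www.SubDirectorySite.com"}


def canonical_link(url: str) -> str:
    """Build the <link rel="canonical"> tag for a page URL.

    Pages under an aliased sub-directory point at the aliased domain;
    every other URL is treated as its own canonical.
    """
    parts = urlsplit(url)
    target = url
    for prefix, alias_host in ALIASES.items():
        if parts.path.startswith(prefix):
            # Strip the sub-directory prefix and swap in the aliased host.
            path = "/" + parts.path[len(prefix):]
            target = urlunsplit((parts.scheme, alias_host, path, parts.query, ""))
            break
    return f'<link rel="canonical" href="{escape(target, quote=True)}">'


print(canonical_link("http://www.RootDomain.com/SubDirectorySite/about.html"))
```

The tag only does its job if Google can still crawl the sub-directory page, because a page blocked in robots.txt is never fetched and its canonical is never seen.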
Whether robots.txt helps is going to depend a bit on how people reach those sub-directory URLs. If there are links pointing to the sub-directory versions, then blocking them will cut off that link-juice, and the canonical or a 301 would be the better choice.
Blocking the sub-directory shouldn't automatically block the domain it aliases, too, unless for some reason that sub-directory is the only crawl path Google has to the outside domain. As long as they're crawling the outside domain separately, I think you'll be OK. I'm just not sure if robots.txt is necessary here.
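To illustrate why the block shouldn't carry over: crawlers fetch robots.txt separately for each hostname, so a rule in the root domain's file is only matched against root-domain URLs. Below is a small sketch using Python's standard `urllib.robotparser`. The file contents in `ROBOTS_BY_HOST` are hypothetical, and Python's parser only approximates how Googlebot reads the file:

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, keyed by (lowercased) hostname.
ROBOTS_BY_HOST = {
    "www.rootdomain.com": "User-agent: *\nDisallow: /AnotherSite/\n",
    "www.anothersite.com": "User-agent: *\nDisallow:\n",  # empty Disallow = allow all
}


def is_crawlable(url: str, user_agent: str = "Googlebot") -> bool:
    """Check a URL against the robots.txt of its own host only."""
    host = urlsplit(url).hostname  # hostname is returned lowercased
    parser = RobotFileParser()
    parser.parse(ROBOTS_BY_HOST[host].splitlines())
    return parser.can_fetch(user_agent, url)


# Blocked: the root domain's rule covers the sub-directory path.
print(is_crawlable("http://www.RootDomain.com/AnotherSite/page.html"))
# Not blocked: the aliased host is judged by its own robots.txt.
print(is_crawlable("http://www.AnotherSite.com/page.html"))
```

One thing to check in your setup: if the alias maps www.AnotherSite.com to the /AnotherSite/ folder, then the file served at www.AnotherSite.com/robots.txt is most likely the one sitting inside that folder, so make sure it doesn't contain a block of its own.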
Sorry, the devil may be in the details on this one. Happy to take a closer look in Private Q&A, if you want to give out some specifics.