How does robots.txt affect aliased domains?
-
Several of my sites are aliased (hosted in subdirectories off the root domain on a single hosting account, but visible at www.theSubDirectorySite.com). Not ideal, I know, but that's a different issue.
I want to block bots from viewing those files that are accessible in subdirectories on the main hosting account, www.RootDomain.com/SubDirectorySite/, and force the bots to look at www.SubDirectorySite.com instead.
I used the canonical link tag (rel="canonical") to point bots away from the sub directory site, but I am wondering what will happen if I use robots.txt to block those files from within the root domain.
Will the bots, specifically Googlebot, still index the site at its own URL, www.AnotherSite.com, even if I've blocked that directory with Disallow: /AnotherSite/?
THANK YOU!!!
-
I'm assuming you can't 301-redirect (and that you still need the sub-directory versions to be reachable by humans)? I'm not sure the cross-domain canonical will work completely. I don't have a good example of a sub-folder to root domain canonical implementation. If the "sites" are identical, it should be ok.
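For reference, here is a rough sketch of what the cross-domain canonical would point at for each sub-directory URL. The `ALIASES` mapping and the `canonical_link` helper are made up for illustration, using the placeholder domains from the question; it only shows the URL rewriting, not how your templates would output the tag.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical mapping: sub-directory on the root domain -> aliased hostname.
ALIASES = {"/SubDirectorySite/": "www.SubDirectorySite.com"}


def canonical_link(url: str) -> str:
    """Return a rel="canonical" tag for the aliased domain, or "" if no alias applies."""
    parts = urlsplit(url)
    for prefix, host in ALIASES.items():
        if parts.path.startswith(prefix):
            # Strip the sub-directory prefix so the path is relative to the alias root.
            path = "/" + parts.path[len(prefix):]
            target = urlunsplit((parts.scheme, host, path, parts.query, ""))
            return f'<link rel="canonical" href="{target}" />'
    return ""


tag = canonical_link("http://www.RootDomain.com/SubDirectorySite/about.html")
print(tag)
```

A page outside any aliased sub-directory gets no tag from this helper, so the root domain's own pages are left alone.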
Whether robots.txt makes sense depends a bit on how people reach the sub-directory versions. If there are links pointing to them, blocking will cut off that link-juice, since bots can't crawl a blocked page to see its canonical tag, so the canonical or a 301 will be the better option.
Blocking the sub-directory shouldn't automatically block the domain it aliases, too, since robots.txt rules only apply to the host they're served from, unless for some reason that sub-directory is the only crawl path Google has to the outside domain. As long as they're crawling the outside domain separately, I think you'll be ok. I'm just not sure robots.txt is necessary here.
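If it helps to see the per-host behavior, here is a minimal sketch using Python's standard `urllib.robotparser`. Both robots.txt bodies are assumptions: the first is what the root domain would serve with the Disallow from the question, the second is a hypothetical allow-everything file served by the aliased domain. It models how a standards-following parser reads each file, not what Google will actually index.

```python
from urllib.robotparser import RobotFileParser


def parser_for(robots_txt: str) -> RobotFileParser:
    """Build a parser from robots.txt text without fetching anything."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# Assumed content of http://www.RootDomain.com/robots.txt
root_rules = parser_for("""\
User-agent: *
Disallow: /AnotherSite/
""")

# Assumed content of http://www.AnotherSite.com/robots.txt (empty Disallow = allow all)
alias_rules = parser_for("""\
User-agent: *
Disallow:
""")

# Each host is checked only against its own robots.txt.
root_ok = root_rules.can_fetch("Googlebot", "http://www.RootDomain.com/AnotherSite/page.html")
alias_ok = alias_rules.can_fetch("Googlebot", "http://www.AnotherSite.com/page.html")
print(root_ok, alias_ok)
```

One thing to watch in an aliased setup: the file a bot gets at www.AnotherSite.com/robots.txt is whatever sits in that sub-directory on disk, so make sure it doesn't carry over a Disallow meant for the root domain.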
Sorry, the devil may be in the details on this one. Happy to take a closer look in Private Q&A, if you want to give out some specifics.