How does robots.txt affect aliased domains?
-
Several of my sites are aliased (hosted in subdirectories off the root domain on a single hosting account, but visible at www.theSubDirectorySite.com). Not ideal, I know, but that's a different issue.
I want to block bots from viewing those files that are accessible in subdirectories on the main hosting account, www.RootDomain.com/SubDirectorySite/, and force the bots to look at www.SubDirectorySite.com instead.
I used the canonical tag to point bots away from the subdirectory version of each site, but I am wondering what will happen if I use robots.txt to block those files from within the root domain.
Will the bots, specifically Googlebot, still index the site at its own URL, www.AnotherSite.com, even if I've blocked that directory on the root domain with Disallow: /AnotherSite/ ?
THANK YOU!!!
-
I'm assuming you can't 301-redirect (and that you still need the sub-directory versions to be reachable by humans)? I'm not sure the cross-domain canonical will work completely. I don't have a good example of a sub-folder-to-root-domain canonical implementation. If the "sites" are identical, it should be ok.
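Whether robots.txt helps is going to depend a bit on how people reach the sub-directory versions. If there are links to the sub-directory URLs, then blocking will cut off that link-juice (and the canonical or a 301 will be better). Keep in mind that if a page is blocked in robots.txt, Google won't crawl it, so it won't see the canonical tag on that page either.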
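If you want to sanity-check the canonical on a sub-directory copy, here is a minimal Python sketch. The page markup and the page.html path are made up for illustration; the idea is only that the copy at www.RootDomain.com/SubDirectorySite/ should carry a canonical pointing at the matching URL on www.SubDirectorySite.com.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        is_canonical = (attr.get("rel") or "").lower() == "canonical"
        if tag == "link" and is_canonical and self.canonical is None:
            self.canonical = attr.get("href")


# Hypothetical HTML as served from www.RootDomain.com/SubDirectorySite/page.html
page = (
    "<html><head>"
    '<link rel="canonical" href="http://www.SubDirectorySite.com/page.html">'
    "</head><body>Hello</body></html>"
)

finder = CanonicalFinder()
finder.feed(page)
canonical_url = finder.canonical
print(canonical_url)
```

In practice you would fetch the real sub-directory URL and feed that HTML in; the hard-coded string just keeps the sketch self-contained.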
Blocking the sub-directory shouldn't automatically block the domain it aliases, too, unless for some reason that sub-directory is the only crawl path Google has to the outside domain. As long as they're crawling the outside domain separately, I think you'll be ok. I'm just not sure if robots.txt is necessary here.
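To illustrate why the block shouldn't carry over: robots.txt rules are read per hostname, so crawlers check www.RootDomain.com/robots.txt for the sub-directory URLs and www.AnotherSite.com/robots.txt for the aliased domain. Here is a rough Python sketch using the standard library's urllib.robotparser. Both robots.txt bodies are assumptions for illustration, not your actual files, and page.html is a made-up path.

```python
from urllib.robotparser import RobotFileParser


def make_parser(robots_txt: str) -> RobotFileParser:
    """Build a parser from robots.txt text (no network access needed)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# Assumed contents of http://www.RootDomain.com/robots.txt
root_rules = make_parser("User-agent: *\nDisallow: /AnotherSite/\n")

# Assumed contents of http://www.AnotherSite.com/robots.txt (allows everything)
alias_rules = make_parser("User-agent: *\nDisallow:\n")

# Each URL is checked against the rules of the host it lives on.
root_copy_allowed = root_rules.can_fetch(
    "Googlebot", "http://www.RootDomain.com/AnotherSite/page.html"
)
alias_allowed = alias_rules.can_fetch(
    "Googlebot", "http://www.AnotherSite.com/page.html"
)

print(root_copy_allowed, alias_allowed)
```

One thing to watch with aliased setups: the file served at www.AnotherSite.com/robots.txt is whatever sits inside the /AnotherSite/ folder, so make sure that file doesn't itself contain the Disallow: /AnotherSite/ line by accident.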
Sorry, the devil may be in the details on this one. Happy to take a closer look in Private Q&A, if you want to give out some specifics.