Robots.txt disallow subdomain
-
Hi all,
I have a development subdomain whose content gets copied to the live domain. Because I don't want this dev subdomain to get crawled, I'd like to implement a robots.txt for it only. The problem is that this robots.txt must not end up disallowing the live domain. Is there a way to create a robots.txt for the development subdomain only?
Thanks in advance!
-
I would suggest you talk to your developers, as Theo suggests, about excluding visitors from your test site.
-
The copying is a manual process, and I don't want to take any risks with the live environment. An HttpHandler for robots.txt could be a solution, and I'm going to discuss it with one of our developers. Other suggestions are still welcome, of course!
-
Do you FTP-copy one domain to the other? If it's a manual process, then keeping the test domain's robots.txt off the live site is as simple as excluding that one file from the copy.
If you automate the copy and want the code to behave differently based on the base URL, you could create an HttpHandler for robots.txt that delivers a different version depending on the host in the HTTP request header.
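As a rough illustration of that suggestion, here is a minimal sketch assuming a classic ASP.NET (System.Web) site; the "dev." host prefix is a placeholder for the actual development subdomain, and the handler would still need to be mapped to the robots.txt path in web.config:

```csharp
using System;
using System.Web;

// Sketch only: serve a blocking robots.txt on the development host and a
// permissive one everywhere else, so the same code is safe on the live domain.
public class RobotsTxtHandler : IHttpHandler
{
    public bool IsReusable { get { return true; } }

    public void ProcessRequest(HttpContext context)
    {
        context.Response.ContentType = "text/plain";

        // "dev." is a placeholder prefix for the development subdomain.
        string host = context.Request.Url.Host;
        bool isDev = host.StartsWith("dev.", StringComparison.OrdinalIgnoreCase);

        if (isDev)
        {
            // Block all well-behaved crawlers on the development subdomain.
            context.Response.Write("User-agent: *\nDisallow: /\n");
        }
        else
        {
            // Live domain: an empty Disallow value allows full crawling.
            context.Response.Write("User-agent: *\nDisallow:\n");
        }
    }
}
```

Because the decision is made per request based on the host header, copying files from dev to live can never carry a blocking robots.txt across.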
-
You could use environment variables (for example in your env.ini or config.ini file) that are set to DEVELOPMENT, STAGING, or LIVE depending on the environment the code finds itself in.
With the exact same code, your website would then either limit access by IP address (in the development environment) or allow all IP addresses (in the live environment). With this setup you can also vary other settings per environment, such as the level of detail shown in error reporting, or connecting to a test database rather than the live one.
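A short sketch of that idea, written in C# to match the HttpHandler discussion above; the "Environment" and "AllowedDevIp" appSettings keys are hypothetical names used for illustration, not a standard API:

```csharp
using System;
using System.Configuration;
using System.Web;

// Sketch only: the same code ships to every environment; only the config differs.
public static class EnvironmentGate
{
    public static bool RequestIsAllowed(HttpRequest request)
    {
        // Hypothetical appSettings key holding DEVELOPMENT, STAGING, or LIVE.
        string environment = ConfigurationManager.AppSettings["Environment"];

        if (string.Equals(environment, "LIVE", StringComparison.OrdinalIgnoreCase))
        {
            // Live environment: allow everyone, crawlers included.
            return true;
        }

        // Development/staging: only accept requests from a whitelisted IP address.
        string allowedIp = ConfigurationManager.AppSettings["AllowedDevIp"];
        return request.UserHostAddress == allowedIp;
    }
}
```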
[this was supposed to be a reply, but I accidentally clicked the wrong button. Hitting 'Delete reply' results in an error.]
-
Thanks for your quick reply, Theo. Unfortunately, that htpasswd file will also get copied to the live environment, so our live websites would end up password-protected. Could there be another solution for this?
-
I'm sure there is, but I'm guessing you also don't want any human visitors going to your development subdomain and viewing what is being done there? I'd suggest you either limit access by IP address (thereby effectively blocking out Google in one move) and/or implement a .htpasswd solution where developers log in with their credentials to your development area (which blocks out Google as well).
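For reference, a minimal .htaccess sketch along those lines, assuming the development subdomain runs on Apache 2.4; the IP address and the .htpasswd path are placeholders, and on an IIS/ASP.NET stack the equivalent rules would live in web.config instead:

```apache
# Sketch only: let a request through if it comes from the office IP
# OR presents valid credentials from the .htpasswd file.
AuthType Basic
AuthName "Development area"
AuthUserFile /path/to/.htpasswd

<RequireAny>
    Require ip 203.0.113.10
    Require valid-user
</RequireAny>
```

Either check keeps crawlers (and casual visitors) out of the development area without touching robots.txt at all.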
Related Questions
-
Does a root domain get SEO power from its subdomains?
Hi there! I'd appreciate your help with the following case: a) Current 10-year-old website (community) on root domain "example.com" (250,000 incoming quality-backlinks) will move to the new subdomain "newsub.example.com" (301 redirects to the new subdomain for all current subfolders) b) A new website (shop) will launch on the root domain "example.com" Question: Will the new website on "example.com" get SEO power from the old website on "newsub.example.com"? SEO power = linkjuice/authority/trust/history/etc. from the 250,000 backlinks. What I'm trying to achieve: Maintain the built-up SEO power for the root domain "example.com" Thanks for sharing your thoughts on this! P.S. Plenty has been written about subdomains inheriting from their root domains (so please don't share input on the subdomain vs. subfolder debate). But I can't find satisfactory info about the other way around (root domains inheriting from their subdomains), e.g. if wikia.com gets SEO power from its subdomains superman.wikia.com, starwars.wikia.com, etc.)
Intermediate & Advanced SEO | ebebeb0
-
Changing URL to a subdomain?
Hi there, I had a website www.footballshirtcollective.com that has been live since July. It contains both content and eCommerce. I am now separating out the content so that; 1. The master domain is www.footballshirtcollective.com (content) pointing to a new site 2. Subdomain is store.footballshirtcollective.com (ecommerce) - pointing to the existing site. What do you advise I can do to minimise the impact on my search? Many thanks Mike
Intermediate & Advanced SEO | mjmaxwell0
-
URLs with parameters + canonicals + meta robots
Hi Moz community! I'm posting a new question here as I couldn't find a specific answer to the case I'm facing. Along with canonical tags, we are implementing meta robots on our pages (an e-commerce website with thousands of pages). Most cases have been covered, but I still have one unanswered one: our products are linked from list pages (mostly categories), but those links almost always include a tracking parameter (i.e. /my-product.html?ref=xxx). Product URLs are secured with a canonical tag (referring only to the clean URL /my-product.html), but what would be the best solution regarding the meta robots? For now we opted for a meta robots 'noindex, follow' for non-canonical URLs (the ones unfortunately linked from our category/list pages), but I'm afraid that it could hurt our SEO (apparently no juice is passed from URLs with a noindex robots tag), and maybe even prevent bots from crawling our website properly... Would it be best to have no meta robots at all on these product URLs with parameters? (We obviously can't have 'index, follow' when the canonical points to another URL!) Thanks for your help!
Intermediate & Advanced SEO | JessicaZylberberg0
-
Cookieless subdomains Vs SEO
We have one .com that holds all our unique content and 25 other ccTLD sites that are translated versions of the .com for each country we operate in. They are not linked together, but we have hreflang'd it all together. We now want to serve all static content of our global website (26 local country sites: .com, .co.uk, .se, etc.) from one cookieless subdomain. The benefit is a speed improvement. The question is, from an SEO perspective, can all static content come from static.domain.com, or should we set one up for each ccTLD, where it would come from static.domain.xx (where xx is localised to the domain in question)?
Intermediate & Advanced SEO | aires-fb770
-
Meta NoIndex tag and Robots Disallow
Hi all, I hope you can spend some time to answer my first of a few questions 🙂 We are running a Magento site - the layered/faceted navigation nightmare has created thousands of duplicate URLs! Anyway, during my process of tackling the issue, I disallowed in robots.txt anything in the querystring that was not a p (allowed for pagination). After checking some pages in Google, I did a site:www.mydomain.com/specificpage.html and a few duplicates came up along with the original, showing "There is no information about this page because it is blocked by robots.txt". So I had also added meta noindex, follow on all these duplicates, but I guess it wasn't being read because of robots.txt. So coming to my question: did robots.txt block access to these pages? If so, were these already in the index, and after disallowing them with robots.txt, could Googlebot no longer read the meta noindex? Does meta noindex, follow on pages actually help Googlebot decide to remove these pages from the index? I thought robots.txt would stop and prevent indexation? But I've read this: "Noindex is a funny thing, it actually doesn't mean 'You can't index this', it means 'You can't show this in search results'. Robots.txt disallow means 'You can't index this' but it doesn't mean 'You can't show it in the search results'." I'm a bit confused about how to use these, both in preventing duplicate content in the first place and then in helping to address dupe content once it's already in the index. Thanks! B
Intermediate & Advanced SEO | bjs2010
-
Looking for re-assurance on this one: Sitemap approach for multi-subdomains
Hi All: Just looking for a bit of "yeah it'll be fine" reassurance on this before we go ahead and implement: We've got a main accommodation listing website under www.* and a separate travel content site using a completely different platform on blog.* (same domain - diffn't sub-domain). We pull in snippets of content from blog.* > www.* using a feed and we have cross-links going both ways, e.g. links to find accommodation in blog articles and links to blog articles from accommodation listings. Look-and-feel wise they're fully integrated. The blog.* site is a tab under the main nav. What i'd like to do is get Google (and others) to view this whole thing as one site - and attribute any SEO benefit of content on blog.* pages to the www.* domain. Make sense? So, done a bit of reading - and here's what i've come up with: Seperate sitemaps for each, both located in the root of www site www.example.com/sitemap-www www.example.com/sitemap-blog robots.txt in root of www site to have single sitemap entry: sitemap : www.example.com/sitemap-www robots.txt in root of blog site to have single sitemap entry: sitemap: www.example.com/sitemap-blog Submit both sitemaps to Webmaster tools. Does this sound reasonable? Any better approaches? Anything I'm missing? All input appreciated!
Intermediate & Advanced SEO | AABAB0
-
Rebuilding a site with pre-existing high authority subdomains
I'm rebuilding a real estate website with 4 subdomains that have Page Authorities between 45 and 50. Since it's a real estate website it has 20,000+ pages of unique (listing) content PER sub-domain. The subdomains are structured like: washington.xyzrealty.com and california.xyzrealty.com. The root domain has a ~50 Page Authority. The site is about 7 years old. My preference is to focus all of my efforts on the primary domain going forward, but I don't want to waste the power of the subdomains. I'm considering: 1. Putting blogs or community/city pages on the subdomains 2. 301 redirecting all of the existing pages to matching pages on the new root domain. 3. Any other ideas??
Intermediate & Advanced SEO | jonathanwashburn0
-
What has this subdomain done to recover from Panda?
I found that doctor.webmd.com was affected by Google Panda, and then recovered (if you look at traffic on compete.com). What do you think they did to recover?
Intermediate & Advanced SEO | nicole.healthline0