Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
What is the best method to block a sub-domain, e.g. staging.domain.com/ from getting indexed?
-
Now that Google considers subdomains as part of the TLD I'm a little leery of testing robots.txt with something like:
staging.domain.com
User-agent: *
Disallow: /in fear it might get the www.domain.com blocked as well. Has anyone had any success using robots.txt to block sub-domains? I know I could add a meta robots tag to the staging.domain.com pages but that would require a lot more work.
-
Just make sure that when/if you copy over the staging site to the live domain that you don't copy over the robots.txt, htaccess, or whatever means you use to block that site from being indexed and thus have your shiny new site be blocked.

-
I agree. The name of your subdomain being "staging" didn't register at all with me until Matt brought it up. I was offering a generic response to the subdomain question whereas I believe Matt focused on how to handle a staging site. Interesting viewpoint.
-
Matt/Ryan-
Great discussion, thanks for the input. The staging.domain.com is just one of the domains we don't want indexed. Some of them still need to be accessed by the public, some like staging could be restricted to specific IPs.
I realize after your discussion I probably should have used a different example of a sub-domain. On the other hand it might not have sparked the discussion so maybe it was a good example

-
.htaccess files can be placed at any directory level of a site so you can do it for just the subdomain or even just a directory of a domain.
-
Staging URL's are typically only used for testing so rather than do a deny I would recommend using a specific ALLOW for only the IP addresses that should be allowed access.
I would imagine you don't want it indexed because you don't want the rest of the world knowing about it.
You can also use HTACCESS to use username/passwords. It is simple but you can give that to clients if that is a concern/need.
-
Correct.
-
Toren, I would not recommend that solution. There is nothing to prevent Googlebot from crawling your site via almost any IP. If you found 100 IPs used by the crawler and blocked them all, there is nothing to stop the crawler from using IP #101 next month. Once the subdomain's content is located and indexed, it will be a headache fixing the issue.
The best solution is always going to be a noindex meta tag on the pages you do not wish to be indexed. If that method is too much work or otherwise undesirable, you can use the robots.txt solution. There is no circumstance I can imagine where you would modify your htaccess file to block googlebot.
-
Hi Matt.
Perhaps I misunderstood the question but I believe Toren only wishes to prevent the subdomain from being indexed. If you restrict subdomain access by IP it would prevent visitors from accessing the content which I don't believe is the goal.
-
Interesting, hadn't thought of using htaccess to block Googlebot.Thanks for the suggestion.
-
Thanks Ryan. So you don't see any issues with de-indexing the main site if I created a second robots.txt file, e.g.
http://staging.domin.com/robots.txt
User-agent: *
Disallow: /That was my initial thought but when Google announced they consider sub-domains part of the TLD I was afraid it might affect the htp://www.domain.com versions of the pages. So you're saying the subdomain is basically treated like a folder you block on the primary domain?
-
Use an .htaccess file to only allow from certain ip addresses or ranges.
Here is an article describing how: http://www.kirupa.com/html5/htaccess_tricks.htm
-
What is the best method to block a sub-domain, e.g. staging.domain.com/ from getting indexed?
Place a robots.txt file in the root of the subdomain.
User-agent: *
Disallow: /This method will block the subdomain while leaving your primary domain unaffected.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How can you promote a sub-domain ahead of a domain on the SERPs?
I have a new client that wants to promote their subdomain uk.imagemcs.com and have their main domain imagemcs.com fall off the SERPs. Objective? Get uk.imagemcs.com to rank first for UK 'brand' searches. Do a search for 'imagem creative services' and you should see the issue (it looks like rules have been applied to the robots.txt on the main domain to exclude any bots from crawling - but since they've been indexed previously I need to take action as it doesn't look great!). I think I can do this by applying a permanent redirect from the main domain to the subdomain at domain level and then no-indexing the site - and then resubmit the sitemap. My slight concern is that this no-indexing of the main domain may impact on the visibility of the subdomains (I'm dealing with uk.imagemcs.com, but there is us.imagemcs.com and de.imagemcs.com) and was looking for some assurance that this would not be the case. My understanding is that subdomains are completely distinct from domains and as such this action should have no impact on the subdomains. I asked the question on the Webmasters Forum but haven't really got anywhere
Technical SEO | | nathangdavidson2
https://productforums.google.com/forum/#!msg/webmasters/1Avupy3Uw_o/hu6oLQntCAAJ Can anyone suggest a course of action? many thanks, Nathan0 -
Are images stored in Amazon S3 buckets indexable to your domain?
We're storing all our images in S3 bucket, common practice, but we want to get these images to drive traffic back to our site -- and credit for that traffic. We've configured the URLs to be s3.owler.com/<image_name>/<image_id>. I've not seen any of these images show in our web master tools. I am wondering if we're actually not going to get the credit for these images because technically they do sit on another domain. </image_id></image_name>
Technical SEO | | mindofmiller0 -
URL Structure On Site - Currently it's domain/product-name NOT domain/category/product name is this bad?
I have a eCommerce site and the site structure is domain/product-name rather than domain/product-category/product-name Do you think this will have a negative impact SEO Wise? I have seen that some of my individual product pages do get better rankings than my categories.
Technical SEO | | the-gate-films0 -
Is it better to use XXX.com or XXX.com/index.html as canonical page
Is it better to use 301 redirects or canonical page? I suspect canonical is easier. The question is, which is the best canonical page, YYY.com or YYY.com/indexhtml? I assume YYY.com, since there will be many other pages such as YYY.com/info.html, YYY.com/services.html, etc.
Technical SEO | | Nanook10 -
Will blocking the Wayback Machine (archive.org) have any impact on Google crawl and indexing/SEO?
Will blocking the Wayback Machine (archive.org) by adding the code they give have any impact on Google crawl and indexing/SEO? Anyone know? Thanks! ~Brett
Technical SEO | | BBuck0 -
Transfer a Main Domain to a Sub-Domain
My IT department tells me they want to transfer my main site domain, which has been in existence since 1999 as an e-commerce site (maindomain.com) to a sub-domain (www2.maindomain.com) or a completely new domain (newdomain.net). This is because we are launching a new website and B2C e-commerce engine, but we still have to maintain the legacy B2B e-commerce engine which contains hard-coded URLs, and both systems can't use the same domain. I've been researching the issue across SEOmoz, but I haven't come across this exact type of scenario (mostly I've seen a sub-domain to new domain). I see major problems with their proposal, including negative SEO impact, loss of domain authority/ranking and issues with branding. Does anyone know the exact type of impact I can expect to see in this scenario and specific steps I should go about to minimize the impact? Btw, I will be using Danny Dover's guide on properly moving domains where appropriate. Thanks!
Technical SEO | | AscendLearning0 -
What is best practice for redirecting "secondary" domain names?
For sites with multiple top-level domains that have been secured for a business or organization, I'm curious as to what is considered best practice for setting up 301 redirects for secondary domains. Is it best to do the 301 redirects at the registrar level, or the hosting level? So that .net, .biz, or other secondary domains funnel visitors to the correct primary/main domain name. I'm looking for the "best practice" answer and want to avoid duplicate content problems, or penalties from the search engines. I'm not trying to game the system with dozens of domain names, simply the handful of domains that are important to the client. I've seen some registrars recommend hosting secondary domains, and doing redirects from the hosting level (and they use meta refresh for "domain forwarding," which I want to avoid). It seems rather wasteful to set up hosting for a secondary domain and then 301 each URL.
Technical SEO | | Scott-Thomas0 -
How do I redirect index.html to the root / ?
The site I've inherited had operated on index.html at one point, and now uses index.php for the home page, which goes to the / page. The index.html was lost in migrating server hosts. How do I redirect the index.html to the / page? I've tried different options that keep giving ending up with the same 404 error. I tried a redirect from index.html to index.php which ended in an infinite loop. Because the index.html no longer exists in the root, should I created it and then add a redirect to it? Can I avoid this by editing the .htaccess? Any help is appreciated, thanks in advance!
Technical SEO | | NetPicks0