Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
What is the best method to block a sub-domain, e.g. staging.domain.com/ from getting indexed?
-
Now that Google considers subdomains as part of the TLD I'm a little leery of testing robots.txt with something like:
staging.domain.com
User-agent: *
Disallow: /in fear it might get the www.domain.com blocked as well. Has anyone had any success using robots.txt to block sub-domains? I know I could add a meta robots tag to the staging.domain.com pages but that would require a lot more work.
-
Just make sure that when/if you copy over the staging site to the live domain that you don't copy over the robots.txt, htaccess, or whatever means you use to block that site from being indexed and thus have your shiny new site be blocked.
-
I agree. The name of your subdomain being "staging" didn't register at all with me until Matt brought it up. I was offering a generic response to the subdomain question whereas I believe Matt focused on how to handle a staging site. Interesting viewpoint.
-
Matt/Ryan-
Great discussion, thanks for the input. The staging.domain.com is just one of the domains we don't want indexed. Some of them still need to be accessed by the public, some like staging could be restricted to specific IPs.
I realize after your discussion I probably should have used a different example of a sub-domain. On the other hand it might not have sparked the discussion so maybe it was a good example
-
.htaccess files can be placed at any directory level of a site so you can do it for just the subdomain or even just a directory of a domain.
-
Staging URL's are typically only used for testing so rather than do a deny I would recommend using a specific ALLOW for only the IP addresses that should be allowed access.
I would imagine you don't want it indexed because you don't want the rest of the world knowing about it.
You can also use HTACCESS to use username/passwords. It is simple but you can give that to clients if that is a concern/need.
-
Correct.
-
Toren, I would not recommend that solution. There is nothing to prevent Googlebot from crawling your site via almost any IP. If you found 100 IPs used by the crawler and blocked them all, there is nothing to stop the crawler from using IP #101 next month. Once the subdomain's content is located and indexed, it will be a headache fixing the issue.
The best solution is always going to be a noindex meta tag on the pages you do not wish to be indexed. If that method is too much work or otherwise undesirable, you can use the robots.txt solution. There is no circumstance I can imagine where you would modify your htaccess file to block googlebot.
-
Hi Matt.
Perhaps I misunderstood the question but I believe Toren only wishes to prevent the subdomain from being indexed. If you restrict subdomain access by IP it would prevent visitors from accessing the content which I don't believe is the goal.
-
Interesting, hadn't thought of using htaccess to block Googlebot.Thanks for the suggestion.
-
Thanks Ryan. So you don't see any issues with de-indexing the main site if I created a second robots.txt file, e.g.
http://staging.domin.com/robots.txt
User-agent: *
Disallow: /That was my initial thought but when Google announced they consider sub-domains part of the TLD I was afraid it might affect the htp://www.domain.com versions of the pages. So you're saying the subdomain is basically treated like a folder you block on the primary domain?
-
Use an .htaccess file to only allow from certain ip addresses or ranges.
Here is an article describing how: http://www.kirupa.com/html5/htaccess_tricks.htm
-
What is the best method to block a sub-domain, e.g. staging.domain.com/ from getting indexed?
Place a robots.txt file in the root of the subdomain.
User-agent: *
Disallow: /This method will block the subdomain while leaving your primary domain unaffected.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Do URLs with canonical tags get indexed by Google?
Hi, we re-branded and launched a new website in February 2016. In June we saw a steep drop in the number of URLs indexed, and there have continued to be smaller dips since. We started an account with Moz and found several thousand high priority crawl errors for duplicate pages and have since fixed those with canonical tags. However, we are still seeing the number of URLs indexed drop. Do URLs with canonical tags get indexed by Google? I can't seem to find a definitive answer on this. A good portion of our URLs have canonical tags because they are just events with different dates, but otherwise the content of the page is the same.
Technical SEO | | zasite0 -
Google showing https:// page in search results but directing to http:// page
We're a bit confused as to why Google shows a secure page https:// URL in the results for some of our pages. This includes our homepage. But when you click through it isn't taking you to the https:// page, just the normal unsecured page. This isn't happening for all of our results, most of our deeper content results are not showing as https://. I thought this might have something to do with Google conducting searches behind secure pages now, but this problem doesn't seem to affect other sites and our competitors. Any ideas as to why this is happening and how we get around it?
Technical SEO | | amiraicaew0 -
How best to deal with www.home.com and www.home.com/index.html
Firstly, this is for an .asp site - and all my usual ways of fixing this (e.g. via htaccess) don't seem to work. I'm working on a site which has www.home.com and www.home.com/index.html - both URL's resolve to the same page/content. If I simply drop a rel canonical into the page, will this solve my dupe content woes? The canonical tag would then appear in both www.home.com and www.home.com/index.html cases. If the above is Ok, which version should I be going with? - or - Thanks in advance folks,
Technical SEO | | Creatomatic
James @ Creatomatic0 -
Problem www/non-www domain rewrite
Hello, I've made a site for a client about 1 year ago. The rankings are quite okay, but the home page suffers from a penalty I think. I found out via OSE that PageAuthority strangely is higher on the 301-ed page www.myanmar-rundreisen.de - PA 32
Technical SEO | | hgw57
myanmar-rundreisen.de/ - PA 33 I don't understand what is happening here as I am using the usual htaccess 301-redirect: Rewrite domain.com -> www.domain.com RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^www.myanmar-rundreisen.de [NC]
RewriteRule (.*) http://www.myanmar-rundreisen.de/$1 [L,R=301] which is working fine with other domains ... I tried also (last line) RewriteRule (.*) http://www.myanmar-rundreisen.de/$1 [L,R=301] So thanks to anyone who can share an idea on that ... Guenter K04xy.jpg0 -
Domain authority not showing on root domain?
I was going through our site earlier w/ the mozBar (still learning the tools, new here) and saw the attached image. There were far more links to the subdomain (#s on the left) than the root domain (#s on right). This is strange to me, because we are not using any subdomains. All links point to either our root domain or subfolders off our root domain. Is this hurting our ranking for the root domain? Not sure what's up with this. Zz9j0.jpg
Technical SEO | | askotzko0 -
301 redirect domain to page on another domain
Hi, If I wanted to do a 301 permanent redirect on a domain to a page on another domain will this cause any problems? Lets say I have 4 domains (all indexed with content), I decide to create a new domain with 4 pages, one for each domain. I copy the content from the old domains to the relevant page on the new domain and set it live. At the same time as setting the new site live I do a 301 permanent redirect on the 4 domains to the relevant pages on the new domain. What happens if Google indexes the new site before visiting the redirected domains, could this cause a duplicate content penalty? Cheers
Technical SEO | | activitysuper0 -
Can JavaScrip affect Google's index/ranking?
We have changed our website template about a month ago and since then we experienced a huge drop in rankings, especially with our home page. We kept the same url structure on entire website, pretty much the same content and the same on-page seo. We kind of knew we will have a rank drop but not that huge. We used to rank with the homepage on the top of the second page, and now we lost about 20-25 positions. What we changed is that we made a new homepage structure, more user-friendly and with much more organized information, we also have a slider presenting our main services. 80% of our content on the homepage is included inside the slideshow and 3 tabs, but all these elements are JavaScript. The content is unique and is seo optimized but when I am disabling the JavaScript, it becomes completely unavailable. Could this be the reason for the huge rank drop? I used the Webmaster Tolls' Fetch as Googlebot tool and it looks like Google reads perfectly what's inside the JavaScrip slideshow so I did not worried until now when I found this on SEOMoz: "Try to avoid ... using javascript ... since the search engines will ... not indexed them ... " One more weird thing is that although we have no duplicate content and the entire website has been cached, for a few pages (including the homepage), the picture snipet is from the old website. All main urls are the same, we removed some old ones that we don't need anymore, so we kept all the inbound links. The 301 redirects are properly set. But still, we have a huge rank drop. Also, (not sure if this important or not), the robots.txt file is disallowing some folders like: images, modules, templates... (Joomla components). We still have some html errors and warnings but way less than we had with the old website. Any advice would be much appreciated, thank you!
Technical SEO | | echo10 -
Do search engines still index/crawl private content?
If you have a membership site, which requires a payment to access specific content/images/videos, do search engines still use that content as a ranking/domain authority factor? Is it worth optimizing these "private" pages for SEO?
Technical SEO | | christinarule1