Best way to block a sub-domain from being indexed
-
Hello,
The search engines have indexed two sub-domains I did not want indexed: old.domain.com and dev.domain.com. I was going to password-protect them, but is there a best-practice way to block them?
My main domain's default robots.txt says:
Sitemap: http://www.domain.com/sitemap.xml
# global
User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /wp-content/cache/
Disallow: /wp-content/themes/
Disallow: /trackback/
Disallow: /feed/
Disallow: /comments/
Disallow: /category//
Disallow: */trackback/
Disallow: */feed/
Disallow: /comments/
Disallow: /?
-
Hi,
CleverPhD has some interesting ideas with robots.txt and Google Webmaster Tools, but simply password-protecting all dev pages should keep them out of Google's index. There's no single best practice here; a password wall will keep Googlebot out on its own.
To be doubly safe, you can also include a meta noindex tag on dev pages.
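As a sketch (the exact markup depends on your dev templates), the tag would sit in the head of every dev page:

<meta name="robots" content="noindex">

If editing templates is awkward, sending an X-Robots-Tag: noindex HTTP header from the dev server is an equivalent option.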
Keep in mind that once a page is in Google's index, it's going to take a while for it to leave (unless you use CleverPhD's method). But having a blank page in Google's index really isn't all that bad. It's there, but it won't rank for much.
Hope this helps,
Kristina
-
I've never tried a method like this - FreshFireOne, did you?
-
First and foremost, when you finish all this, password-protect your dev instances. A URL will leak out eventually and then this happens. I know it is a pain, but it is worth it.
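As a hedged sketch, if your dev instances happen to run on Apache (an assumption; the realm name and file paths here are purely illustrative), HTTP Basic Auth in an .htaccess file is one simple way to do it:

# .htaccess on the dev subdomain (hypothetical path)
AuthType Basic
AuthName "Development site"
AuthUserFile /etc/apache2/.htpasswd-dev
Require valid-user

The password file referenced by AuthUserFile would be created with something like: htpasswd -c /etc/apache2/.htpasswd-dev devuser. Nginx, IIS, or a VPN-only dev network would achieve the same end.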
To remove subdomains: go into GWT and register the subdomains as separate websites. Create a robots.txt for each subdomain (not the one you mention; you need a robots.txt specific to that subdomain that disallows all files — see the sketch below). If you can't do that, have your subdomains include a noindex meta tag on all pages. You have to be careful with this, as you do not want to push the dev robots.txt or the noindex meta tags out to your production server, but it can be done. Talk to your devs. Then go into GWT and use the URL removal tool. Just leave it blank and it will remove the whole site.
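For reference, a blanket disallow-all robots.txt served only on the sub-domains (at old.domain.com/robots.txt and dev.domain.com/robots.txt, never on production) would just be:

User-agent: *
Disallow: /

This works because robots.txt is per-host: the main www robots.txt quoted above has no effect on what Googlebot does with the sub-domains.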
Poof. Gone. You can then watch the GWT accounts. They will show errors for the dev site like "Severe health issues are found on your site - Some important page has been removed by request." This is a good error, as it confirms that the subdomain is removed.
We actually used this not on a dev site but on our www1 server that was indexed. We use a load balancer with multiple copies of the site, and www1 was competing with www. Using the steps above did the trick.