Best way to block a sub-domain from being indexed
-
Hello,
The search engines have indexed two sub-domains I did not want indexed: old.domain.com and dev.domain.com. I was going to password-protect them, but is there a best-practice way to block them?
My main domain's default robots.txt says:
Sitemap: http://www.domain.com/sitemap.xml
# global
User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /wp-content/cache/
Disallow: /wp-content/themes/
Disallow: /trackback/
Disallow: /feed/
Disallow: /comments/
Disallow: /category//
Disallow: */trackback/
Disallow: */feed/
Disallow: /comments/
Disallow: /?
-
Hi,
CleverPhD has some interesting ideas with robots.txt and Google Webmaster Tools, but simply password-protecting all dev pages should keep them out of Google's index. There's no special best practice here, since a password wall will keep Googlebot out on its own.
To be doubly safe, you can also include a meta noindex tag on dev pages.
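For anyone who wants to confirm the tag is actually on the dev pages, here is a minimal Python sketch that looks for a robots meta tag containing noindex. The class name, the function name, and the sample markup are all made up for illustration; in practice you would feed it the HTML your dev server really returns.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def has_noindex(html):
    """Return True if the markup carries a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical markup, for illustration only.
DEV_PAGE = """<html><head>
<meta name="robots" content="noindex, nofollow">
<title>dev</title></head><body></body></html>"""

LIVE_PAGE = "<html><head><title>live</title></head><body></body></html>"

print(has_noindex(DEV_PAGE))
print(has_noindex(LIVE_PAGE))
```

Running the same check against the production pages is a cheap way to catch a noindex tag that was pushed live by mistake.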
Keep in mind that once a page is in Google's index, it's going to take a while for it to leave (unless you use CleverPhD's method). But having a blank page in Google's index really isn't all that bad. It's there, but it won't rank for much.
Hope this helps,
Kristina
-
I've never tried a method like this - FreshFireOne, did you?
-
First and foremost, when you finish all this, password-protect your dev instances. A URL will leak out eventually, and then this happens. I know it is a PIA, but it is worth it.
To remove the subdomains, go into GWT (Google Webmaster Tools) and register each subdomain as a separate website. Create a robots.txt for each subdomain (not the one you mention; you need a robots.txt that is specific to that subdomain and disallows all files). If you can't do that, have your subdomains include a noindex meta tag on all pages. You have to be careful with this, as you do not want to push your dev robots.txt or the noindex meta tags to your production server, but it can be done. Talk to your devs. Then go into GWT and use the URL removal tool. Just leave it blank and it will remove the whole site.
Poof. Gone. You can then watch the GWT accounts. They will show errors for the dev site like "Severe health issues are found on your site - Some important page has been removed by request." This is a good error as it confirms that that subdomain is removed.
We actually used this not on a dev site but on our www1 server, which had been indexed. We use a load balancer with multiple copies of the site, and www1 was competing with www. Using the method above did the trick.
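As a rough illustration of the subdomain-specific robots.txt described above, the sketch below uses Python's standard urllib.robotparser to check that a disallow-all file blocks every path, while an excerpt of the main-domain rules from the question would still leave the homepage crawlable. The helper name is_blocked and the sample page path are assumptions made up for this example, not anything from GWT.

```python
from urllib.robotparser import RobotFileParser

# robots.txt meant to be served only from old.domain.com / dev.domain.com.
# It disallows everything for every crawler.
SUBDOMAIN_ROBOTS = """\
User-agent: *
Disallow: /
"""

# Excerpt of the main-domain rules quoted in the question.
MAIN_ROBOTS_EXCERPT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
"""


def is_blocked(robots_txt, url, agent="Googlebot"):
    """Return True if the given robots.txt text disallows url for agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


# The disallow-all file blocks the homepage and any (hypothetical) deeper page.
print(is_blocked(SUBDOMAIN_ROBOTS, "http://dev.domain.com/"))
print(is_blocked(SUBDOMAIN_ROBOTS, "http://dev.domain.com/any/page.html"))

# Copying the main-domain rules to the subdomain would not block the homepage.
print(is_blocked(MAIN_ROBOTS_EXCERPT, "http://dev.domain.com/"))
```

The same check is worth running against production before a deploy, so the disallow-all file never ends up on www by accident.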