Blocking Subdomain from Google Crawl and Index
-
Hey everybody, how is it going?
I have a simple question that I need answered.
I have a main domain; let's call it domain.com. Our company will soon launch a series of promotions for which we will use CNAME subdomains, e.g. try.domain.com or buy.domain.com. They will serve a commercial objective, nothing more.
What is the best way to block these subdomains from being indexed by Google, and to keep them from counting as part of domain.com? Robots.txt, noindex, nofollow, etc.?
Hope to hear from you,
Best Regards,
-
Hello George, thank you for the fast answer! I read that article, and there is an issue with it; if you can take a look, I'd really appreciate it. The problem is that if I do it directly from Tumblr, it will also block the blog from Tumblr users. Here is the note right below the "Allow this blog to appear in search results" option:
"This applies to searches on Tumblr as well as external search engines, like Google or Yahoo."
Also, if I do it from GWT, I'm very hesitant to remove URLs on my subdomain, because I'm afraid it will remove my whole domain. For example, my domain is abc.com and the Tumblr blog is set up on tumblr.abc.com. I'm afraid that if I remove tumblr.abc.com from the index, it will also remove abc.com. Please let me know what you think.
Thank you!
-
Hi Marina,
If I understand your question correctly, you just don't want your Tumblr blog to be indexed by Google. In that case, these steps will help: http://yourbusiness.azcentral.com/keep-tumblr-off-google-3061.html
Regards,
George
-
Hi guys, I read your conversation. I have a similar issue, but my situation is slightly different. I'd really appreciate it if you could help with this. I also have a subdomain that I don't want indexed by Google. However, that subdomain is not under my control: I created the subdomain on my hosting, but it points to my Tumblr blog, so I don't have access to its robots.txt. Can anybody advise what I can do in this situation to noindex that subdomain?
Thanks
-
Personally I wouldn't rely on robots.txt alone, as one accidental public link to any of the pages (easier than you may think!) can result in Google indexing that subdomain page — the URL gets indexed even though the page itself can't be crawled. This means the page can get "stuck" in Google's index, and to resolve it you would need to remove it using WMT (instructions here). If a lot of pages were accidentally indexed, you would need to lift the robots.txt restriction so Google can crawl them again, and put noindex/nofollow tags on each page so Google drops it from its index.
To cut a long story short, I would do both Steps 1 and 2 outlined by Federico if you want to sleep easy at night :).
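If you'd rather not edit every page template, the same noindex can also be delivered as an HTTP response header instead of a meta tag. A minimal sketch, assuming the subdomain is served by Apache with mod_headers enabled (adjust for your own server setup):

```apache
# .htaccess at the subdomain's document root
# Tell crawlers not to index anything served from this subdomain,
# and not to follow its links
Header set X-Robots-Tag "noindex, nofollow"
```

Like the meta tag, this only works if Google can actually crawl the pages, so don't combine it with a robots.txt Disallow on the same subdomain.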
George
-
It would also be smart to add the subdomains in Webmaster Tools in case one does get indexed and you need to remove it.
-
Robots.txt is the easiest and quickest way. As a backup, you can use the noindex meta tag on the pages in the subdomain.
-
There are two ways to do it, with different effects:
-
Robots.txt in each subdomain. This will entirely block search engines from even accessing those pages, so they won't know what the pages contain:
User-agent: *
Disallow: /
-
noindex tags on those pages. This method lets crawlers read each page and, if you also set "follow", crawl and possibly index the pages you link to; use "nofollow" if you don't want the linked pages indexed either.
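For reference, here's what that tag looks like in the head of each page on the subdomain (swap "follow" for "nofollow" depending on which behavior you want):

```html
<!-- Keep this page out of the index, but let crawlers follow its links -->
<meta name="robots" content="noindex, follow">
```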
Hope that helps!