Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Blocking Subdomain from Google Crawl and Index
-
Hey everybody, how is it going?
I have a simple question that I need answered.
I have a main domain, let's call it domain.com. Our company will soon launch a series of promotions for which we will use CNAME subdomains, e.g. try.domain.com or buy.domain.com. They will serve a commercial objective, nothing more.
What is the best way to block such subdomains from being indexed in Google, and from counting as subdomains of domain.com? Robots.txt, nofollow, etc.?
Hope to hear from you,
Best Regards,
-
Hello George, thank you for the fast answer! I read that article, and there is an issue with it. If you could take a look, I'd really appreciate it. The problem is that if I do it directly from Tumblr, it will also hide the blog from Tumblr users. Here is the note right below the option "Allow this blog to appear in search results":
"This applies to searches on Tumblr as well as external search engines, like Google or Yahoo."
Also, if I do it from GWT, I'm very worried about removing URLs on my subdomain, because I'm afraid it will remove my whole domain. For example, my domain is abc.com and the Tumblr blog is set up on tumblr.abc.com. I'm afraid that if I remove tumblr.abc.com from the index, it will also remove abc.com. Please let me know what you think.
Thank you!
-
Hi Marina,
If I understand your question correctly, you just don't want your Tumblr blog to be indexed by Google. In which case these steps will help: http://yourbusiness.azcentral.com/keep-tumblr-off-google-3061.html
Regards,
George
-
Hi guys, I read your conversation. I have a similar issue, but my situation is slightly different, and I'd really appreciate your help. I also have a subdomain that I don't want Google to index. However, that subdomain is not under my control: I created it on my hosting, but it points to my Tumblr blog, so I don't have access to its robots.txt. Can anybody advise what I can do in this situation to noindex that subdomain?
Thanks
-
Personally I wouldn't rely on robots.txt alone, as one accidental public link to any of the pages (easier than you may think!) can result in Google indexing that subdomain page (it just won't be crawled). This means the page can get "stuck" in Google's index, and to resolve it you would need to remove it using WMT (instructions here). If a lot of pages were accidentally indexed, you would need to remove the robots.txt restriction so Google can crawl them, and put noindex/nofollow tags on the pages so Google drops them from its index.
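The interplay described here (a crawler has to be allowed to fetch a page before it can read a noindex tag on it) can be sketched in a few lines of Python. This is only an illustration using the standard library's `urllib.robotparser`; the helper name `noindex_can_take_effect` and the URL are made up for the example, and real search engines may apply additional rules.

```python
from urllib.robotparser import RobotFileParser


def noindex_can_take_effect(robots_txt: str, url: str, page_has_noindex: bool,
                            user_agent: str = "Googlebot") -> bool:
    """A noindex tag only helps if the crawler is allowed to fetch the page."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return page_has_noindex and parser.can_fetch(user_agent, url)


URL = "https://try.domain.com/promo"  # illustrative URL, not a real page

# Blocked by robots.txt: the crawler never fetches the page, so the tag goes unseen.
blocked_case = noindex_can_take_effect("User-agent: *\nDisallow: /", URL, True)

# Restriction lifted (empty robots.txt): the page can be crawled and the tag read.
open_case = noindex_can_take_effect("", URL, True)

print(blocked_case, open_case)
```

The first call models the "stuck" situation; the second models the fix of lifting the restriction so the noindex tag can be picked up.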
To cut a long story short, I would do both Steps 1 and 2 outlined by Federico if you want to sleep easy at night :).
George
-
It would also be smart to add the subdomains in Webmaster Tools in case one does get indexed and you need to remove it.
-
Robots.txt is the easiest and quickest way. As a backup, you can use the noindex meta tag on the pages in the subdomain.
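To confirm that a page really carries the noindex meta tag, one option is to parse its HTML. The following is a minimal sketch with Python's standard library; the class name `RobotsMetaParser`, the helper `has_noindex`, and the sample markup are assumptions made up for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute names arrive lowercased; values may be None.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for directive in attr.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def has_noindex(html: str) -> bool:
    """Return True if the page has a robots meta tag containing noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical promo page markup.
SAMPLE_PAGE = """
<html>
  <head><meta name="robots" content="noindex, nofollow"></head>
  <body>Promo</body>
</html>
"""

print(has_noindex(SAMPLE_PAGE))
```

The check is case-insensitive and also handles self-closing `<meta ... />` tags, since `HTMLParser` routes those through `handle_starttag` by default.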
-
There are 2 ways to do it, with different effects:
-
Robots.txt in each subdomain. This will block search engines from accessing those pages at all, so they won't know what is inside.
User-agent: *
Disallow: /
-
noindex tags in those pages. This method lets crawlers read the page without indexing it. Add "follow" if you still want them to crawl (and possibly index) the pages you link to, or "nofollow" if you don't want the linked pages to be indexed either.
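If you want to double-check that a blanket robots.txt rule like the one in the first option really blocks a given crawler, Python's standard `urllib.robotparser` can evaluate it locally. This is just a sketch: the helper name `is_crawl_allowed` and the hostname are illustrative assumptions, and the rules are passed in as a string rather than fetched from a live site.

```python
from urllib.robotparser import RobotFileParser

# A blanket "block everything" robots.txt, as it might be served at
# https://try.domain.com/robots.txt (hostname is illustrative).
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


promo_allowed = is_crawl_allowed(ROBOTS_TXT, "Googlebot", "https://try.domain.com/promo")
print("crawl allowed:", promo_allowed)
```

Each subdomain serves its own robots.txt, so the same check would be repeated per hostname (try.domain.com, buy.domain.com, and so on).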
Hope that helps!