How to block google robots from a subdomain
-
I have a subdomain that lets me preview the changes I put on my site.
The live site URL is www.site.com, working preview version is www.site.edit.com
The contents on both are almost identical
I want to block the preview version (www.site.edit.com) from Google Robots, so that they don't penalize me for duplicated content.
Is it the right way to do it:
User-Agent: *
Disallow: .edit.com/*
-
Thanks o much for your help!
-
Hi,
Probably without the www. so: site.edit.com/robots.txt because otherwise you would have a subdomain of a subdomain ;-). But the rest is perfect!
-
Thanks a lot for your answer, Martijn!
So just to make sure I got it correctly - this robots file URL should be:
?
Thanks a lot for your answer
-
Hi,
The Google Robots will look for the robots.txt in each individual root. So you need the robots.txt in the root of the subdomain not just the domain root. That's why its also possible to include a complete disallow in there and not just: .edit.com/* .
Example:
User-agent: *
Disallow: /Hope this helps!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Robots.txt on subdomains
Hi guys! I keep reading conflicting information on this and it's left me a little unsure. Am I right in thinking that a website with a subdomain of shop.sitetitle.com will share the same robots.txt file as the root domain?
Technical SEO | | Whittie0 -
To Subdomain or Not Subdomain?
I have a client that has a construction company that services a regional area. They now developed a PRODUCT that they want to promote that would have a national reach. We are redesigning the site, with new branding and all. How do I treat the website URL structure? Is the product it's own domain because of the target market? Or should I make a subdomain because we want to tie the companies together in some fashion. Every article I read confuses me more on how to handle this. Thoughts?
Technical SEO | | cschwartzel0 -
Why did Google stop indexing my site?
Google used to crawl my site every few minutes. Suddenly it stopped and the last week it indexed 3 pages out of thousands. https://www.google.co.il/#q=site:www.yetzira.com&source=lnt&tbs=qdr:w&sa=X&ei=I9aTUfTTCaKN0wX5moCgAw&ved=0CBgQpwUoAw&bav=on.2,or.r_cp.r_qf.&fp=cfac44f10e55f418&biw=1829&bih=938 What could cause this to happen and how can I solve this problem? Thanks!
Technical SEO | | JillB20130 -
Google Index Speed Opinions
Hello Everyone, Under normal circumstances, new posts to my site are indexed almost instantly by Google. I know this because an occasional search with quotation marks surrounding the 1st paragraph of text displays my newly published page. I use this tactic from time to time to ensure contributors aren't syndicating content. My question is this: I've noticed over the last day or so that my newly published articles are not yet indexed. For example, an article that was published over 24 hours ago does not appear to be indexed yet. Is this cause for concern? Is there an average wait time for indexation? XML issue? Thanks in advance for the help/insight.
Technical SEO | | JSOC0 -
Google (GWT) says my homepage and posts are blocked by Robots.txt
I guys.. I have a very annoying issue.. My Wordpress-blog over at www.Trovatten.com has some indexation-problems.. Google Webmaster Tools data:
Technical SEO | | FrederikTrovatten22
GWT says the following: "Sitemap contains urls which are blocked by robots.txt." and shows me my homepage and my blogposts.. This is my Robots.txt: http://www.trovatten.com/robots.txt
"User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/ Do you have any idea why it says that the URL's are being blocked by robots.txt when that looks how it should?
I've read a couple of places that it can be because of a Wordpress Plugin that is creating a virtuel robots.txt, but I can't validate it.. 1. I have set WP-Privacy to crawl my site
2. I have deactivated all WP-plugins and I still get same GWT-Warnings. Looking forward to hear if you have an idea that might work!0 -
Google plus
With a single Google search, you can see regular search results, along with all sorts of results that are tailored to you -- pages shared with you by your friends, Google+ posts from people you know. **Does pages shared by friends ** Does this mean pages shared by friends on Google plus ?
Technical SEO | | seoug_20050 -
Subdomain Setting in MozBar
I'm using the MozBar on FF6 and have a question about the selector next to "Root Domain" When I set it to "Subdomain", I notice that DmR and DmT are different from the root domain. Am I looking at DmR and DmT for all subdomains (www included)?
Technical SEO | | waynekolenchuk0 -
Mobile SEO or Block Crawlers?
We're in the process of launching mobile versions of many of our brand sites and our ecommerce site and one of our partners suggested that we should block crawlers on the mobile view so it doesn't compete for the same keywords as the standard site (We will be automatically redirecting mobile handsets to the mobile site). Does this advice make sense? It seems counterintuitive to me.
Technical SEO | | BruceMillard0