Block subdomain directory in robots.txt
-
Instead of blocking an entire subdomain (fr.sitegeek.com) with robots.txt, we would like to block a single directory (fr.sitegeek.com/blog).
'fr.sitegeek.com/blog' and 'www.sitegeek.com/blog' contain the same articles in one language; only the labels are changed on the 'fr' version, and we suspect this duplicate content causes problems for SEO. We would like 'www.sitegeek.com/blog' articles to be crawled and indexed, but not 'fr.sitegeek.com/blog'. So, how can we block a single subdomain directory (fr.sitegeek.com/blog) with robots.txt?
This applies only to the blog directory of the 'fr' version; all other directories and pages of the 'fr' version should still be crawled and indexed.
Thanks,
Rajiv -
Hi Rajiv,
If you post the same content on both FR & EN version:
-
if both are written in English (or mainly written in English) - the best option would be a canonical pointing to the EN version
Example: https://fr.sitegeek.com/category/shared-hosting - most of the content is in English, so in this case I would point a canonical to the EN version -
if the FR version is in French - you can use the hreflang tag - you can use this tool to generate them, check here for common mistakes and double-check the final result here.
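For illustration, the two options above translate into head tags like the following (the hostnames are from this thread, but the exact paths are assumed for the example; hreflang tags must appear on both language versions, each referencing the other and itself):

```html
<!-- Option 1: FR page is (mostly) English - canonical to the EN version.
     Placed in the <head> of https://fr.sitegeek.com/category/shared-hosting -->
<link rel="canonical" href="https://www.sitegeek.com/category/shared-hosting">

<!-- Option 2: FR page is genuinely French - hreflang pair instead of a canonical.
     The same two tags go in the <head> of BOTH the EN and the FR page -->
<link rel="alternate" hreflang="en" href="https://www.sitegeek.com/category/shared-hosting">
<link rel="alternate" hreflang="fr" href="https://fr.sitegeek.com/category/shared-hosting">
```

Note that the two options are mutually exclusive for a given page: a cross-domain canonical tells Google the FR URL is a duplicate, while hreflang tells it the FR URL is a distinct, equivalent page in another language.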
Just some remarks:
-
partially translated pages offer little value for users - so it's best to fully translate them or only refer to the EN version
-
I have a strong impression that the EN version was machine-translated into French (e.g. French sites never use 'Maison' to link to the homepage - they use 'Accueil'). Be aware that Google is perfectly capable of detecting auto-translated pages and considers them bad practice (check this video of Matt Cutts - starts at 1:50). So you might want to invest in proper translation or proofreading by a native French speaker.
rgds
Dirk
-
-
Thanks Dirk,
We will fix the issue as you suggested.
Could you explain more about duplicate content if we post articles on both the 'FR' and 'EN' versions?
Thanks,
Rajiv
-
Just to add to this: if your subdomain has more than /blog on it, and you only want to block /blog, change Dirk's robots.txt to:
User-agent: Googlebot
Disallow: /blog
or, to block more than just Google:
User-agent: *
Disallow: /blog -
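Putting this together, a minimal robots.txt served at the root of the fr subdomain (hostnames from the thread) could look like this. Note that robots.txt is per-host, so a file served from fr.sitegeek.com has no effect on www.sitegeek.com:

```
# Served as https://fr.sitegeek.com/robots.txt
# Applies only to this host; www.sitegeek.com keeps its own robots.txt
User-agent: *
Disallow: /blog/
```

The trailing slash is a deliberate choice: rules are prefix matches, so `Disallow: /blog` would also block URLs such as /blogger or /blog-archive, while `Disallow: /blog/` matches only paths inside the directory.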
The easiest way would be to put a robots.txt in the root of your subdomain & block access for search engines:
User-agent: Googlebot
Disallow: /
If your subdomain & the main domain share the same root, this option is not possible. In that case, rather than working with robots.txt, I would add a canonical on each page pointing to the main domain, or block all pages in the header (if this is technically possible).
You could also check these similar questions: http://moz.com/community/q/block-an-entire-subdomain-with-robots-txt and http://moz.com/community/q/blocking-subdomain-from-google-crawl-and-index - but the answers given are the same as the options above.
Apart from the technical question, given the fact that only the labels are translated, these pages make little sense for human users. It would probably make more sense to link to the normal (English) version of the blog (and put '(en anglais)' next to the link).
rgds,
Dirk