Block subdomain directory in robots.txt
-
Instead of block an entire sub-domain (fr.sitegeek.com) with robots.txt, we like to block one directory (fr.sitegeek.com/blog).
'fr.sitegeek.com/blog' and 'wwww.sitegeek.com/blog' contain the same articles in one language only labels are changed for 'fr' version and we suppose that duplicate content cause problem for SEO. We would like to crawl and index 'www.sitegee.com/blog' articles not 'fr.sitegeek.com/blog'.so, suggest us how to block single sub-domain directory (fr.sitegeek.com/blog) with robot.txt?
This is only for blog directory of 'fr' version even all other directories or pages would be crawled and indexed for 'fr' version.
Thanks,
Rajiv -
Hi Rajiv,
If you post the same content on both FR & EN version:
-
if both are written in English (or mainly written in English) - best option would be to have a canonical pointing to the EN version
Example: https://fr.sitegeek.com/category/shared-hosting - most of the content is in English - so in this case I would point a canonical to the EN version -
if the FR version is in French - you can use the HREF lang tag - you can use this tool to generate them, check here for common mistakes and doublecheck the final result here.
Just some remarks:
-
partially translated pages offer little value for users - so it's best to fully translate them or only refer to the EN version
-
I have a strong impression that the EN version was machine translated to the FR version. (ex. French sites never use 'Maison' to link to the Homepage - they use Acceuil). Be aware that Google is perfectly capable to detect auto-translated pages and they consider it to be bad practice (check this video of Matt Cutts - starts at 1:50). So you might want to invest in proper translation or proofreading by a native French speaker.
rgds
Dirk
-
-
Thanks Dirk,
we will fix the issue as you suggested.
Could you explain more on duplicate content if we post articles on both 'FR' and 'EN' versions?
Thanks,
Rajiv
-
Just to add to this, if your subdomain has more than /blog on it, and you only want to block /blog, change Dirk's robots.txt to:
User-agent: Googlebot
Disallow: /blogor to block more than just google:
User-agent:*
Disallow: /blog -
The easiest way would be to put the robots.txt in the root of your subdomain & block access for search engines
User-agent: Googlebot
Disallow: /If you subdomain & the main domain are sharing the same root - this option is not possible. In that case, rather than working with robots.txt I would add a canonical on each page pointing to the main domain, or block all pages in the header (if this is technically possible)
You could also check these similar questions: http://moz.com/community/q/block-an-entire-subdomain-with-robots-txt and http://moz.com/community/q/blocking-subdomain-from-google-crawl-and-index - but the answers given are the same as the options above.
Apart from the technical question, qiven the fact that only the labels are translated, these pages make little sense for human users. It would probably make more sense to link to the normal (English) version of the blog (and put (en Anglais) next to the link.
rgds,
Dirk
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Orphan Duplicate is created as Subdomain in Google Search
We noticed that some of our results on google for the blog are also come up with subdomain that is not linked from anywhere on the website. For example: SUBDOMAIN1.website.com/blog/content.html -> it redirects to website.com/blog/content.html SUBDOMAIN1 is not linked anywhere on the website. How did the google find it in the first place? Why does it still keep it in the search results? How do you get rid of it?
Intermediate & Advanced SEO | | rkdc0 -
Meta Robot Tag:Index, Follow, Noodp, Noydir
When should "Noodp" and "Noydir" meta robot tag be used? I have hundreds or URLs for real estate listings on my site that simply use "Index", Follow" without using Noodp and Noydir. Should the listing pages use these Noodp and Noydr also? All major landing pages use Index, Follow, Noodp, Noydir. Is this the best setting in terms of ranking and SEO. Thanks, Alan
Intermediate & Advanced SEO | | Kingalan10 -
How to handle a blog subdomain on the main sitemap and robots file?
Hi, I have some confusion about how our blog subdomain is handled in our sitemap. We have our main website, example.com, and our blog, blog.example.com. Should we list the blog subdomain URL in our main sitemap? In other words, is listing a subdomain allowed in the root sitemap? What does the final structure look like in terms of the sitemap and robots file? Specifically: **example.com/sitemap.xml ** would I include a link to our blog subdomain (blog.example.com)? example.com/robots.xml would I include a link to BOTH our main sitemap and blog sitemap? blog.example.com/sitemap.xml would I include a link to our main website URL (even though it's not a subdomain)? blog.example.com/robots.xml does a subdomain need its own robots file? I'm a technical SEO and understand the mechanics of much of on-page SEO.... but for some reason I never found an answer to this specific question and I am wondering how the pros do it. I appreciate your help with this.
Intermediate & Advanced SEO | | seo.owl0 -
Robots.txt
What would be a perfect robots.txt file my site is propdental.es Can i just place: User-agent: * Or should i write something more???
Intermediate & Advanced SEO | | maestrosonrisas0 -
Block Level Link Juice
I need a better understanding of how links in different parts of the page pass juice. Much has been written about how footer links pass less juice than other parts of the page. The question I have is that if a page has a hypothetical 1000 points of Link Juice and can pass on +/-800 points via links, and I have 1 and only 1 link in the footer to another page, does it pass the full 800 points? Or... since footers only pass a small fraction of link juice, it passes lets say 80 points, and the other 720 points stays locked up on the page. This question is a hypothetical - I'm just trying to understand relationships. I don't know if I've explained the question too well, but if someone could answer i it, or point me in the right direction, I would appreciate it.
Intermediate & Advanced SEO | | CsmBill0 -
Folder or subdomain for new e-commerce addition
Our main content site has 5K pieces of unique content all targeting our market. We are planning to add e-commerce as a source monetizing our audience. Should we place the new commerce platform within a subdirectory or subdomain? The layout we are considering is... shop name: Brand Name Market http://www.brandname.com/market/ http://market.brandname.com I am also considering something like: http://www.brandname.com/market/ aggregating product details and content from http://market.brandname.com/ with rel= back to the subdomain if possible.
Intermediate & Advanced SEO | | ejovi0 -
Could Temporarily Linking New Directory Pages to my Homepage Help SEO?
Within my website we maintain a nationwide directory of auto repair shops. When we add or significantly update / modify a particular listing, would it help improve the individual search engine rankings, Google PageRank, and / or Page Authority of the new auto shop page if we linked these pages to an area on the home page for "Our Newest Featured Shops" or "Latest Member Additions" or something of the nature? Each new shop profile would then be linked directly from the homepage for a period of time. I assume that it might be crawled and added to the indexes quicker, but would there be other benefits? If so, would those benefits only be temporary if eventually the new shop no longer linked to the homepage? Would keeping all featured shops in rotational display on the homepage make any difference? Any input is appreciated. Thanks. Kelly Vaught
Intermediate & Advanced SEO | | kelly_vaught0 -
Competitors and Directory Links
Hi guys, wanted to get some input and thoughts here. I'm analyzing many competitor links for a specific client (even other clients actually as well) and come across a pretty heavy directory backlink profiles. has anyone here had success with directory listings? Seem many of the competitors backlinks are coming from directories. What say you?
Intermediate & Advanced SEO | | PaulDylan1