Hello,
I am working on a project where I have some doubts regarding the structure of international, multi-language sites. The website is in the fashion industry, and I think this is a common problem for the industry: the site is translated into 5 languages and sells in 21 countries.
As you can imagine, this creates a huge number of URLs, so many that I can't even complete a crawl with Screaming Frog.
For example, the UK site is visible in all of these versions:
http://www.MyDomain.com/en/GB/
http://www.MyDomain.com/it/GB/
http://www.MyDomain.com/fr/GB/
http://www.MyDomain.com/de/GB/
http://www.MyDomain.com/es/GB/
Obviously, for SEO only the first version is important.
Another example: the French site is also available in 5 languages:
http://www.MyDomain.com/fr/FR/
http://www.MyDomain.com/en/FR/
http://www.MyDomain.com/it/FR/
http://www.MyDomain.com/de/FR/
http://www.MyDomain.com/es/FR/
And so on. This is mainly creating 3 issues:
- Endless crawling, with crawlers not focusing on the most important pages
- Duplication of content
- Wrong geo-targeted URLs ranking in Google
I have already implemented hreflang, but I haven't noticed any improvements.
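To give an idea, the hreflang set on the UK pages looks roughly like this (simplified example, using the URL structure above; the exact markup may differ slightly on the live site):

<link rel="alternate" hreflang="en-GB" href="http://www.MyDomain.com/en/GB/" />
<link rel="alternate" hreflang="it-GB" href="http://www.MyDomain.com/it/GB/" />
<link rel="alternate" hreflang="fr-GB" href="http://www.MyDomain.com/fr/GB/" />
<link rel="alternate" hreflang="de-GB" href="http://www.MyDomain.com/de/GB/" />
<link rel="alternate" hreflang="es-GB" href="http://www.MyDomain.com/es/GB/" />

The same pattern is repeated for every country, so each page references all 5 language variants for its country.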
So my question is: should I exclude the inappropriate language/country combinations with robots.txt and "noindex"?
For example, for the UK I would leave crawlable just the English version, i.e. http://www.MyDomain.com/en/GB/, for France just the French version, http://www.MyDomain.com/fr/FR/, and so on.
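To make the idea concrete, I am thinking of something along these lines in robots.txt (just a sketch; the real file would cover all 21 countries):

User-agent: *
# UK: keep only /en/GB/ crawlable
Disallow: /it/GB/
Disallow: /fr/GB/
Disallow: /de/GB/
Disallow: /es/GB/
# France: keep only /fr/FR/ crawlable
Disallow: /en/FR/
Disallow: /it/FR/
Disallow: /de/FR/
Disallow: /es/FR/
# ...and so on for the other countries

Plus, on those same non-matching versions, a <meta name="robots" content="noindex"> tag.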
What I would like to achieve by doing this is to get crawlers more focused on the SEO-important pages, avoid content duplication, and stop the wrong URLs from ranking on the local Google versions.
Please comment