International Sites and Duplicate Content
-
Hello,
I am working on a project and have some doubts regarding the structure of international, multi-language sites. The website is in the fashion industry, where I think this is a common problem. The site is translated into 5 languages and sells in 21 countries.
As you can imagine, this creates a huge number of URLs, so many that I can't even complete a crawl with Screaming Frog.
For example, the UK site is visible in all of these versions:
http://www.MyDomain.com/en/GB/
http://www.MyDomain.com/it/GB/
http://www.MyDomain.com/fr/GB/
http://www.MyDomain.com/de/GB/
http://www.MyDomain.com/es/GB/
Obviously, for SEO only the first version matters.
Another example: the French site is also available in 5 languages:
http://www.MyDomain.com/fr/FR/
http://www.MyDomain.com/en/FR/
http://www.MyDomain.com/it/FR/
http://www.MyDomain.com/de/FR/
http://www.MyDomain.com/es/FR/
And so on. This is creating three main issues:
- Endless crawling, with crawlers not focusing on the most important pages
- Duplicate content
- Wrong geo-targeted URLs ranking in Google
I have already implemented hreflang but haven't noticed any improvements. Therefore my question is:
Should I exclude the inappropriately targeted versions with robots.txt and noindex?
For example, for the UK leave crawlable just the English version (http://www.MyDomain.com/en/GB/), for France just the French version (http://www.MyDomain.com/fr/FR/), and so on.
What I would like to achieve with this is to keep crawlers focused on the important SEO pages, avoid content duplication, and stop the wrong URLs from ranking on local Google.
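For reference, a minimal robots.txt sketch of what I mean (hypothetical, based on the folder structure above). One caveat I'm aware of: Google has to crawl a page to see a noindex tag, so blocking a folder in robots.txt and also relying on noindex for the same URLs would be contradictory; it would have to be one or the other per folder.

```text
# robots.txt (sketch) - block the non-local language versions of the GB store
User-agent: *
Disallow: /it/GB/
Disallow: /fr/GB/
Disallow: /de/GB/
Disallow: /es/GB/
# ...and likewise for the other 20 countries
```

The per-page alternative would be leaving those versions crawlable but adding a meta robots tag such as `<meta name="robots" content="noindex, follow">` to them.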
Please comment
-
-
Hey Guido, I don't know if it's the best solution, but it could work as a temporary fix until the best solution is in place. I suggest moving forward with proper hreflang tagging, or else deleting those irrelevant languages altogether. Try what I said before: verify each country/language folder and submit a sitemap.xml reflecting that folder, so you can see crawl and index stats per country/language. Add a sitemap index and, of course, verify your entire domain. Also block unnecessary folders in robots.txt (images, JS libraries, etc.) to save your domain's crawl budget.
Let me know if you have any other questions.
-
Thank you Antonio, insightful and clear.
There is really no need for EN versions of the localized sites; I think it was done that way because it was easier to implement (the original site is EN-US).
Don't you think that using robots.txt and noindex on the EN versions of the localized sites could be the best solution? It's certainly the easiest one to implement without affecting UX.
-
I don't know why you have a UK-oriented site for German and Italian speakers; those languages don't seem important in a mainly English-speaking country (unlike the US, for example, where you should have a Spanish version, or Canada, where you need English and French). The owner must have their reasons.
Besides this, about your questions:
- If those non-relevant languages must stay, implementing hreflang is correct (it may take some time to show results). Also, if the domain is a gTLD, you can verify each subfolder in Google Search Console and set the proper international targeting for it. With that number of languages and countries, I imagine this might be a pain in the ***.
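As an illustration of what that tagging involves (a sketch only, using the /language/COUNTRY/ URL pattern from the question; the x-default choice is an assumption based on the original site being EN-US), every version of a page references all of its alternates plus itself:

```html
<!-- hypothetical hreflang set for the homepage; a real page needs all language-country combinations -->
<link rel="alternate" hreflang="en-gb" href="http://www.MyDomain.com/en/GB/" />
<link rel="alternate" hreflang="it-gb" href="http://www.MyDomain.com/it/GB/" />
<link rel="alternate" hreflang="fr-fr" href="http://www.MyDomain.com/fr/FR/" />
<link rel="alternate" hreflang="en-fr" href="http://www.MyDomain.com/en/FR/" />
<!-- ...remaining combinations omitted... -->
<link rel="alternate" hreflang="x-default" href="http://www.MyDomain.com/en/US/" />
```

The same set must appear, identically, on every alternate version; mismatched or one-way annotations are a common reason hreflang appears not to work.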
- About the crawling: for large sites I recommend crawling per language, or if necessary per language-country. In this case I'd create an XML sitemap per language or language-country containing just the HTML pages (ideally updated dynamically by the e-commerce platform), create a sitemap index in the root of the domain, and submit them in Google Search Console (better if you have verified the language or language-country folders). With this you can tell whether a given language or country is failing to get indexed, using the Submitted/Indexed statistics in GSC.
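A sketch of that sitemap index (the file names are hypothetical; the sitemaps.org schema URL is the standard one):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>http://www.MyDomain.com/sitemap-en-GB.xml</loc></sitemap>
  <sitemap><loc>http://www.MyDomain.com/sitemap-fr-FR.xml</loc></sitemap>
  <sitemap><loc>http://www.MyDomain.com/sitemap-en-US.xml</loc></sitemap>
  <!-- ...one child sitemap per language-country folder... -->
</sitemapindex>
```

Submitting the child sitemaps individually in Google Search Console is what gives you the per-folder Submitted/Indexed counts mentioned above.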
- robots.txt might save your crawl budget, but I'm not a fan of de-indexing unless those folders are truly irrelevant (after all, there may well be Italians living in the UK). If you can't delete the irrelevant languages for some countries, this can be an option.