Multiple Countries, Same Language: Receiving Duplicate Page & Content Errors
-
Hello!
I have a site that serves three English-speaking countries, using subfolders for each country version:
- United Kingdom: https://site.com/uk/
- Canada: https://site.com/ca/
- United States & other English-speaking countries: https://site.com/en/
The version displayed depends on where the user is located, and users can also change the country version using a drop-down flag element in the navigation bar. If a user switches versions using the flag, the first URL of the new country version includes a "?language=" parameter.
In the Moz crawl diagnostics report, this site is getting dinged for lots of duplicate content because the crawler is finding both versions of each country's site, with and without the language parameter.
However, the site has rel="canonical" tags set up on both URL versions and none of the URLs containing the "?language=" parameter are getting indexed.
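For reference, here's roughly how the canonicals are set up (the parameter value below is just a placeholder — only the "?language=" name is what actually appears):

```html
<!-- Served on both https://site.com/ca/ and the parameterized version
     (e.g. https://site.com/ca/?language=en — hypothetical value);
     both point search engines at the clean, parameter-free URL -->
<link rel="canonical" href="https://site.com/ca/">
```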
So...my questions:
1. Are the Duplicate Title and Content errors found by the Moz crawl diagnostic really an issue?
2. If they are, how can I best clean this up?
Additional notes: the site currently has no sitemaps (XML or HTML) and is not yet using the hreflang attribute. I intend to create sitemaps for each country version (a sketch of one with hreflang annotations follows this list), like:
- .com/en/sitemap.xml
- .com/ca/sitemap.xml
- .com/uk/sitemap.xml
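A minimal sketch of what the UK sitemap could look like once hreflang annotations are added — the hreflang values (en-gb, en-ca, and en as the US/rest-of-world default) are assumptions on my part, and only the homepage entry is shown:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>https://site.com/uk/</loc>
    <!-- every <url> entry lists all of its alternates, including itself -->
    <xhtml:link rel="alternate" hreflang="en-gb" href="https://site.com/uk/"/>
    <xhtml:link rel="alternate" hreflang="en-ca" href="https://site.com/ca/"/>
    <xhtml:link rel="alternate" hreflang="en" href="https://site.com/en/"/>
  </url>
</urlset>
```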
I thought about putting rel="nofollow" on the flag navigation links, but since no sitemaps are in place I didn't want to accidentally cut off crawler access to the alternate versions.
Thanks for your help!
-
Yep, given your resource constraints, I'd focus on translations for now. If you ever get to a point where something bigger than price differentiates your content, then you can think about geo-targeting. You will need the resources to actually differentiate the content, though.
My recommendation right now is to drop the country-specific content and just offer English. Your content can rank for any English-language search, regardless of country. However, if the terms people use in the US, UK, and Canada differ that much, you can "translate" the content (en-us, en-gb, en-ca) and use the hreflang attribute.
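To make that concrete, a minimal sketch of on-page hreflang annotations — I'm assuming en-gb/en-ca/en-us values and a placeholder /page/ path:

```html
<!-- Placed in the <head> of every country version of the same page;
     x-default tells search engines which version to show everyone else -->
<link rel="alternate" hreflang="en-gb" href="https://site.com/uk/page/">
<link rel="alternate" hreflang="en-ca" href="https://site.com/ca/page/">
<link rel="alternate" hreflang="en-us" href="https://site.com/en/page/">
<link rel="alternate" hreflang="x-default" href="https://site.com/en/page/">
```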
Price changes are trickier. Do you offer the price in search results via schema markup, and does it actually show up? If not, you can use cookies to set prices based on the country the person chooses (try not to rely on IP address, and if you do, make people confirm the setting).
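If you do have price markup, it looks something like this — a hypothetical product using schema.org microdata, where the rendered price is whatever the cookie-selected country version serves:

```html
<div itemscope itemtype="https://schema.org/Product">
  <span itemprop="name">Example Product</span>
  <div itemprop="offers" itemscope itemtype="https://schema.org/Offer">
    <!-- price shown for the UK version; other versions would render
         their own cookie-selected currency and amount -->
    <meta itemprop="priceCurrency" content="GBP">
    <span itemprop="price" content="49.99">£49.99</span>
  </div>
</div>
```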
For now, focus your time and effort on getting the flow right for the user. Only worry about hreflang if your English content needs to be differentiated by term usage. Then focus on getting those upcoming translations right; when they are ready, that's the point to really lean on hreflang.
Hope that helps!
-
Hi Kate,
Nifty quiz and flowchart! Thanks for sharing it. All the countries targeted are English-speaking, though further expansion to non-English-speaking countries is planned for 2015. Here are the answers to the questions:
1. Does your business/product/content change in different countries?
A: Not really. 90% of the products are available in all three countries; only one country currently lacks the remaining 10%, and those products will launch there in 2015.
2. Would it make sense to an international visitor to see different site content? (ex. currency, localization, etc.)
A: Currency - yes. Otherwise, not really.
3. Do you have the resources to differentiate the content?
A: Not currently. This is a set of branded products, and the product descriptions use extensive "on-brand" language.
4. Are there multiple official languages for any of these countries?
A: Yes, Canada's official languages are English and French. There is no French version currently available.
5. Do you plan on offering the site content in all official languages?
A: Next six months - no. Late 2015 - maybe.
Going through the quiz, if I answer:
1. No, 2. Yes, 3. No
This is the recommendation:
Your International Strategy is:
Translate Only
- Don’t machine translate; while manual translation is costly, it’s the best option for your goals.
- Put your HREFLANG in XML sitemaps.
- Use the Language Meta tag for Bing translation targeting (see the sketch after this list).
- Don’t use a ccTLD. That is for Geo-Targeting only.
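For reference, a minimal sketch of the language meta tag from the third bullet (the en-gb value assumes the UK version):

```html
<!-- In the <head>; Bing has historically used this tag to determine
     a page's language/region target -->
<meta http-equiv="content-language" content="en-gb">
```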
Aside from the manual translation portion, do you think #2 and #3 are still the best solutions for this situation?
Thanks for your help!
-
Hi!
This is a tough one because I can't tell if you mean to geo-target or to translate. It's not always one or the other, but it usually is when you're only targeting English-speaking countries. Can you do me a favor and go to http://www.katemorris.com/issg/ and go through the questions there? Let me know what the "answer" is for your situation and I'll help you get to the right solution.
But in short, yes, the duplicate content is a real issue with or without the lang parameter.
Let me know!
-
Oh, this is a tough one. The problem is that no matter the tags and language settings, the content is the same. The report flags duplicate content because it is duplicate content. Duplicate content within your own site is serious, especially if you are trying to target keywords on those pages.
The hreflang tags should help you serve the right language versions without so many duplicate pages. I don't have much experience with that tag, but my advice would be to look into it further to help with your duplicate content issue. Nofollowing the duplicate pages will ultimately affect their rankings, so that probably isn't the best thing to do.