Multilingual site with untranslated content
-
We are developing a site that will have several languages.
There will be several thousand pages, the default language will be English. Several sections of the site will not be translated at first, so the main content will be in English but navigation/boilerplate will be translated.
We have hreflang alternate tags set up for each individual page pointing to each of the other languages, eg in the English version we have:
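For illustration, a minimal sketch of what that markup looks like on the English page (example.com and the /es/ and /fr/ directories are just placeholders for our real structure):

<link rel="alternate" hreflang="en" href="http://www.example.com/page.html" />
<link rel="alternate" hreflang="es" href="http://www.example.com/es/page.html" />
<link rel="alternate" hreflang="fr" href="http://www.example.com/fr/page.html" />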
etc
In the Spanish version, we would point to the French version and the English version, etc.
My question is: is this sufficient to avoid a duplicate content penalty from Google for the untranslated pages?
I am aware that from a user perspective, having untranslated content is bad, but in this case it is unavoidable at first.
-
Thanks for your comments Gianluca.
I think Google's guidelines are somewhat ambiguous. Here it does state that "if you're providing the same content to the same users on different URLs (for instance, if both example.de/ and example.com/de/ show German language content for users in Germany), you should pick a preferred version and redirect (or use the rel=canonical link element) appropriately."
https://support.google.com/webmasters/answer/182192?hl=en
I think you've explained it nicely though.
-
At first that would be fine.
That said, this is a very specific case where you can use both hreflang and cross-domain rel="canonical".
Remember, though, that these two kinds of markup are totally independent of each other.
If you use them both, as I wrote in my reply to Yusuf, on one side you are telling Google that you want it to show a given URL for a given geo-targeted country/language, and on the other side you are also telling Google that that geo-targeted URL is an exact copy of the canonical one.
What Google will do is show the geo-targeted URL in the SERPs, but with the title and meta description of the canonical one.
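A sketch of what that looks like, with made-up URLs, on the geo-targeted Spanish page whose content is still the exact English copy (note the hreflang here is en-ES, not es-ES, for the reason I explain below):

<link rel="canonical" href="http://www.example.com/page.html" />
<link rel="alternate" hreflang="en" href="http://www.example.com/page.html" />
<link rel="alternate" hreflang="en-ES" href="http://www.example.com/es/page.html" />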
One more thing, and this is a strong reason to push for a complete translation in a short period of time:
if the content of the URL on the French site, for instance, is in English, you cannot put "fr-FR" in the hreflang, but rather "en-FR". The consequence is that the URL will tend to be shown only for English queries done on Google.fr, not for French queries... and that means losing a lot of traffic opportunities.
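In other words, a sketch with made-up URLs of the annotation for an untranslated page on the French site:

<link rel="alternate" hreflang="en" href="http://www.example.com/page.html" />
<link rel="alternate" hreflang="en-FR" href="http://www.example.fr/page.html" />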
-
Yusuf,
I'm sorry, but I have to correct you.
If two pages are in the same language but targeting different countries (e.g. the USA and the UK), even if the content is the same or substantially the same, then you not only can use the hreflang, you should use it, in order to tell Google that one URL must be shown to US users and the other to UK ones.
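For example, a rough sketch (hypothetical URLs) of the reciprocal annotation that both the US and UK pages would carry:

<link rel="alternate" hreflang="en-US" href="http://www.example.com/page.html" />
<link rel="alternate" hreflang="en-GB" href="http://www.example.co.uk/page.html" />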
Obviously, if you want, you can always decide to use the cross-domain rel="canonical" instead.
Remember, though, that in that case - if you are using the hreflang as well - Google will show the snippet components (title and meta description) of the canonical URL, even though it will show the geo-targeted URL. If instead you opt not to use the hreflang, people will see the canonical URL's snippet (web address included).
-
Have you taken a look through the following:
https://support.google.com/webmasters/answer/182192?hl=en#1
https://sites.google.com/site/webmasterhelpforum/en/faq-internationalisation
"
Duplicate content and international sites
Websites that provide content for different regions and in different languages sometimes create content that is the same or similar but available on different URLs. This is generally not a problem as long as the content is for different users in different countries. While we strongly recommend that you provide unique content for each different group of users, we understand that this may not always be possible. There is generally no need to "hide" the duplicates by disallowing crawling in a robots.txt file or by using a "noindex" robots meta tag. However, if you're providing the same content to the same users on different URLs (for instance, if both
example.de/
andexample.com/de/
show German language content for users in Germany), you should pick a preferred version and redirect (or use the rel=canonical link element) appropriately. In addition, you should follow the guidelines on rel-alternate-hreflang to make sure that the correct language or regional URL is served to searchers." -
Hi Jorge
The rel="alternate" hreflang="x" tag is not suitable for pages that are in the same language as these are essentially duplicates rather than alternative language versions.
I'd use the rel="canonical" tag to point to the main page until the translations of those pages are available.
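For example, a rough sketch (placeholder URLs) of what an untranslated Spanish-directory page could carry until its translation goes live:

<link rel="canonical" href="http://www.example.com/page.html" />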
Webmaster Tools should allow you to see any issues.