Duplicate content on a multi-language Joomla website
-
Dear SEOmoz Community,
I am running a multi-language Joomla website (www.siam2nite.com) with two active languages.
The first and primary language is English; the second language is Thai. Most of the content (articles, event descriptions ...) is in English only.
We translated the navigation bars, headers, titles, etc. into Thai (a translation of all Joomla language files). Those texts are static and only help users navigate and understand our site in Thai.
Now I am facing a problem with duplicate content. Let's take our Q&A component as an example.
The URL structure looks like this:
English - www.siam2nite.com/en/questions/
Thai - www.siam2nite.com/th/questions/
Every question asked creates two URLs, one for each language. The content itself (user questions & answers) is identical on both URLs; only the GUI language is different. If you take a look at this question you will understand what I mean:
ENGLISH VERSION:
http://www.siam2nite.com/en/questions/where-to-celebrate-halloween-in-bangkok
THAI VERSION:
http://www.siam2nite.com/th/questions/where-to-celebrate-halloween-in-bangkok
As you can see, each page has a unique title (H1) and introduction text in the correct language (same for the menu, buttons, etc.), but the questions and answers are only available in one language.
Now my question:
I guess Google will see these pages as duplicate content. How should I proceed with this problem:
- put all Thai links /th/questions/ in the robots.txt and block them
or
- set a canonical tag pointing to the English versions?
I am not sure whether Google will still index the Thai title and introduction texts if I set a canonical tag (they have important Thai keywords in them).
I would really appreciate your help on this.
Regards,
Menelik
-
Hi John
Sorry for my late response ;-(
Thank you very much for your help. I added a rel=alternate for the Thai version as well. So far it looks good - no duplicated content.
Regards,
Menelik
-
The Google Webmaster Tools setup sounds right to me!
You should set rel=alternate on all pages, pointing in both directions, not just on the English pages. That way, if Google is about to return a Thai page to an English searcher, it will know to reference the English page instead. This is the setup Google recommends in its help documentation.
Don't worry about a new sitemap for the /th/ pages. Your current setup should be fine.
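As a rough illustration of the back-and-forth setup described above, here is a minimal Python sketch that builds the same set of rel=alternate tags for every language version of a page. It assumes the /en/ and /th/ URL prefixes from the question. The function name hreflang_links and the constants are made up for this example and are not part of Joomla.

```python
from __future__ import annotations

from html import escape

BASE_URL = "http://www.siam2nite.com"  # site from the question
LANGUAGES = ("en", "th")               # language prefixes used in its URLs


def hreflang_links(path: str) -> list[str]:
    """Return one <link rel="alternate"> tag per language for a page.

    `path` is the part of the URL after the language prefix, e.g. "magazine".
    The same list is meant to go into the <head> of every language version,
    so the English and Thai pages reference each other.
    """
    path = path.strip("/")
    tags = []
    for lang in LANGUAGES:
        url = f"{BASE_URL}/{lang}/{path}"
        tags.append(
            f'<link rel="alternate" hreflang="{lang}" '
            f'href="{escape(url, quote=True)}">'
        )
    return tags


if __name__ == "__main__":
    # Tags to emit on both /en/magazine and /th/magazine.
    for tag in hreflang_links("magazine"):
        print(tag)
```

Emitting the full list on both versions, including a tag that points at the page itself, keeps the annotations symmetric, which is the point of setting them on all pages rather than only the English ones.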
-
Hi John
Thank you very much for your answer. I did not know about the rel=alternate tag until today.
Following your advice, I modified the Joomla header, and now every English page /en/... has a rel=alternate link to the Thai version.
For example:
http://www.siam2nite.com/en/magazine now has the following tag:
<link href="http://www.siam2nite.com/th/magazine" hreflang="th" rel="alternate">
Regarding the webmaster help page (the link you mentioned): I do not need to set a tag on the Thai pages targeting the English ones, correct? Just one rel=alternate on the English pages should be enough?
I tried to follow your advice with Google Webmaster Tools as well. My current configuration looks like this:
My old, already existing site:
1. Site: www.siam2nite.com (no geo-targeting)
Today I created a new one:
2. Site: www.siam2nite.com/th/ (geo-targeting: Thailand)
Is this the setup you meant in your answer?
I did not submit a sitemap for the 2nd site, as all links (Thai and English) are already included in the sitemap I use for the 1st site. Should I split my old sitemap and submit one for each site containing only the correct language links?
Thank you very much for your kind support - I really appreciate it.
-
The proper way to handle this is with rel=alternate hreflang tags. These tell Google the content is the same, but in different languages. See http://support.google.com/webmasters/bin/answer.py?hl=en&answer=189077 for more info. You can place the link tags in the head of each page, or declare the alternates in your sitemap.
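For the sitemap option, here is a minimal sketch, again assuming the /en/ and /th/ prefixes from the question. Each url entry lists every language version, including itself, as an xhtml:link alternate. build_sitemap is a hypothetical helper written for this example, not an existing Joomla or Google tool.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"
BASE_URL = "http://www.siam2nite.com"  # site from the question
LANGUAGES = ("en", "th")               # language prefixes used in its URLs


def build_sitemap(paths):
    """Build a sitemap string with hreflang alternates for each path.

    `paths` holds URL parts after the language prefix, e.g. ["magazine"].
    """
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for path in paths:
        path = path.strip("/")
        versions = {lang: f"{BASE_URL}/{lang}/{path}" for lang in LANGUAGES}
        # One <url> entry per language version of the page.
        for lang in LANGUAGES:
            url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
            ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = versions[lang]
            # Every entry lists all versions, itself included.
            for alt_lang, alt_url in versions.items():
                ET.SubElement(
                    url,
                    f"{{{XHTML_NS}}}link",
                    {"rel": "alternate", "hreflang": alt_lang, "href": alt_url},
                )
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap(["magazine", "questions"]))
```

With this approach the page templates stay untouched, which may be easier than modifying the Joomla header, but the sitemap then has to be regenerated whenever pages are added.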
Other things you can do to help search engines get it right are to set up a profile in Google Webmaster Tools for each of the directories (or at least for the Thai one) and set the geotargeting. Bing prefers that you set the country and language on each page (see here).
If you block the pages with robots.txt or use canonical tags, you're telling Google not to include those pages in SERPs. It sounds like you want the Thai pages to appear in Thai results, and the English pages in English SERPs, so I wouldn't do that.
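To see what the robots.txt option from the question would do in practice, this sketch uses Python's standard urllib.robotparser with a hypothetical rule set (the Disallow line is an assumption based on the first option in the question, not the site's real robots.txt). It only shows which URLs a compliant crawler would be allowed to fetch.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules matching the first option in the question.
ROBOTS_TXT = """\
User-agent: *
Disallow: /th/questions/
"""


def is_crawlable(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the hypothetical robots.txt above allows fetching `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    base = "http://www.siam2nite.com"
    slug = "questions/where-to-celebrate-halloween-in-bangkok"
    for lang in ("en", "th"):
        url = f"{base}/{lang}/{slug}"
        print(url, "crawlable:", is_crawlable(url))
```

Under these rules the Thai question pages cannot be crawled at all, so their Thai titles and introduction texts would not be read, which is the outcome the answer above advises against.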