How critical are duplicate content warnings?
-
Hi,
So I have created my first campaign here, and I have to say the tools, the user interface, and the on-page optimization are all useful. I am happy with SEOMOZ.
However, the crawl report returned thousands of errors, and most of them are duplicate content warnings.
As we use Drupal as our CMS, the duplicate content is caused by Drupal's pagination. Let's say there is a page called "/top5list"; the crawler considers "/top5list?page=1" to be a duplicate of "/top5list". There is no real solution for pagination problems in Drupal (as far as I know).
I don't have any warnings in Google's Webmaster Tools regarding this, and the sitemap I submitted to Google doesn't include those problematic deep pages (the ones detected as duplicate content by the SEOMOZ crawler).
So my question is, should I be worried about the thousands of error messages in crawler diagnostics?
Any ideas appreciated.
-
Personally, I'd keep an eye on it. These things do have a way of expanding over time, so you may want to be proactive. At the moment, though, you probably don't have to lose sleep over it.
-
Thanks for that command, Dr. Meyers. Apparently, only 5 such pages are indexed. I suppose I shouldn't worry about this, then?
-
One clarification on Vahe's answer - if these continue (?page=2, ?page=3, etc.), then it's traditional pagination. You could use the GWT solution Adam mentioned, although, honestly, I find it's hit-or-miss. It is simpler than the other solutions. The "ideal" Google solution is very hard to implement (and I actually have issues with it). The other option is to META NOINDEX the variants, but that would take adjusting the template code dynamically.
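To make the META NOINDEX option concrete, here is a minimal sketch of the decision the template code would have to make. It is written in Python rather than Drupal's own templating, and the function name robots_meta_for and the "noindex, follow" value are assumptions for illustration, not something taken from Drupal or from this thread.

```python
from urllib.parse import urlsplit, parse_qs


def robots_meta_for(url: str, param: str = "page") -> str:
    """Return a robots meta tag for paginated variants, or "" for the main page.

    Hypothetical helper: a real Drupal site would do this in its theme or a module.
    """
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    if param in query:
        # The variant stays crawlable (follow) but is kept out of the index.
        return '<meta name="robots" content="noindex, follow">'
    return ""


print(robots_meta_for("http://example.com/top5list?page=1"))
print(repr(robots_meta_for("http://example.com/top5list")))
```

The point is only that the tag has to be emitted conditionally, per URL, which is why a static template change is not enough.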
If it's just an issue of a bunch of "page=1" duplicates, and this isn't "true" pagination, then canonical tags are probably your best bet. There may be a Drupal plug-in or fix - unfortunately, I don't have much Drupal experience.
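For the canonical-tag route, a rough sketch of the logic might look like the following. It assumes the only duplicate-producing parameter is "page" and that every variant should point back at the parameter-free URL, which is only appropriate when this is not true pagination. The helper name canonical_link is made up for the example.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def canonical_link(url: str, drop_params=("page",)) -> str:
    """Build a canonical link tag that ignores the listed query parameters.

    Hypothetical helper; drop_params is an assumption about which
    parameters create duplicates on a given site.
    """
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in drop_params
    ]
    # Rebuild the URL without the dropped parameters and without any fragment.
    clean = urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
    return f'<link rel="canonical" href="{escape(clean, quote=True)}">'


print(canonical_link("http://example.com/top5list?page=1"))
```

The tag would go in the head of both "/top5list" and "/top5list?page=1", so both declare the same preferred URL.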
The question is whether these pages are being indexed by Google, and how many of them there are. At large scale, these kinds of near-duplicates can dilute your index, harm rankings, and even contribute to Panda issues. At smaller scale, though, they might have no impact at all. So, it's not always clear cut, and you have to work the risk/cost calculation.
You can run a command in Google like:
site:example.com inurl:page=
...and try to get a sense of how much of this content is being indexed.
The GWT approach won't hurt, and it's fine to try. I just find that Google doesn't honor it consistently.
-
Thanks Adam and Vahe. Your suggestions are definitely helpful.
-
For pagination problems, it would be better to use this canonical method: http://googlewebmastercentral.blogspot.com.au/2012/03/video-about-pagination-with-relnext-and.html .
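The linked post covers rel="next" and rel="prev". As a rough illustration of what those tags look like for a paginated series, here is a small sketch. It assumes a zero-based "page" parameter where the first page is the bare URL and "?page=1" is the second page; the function names are hypothetical, and a real site would generate these tags in its templates.

```python
def page_url(base_url: str, page: int, param: str = "page") -> str:
    """URL of a page in the series; page 0 is assumed to be the bare URL."""
    if page == 0:
        return base_url
    separator = "&" if "?" in base_url else "?"
    return f"{base_url}{separator}{param}={page}"


def pagination_links(base_url: str, page: int, last_page: int, param: str = "page") -> list:
    """Return the rel="prev"/rel="next" link tags for one page of the series."""
    tags = []
    if page > 0:
        tags.append(f'<link rel="prev" href="{page_url(base_url, page - 1, param)}">')
    if page < last_page:
        tags.append(f'<link rel="next" href="{page_url(base_url, page + 1, param)}">')
    return tags


for tag in pagination_links("http://example.com/top5list", 1, 2):
    print(tag)
```

The first page only gets a "next" tag and the last page only gets a "prev" tag; every page in between gets both.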
Having duplicate content in the form of paginated results will not penalise you; rather, the page/link equity will be split between all these pages. This means you would need to spend more time and energy on the original page to outrank your competitors.
To see these errors in Google Webmaster Tools, you should go to the HTML Improvements section, where it reviews the site's meta data. I'm sure you'll find the same issues there, rather than in the sitemaps.
So to improve the overall health of your website, I would suggest that you try to verify this issue.
Hope this helps. Any issues, best to contact me directly.
Regards,
Vahe
-
OK, this is just what I've done, and it might not work for everyone.
As far as I can tell, the duplicate content warnings do not hurt my rankings. When I first signed up for SEOMoz, they really alarmed me. If they are hurting my rankings, it's not by much - we perform well on many competitive keywords for our industry, and our website traffic has been growing ~20% year over year for many years now.
The fix for auto-generated duplicate content on our site (which I inherited as my responsibility when I started at my company) would be very expensive. It's something I plan on doing eventually along with some other overhauls, but right now it's not in the budget, because it would basically involve re-architecting how the site and databases function on the back end (ugh).
So, in order to help mitigate any issues and help keep Google from indexing all the duplicate content that can be generated by our system, I use the "URL Parameters" setting in Google Webmaster Tools (under Site Configuration). I've set up a few parameters for Google to specifically NOT INDEX, to keep the duplicate content out of the search engine. I've also set some parameters to specifically reinforce content I want indexed (along with including the original content in sitemaps I've curated myself, rather than having auto-generated sitemaps potentially polluted with duplicate content).
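The sitemap half of this can be partly automated. Below is a minimal sketch, assuming you already have a list of crawled URLs and know which query parameters produce duplicates; the function name sitemap_candidates and the "page" default are illustrative assumptions, not part of any Google or Moz tool.

```python
from urllib.parse import urlsplit, parse_qs


def sitemap_candidates(urls, blocked_params=("page",)):
    """Keep only URLs that carry none of the duplicate-producing parameters.

    Hypothetical helper; blocked_params should mirror whatever parameters
    you have told the search engine not to index.
    """
    keep = []
    for url in urls:
        query = parse_qs(urlsplit(url).query, keep_blank_values=True)
        if not any(param in query for param in blocked_params):
            keep.append(url)
    return keep


crawled = [
    "http://example.com/top5list",
    "http://example.com/top5list?page=1",
]
print(sitemap_candidates(crawled))
```

The output still needs a manual pass, since a filter like this cannot tell which of the remaining pages you actually want indexed.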
My thinking is that while Roger the SEOMoz bot is still finding this stuff and generating warnings, Googlebot is not.
I don't work at an agency - I'm in-house, and I've had to learn everything by trial and error, often flying by the seat of my pants with this sort of thing. So my conclusions/solutions may be wrong or not work for you, but they seem to work for me.
It's a band-aid fix at best, but it seems to be better than nothing!
Hope this helps,
-Adam