How critical are duplicate content warnings?
-
Hi,
So I have created my first campaign here and I have to say the tools, user interface and the on-page optimization, everything is useful and I am happy with SEOMOZ.
However, the crawl report returned thousands of errors and most of them are duplicate content warnings.
As we use Drupal as our CMS, the duplicate content is caused by Drupal's pagination problems. Let's say there is a page called "/top5list"; the crawler decided "/top5list?page=1" to be a duplicate of "/top5list". There is no real solution for pagination problems in Drupal (as far as I know).
I don't have any warnings in Google's Webmaster Tools regarding this, and the sitemap I submitted to Google doesn't include those problematic deep pages (the ones detected as duplicate content by the SEOMOZ crawler).
So my question is, should I be worried about the thousands of error messages in crawler diagnostics?
Any ideas appreciated.
-
Personally, I'd keep an eye on it. These things do have a way of expanding over time, so you may want to be proactive. At the moment, though, you probably don't have to lose sleep over it.
-
Thanks for that command, Dr. Meyers. Apparently, only 5 such pages are indexed. I suppose I shouldn't worry about this then?
-
One clarification on Vahe's answer - if these continue (?page=2, ?page=3, etc.) then it's traditional pagination. You could use the GWT solution Adam mentioned, although, honestly, I find it's hit-or-miss. It is simpler than the other solutions. The "ideal" Google solution is very hard to implement (and I actually have issues with it). The other option is to META NOINDEX the variants, but that would take adjusting the template code dynamically.
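To make the META NOINDEX option a bit more concrete, here is a rough Python sketch of the decision the template would have to make. It is not Drupal code: the function name `robots_meta` and the choice of "noindex, follow" are assumptions for illustration, and a real site would hook this logic into its own templating layer.

```python
from urllib.parse import urlsplit, parse_qs


def robots_meta(url, param="page"):
    """Return a robots meta tag for paginated variants, or "" otherwise.

    `param` is the query parameter that marks a paginated variant
    (assumed here to be "page", as in /top5list?page=1).
    """
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    if param in query:
        # Keep the variant out of the index but let crawlers follow its links.
        return '<meta name="robots" content="noindex, follow">'
    return ""


print(robots_meta("http://example.com/top5list?page=2"))
print(repr(robots_meta("http://example.com/top5list")))
```

The base URL gets no tag, so only the parameterized variants are affected.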
If it's just an issue of a bunch of "page=1" duplicates, and this isn't "true" pagination, then canonical tags are probably your best bet. There may be a Drupal plug-in or fix - unfortunately, I don't have much Drupal experience.
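As a minimal sketch of the canonical-tag idea, the snippet below strips a duplicate-producing parameter from a URL and builds the matching tag. The helper names and the assumption that "page" is the only parameter to drop are mine, not anything Drupal provides; the code also discards any URL fragment.

```python
import html
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def canonical_url(url, drop_params=("page",)):
    """Return `url` without the listed query parameters (fragment dropped)."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in drop_params
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url):
    """Build the <link rel="canonical"> tag for the page's <head>."""
    href = html.escape(canonical_url(url), quote=True)
    return '<link rel="canonical" href="%s">' % href


print(canonical_link_tag("http://example.com/top5list?page=1"))
```

Every "page=" variant then points back at the same base URL, which is the behaviour you would want only if the variants really are duplicates rather than true pagination.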
The question is whether these pages are being indexed by Google, and how many of them there are. At large scale, these kinds of near-duplicates can dilute your index, harm rankings, and even contribute to Panda issues. At smaller scale, though, they might have no impact at all. So, it's not always clear cut, and you have to work the risk/cost calculation.
You can run a command in Google like:
site:example.com inurl:page=
...and try to get a sense of how much of this content is being indexed.
The GWT approach won't hurt, and it's fine to try. I just find that Google doesn't honor it consistently.
-
Thanks Adam and Vahe. Your suggestions are definitely helpful.
-
For pagination problems, it would be better to use this canonical method: http://googlewebmastercentral.blogspot.com.au/2012/03/video-about-pagination-with-relnext-and.html .
Having duplicate content in the form of paginated results will not penalise you; rather, the page/link equity will be split between all these pages. This means you would need to spend more time and energy on the original page to outrank your competitors.
To see these errors in Google Webmaster Tools, you should go to the HTML Suggestions area, where it reviews the site's meta data. I'm sure you'll find the same issues there, rather than under the sitemaps.
So to improve the overall health of your website, I would suggest that you do try and verify this issue.
Hope this helps. Any issues, best to contact me directly.
Regards,
Vahe
-
OK, this is just what I've done, and it might not work for everyone.
As far as I can tell, the duplicate content warnings do not hurt my rankings. When I first signed up for SEOMoz they really alarmed me. If they are hurting my rankings, it's not by much, as we perform well for many competitive keywords in our industry, and our website traffic has been growing ~20% year over year for many years now.
The fix for auto-generated duplicate content on our site (which I inherited as my responsibility when I started at my company) would be very expensive. It's something I plan on doing eventually along with some other overhauls, but right now it's not in the budget, because it would basically involve re-architecting how the site and databases function on the back end (ugh).
So, in order to help mitigate any issues and help keep Google from indexing all the duplicate content that can be generated by our system, I use the "URL Parameters" setting in Google Webmaster Tools (under Site Configuration). I've set up a few parameters for Google to specifically NOT INDEX, to keep the duplicate content out of the search engine. I've also set some parameters to specifically reinforce content I want indexed (along with including the original content in sitemaps I've curated myself, rather than relying on auto-generated sitemaps potentially polluted with duplicate content).
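For the curated-sitemap part, something along these lines would do the filtering. This is only a sketch: the `BLOCKED_PARAMS` set and the function names are hypothetical and not tied to any particular CMS, and the parameter names would need to match whatever your system actually generates.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit, parse_qs

# Hypothetical list of parameters that only produce duplicate pages.
BLOCKED_PARAMS = {"page", "sort"}


def is_clean(url):
    """True if the URL carries none of the duplicate-producing parameters."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    return BLOCKED_PARAMS.isdisjoint(query)


def build_sitemap(urls):
    """Return sitemap XML containing only the clean, de-duplicated URLs."""
    urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
    for url in sorted(set(filter(is_clean, urls))):
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


print(build_sitemap([
    "http://example.com/top5list",
    "http://example.com/top5list?page=1",
]))
```

Only the original URL ends up in the sitemap; the parameterized variant is left out.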
My thinking is that while Roger the SEOMoz bot is still finding this stuff and generating warnings, Googlebot is not.
I don't work at an agency - I'm in-house and I've had to learn everything by trial and error, and I often fly by the seat of my pants with this sort of thing. So my conclusions/solutions may be wrong or not work for you, but they seem to work for me.
It's a band-aid fix at best, but it seems to be better than nothing!
Hope this helps,
-Adam