Why does the SEOmoz bot see duplicate pages even though I am using the canonical tag?

fablau

Hello here,

Today the SEOmoz bot flagged the following pages on my website as "duplicate content":

http://www.virtualsheetmusic.com/score/PatrickCollectionFlPf.html?tab=mp3

http://www.virtualsheetmusic.com/score/PatrickCollectionFlPf.html?tab=pdf

I am wondering why, considering that both of those pages carry a canonical tag pointing to the main product page below:

http://www.virtualsheetmusic.com/score/PatrickCollectionFlPf.html

Shouldn't the SEOmoz bot follow the canonical directive and not report those two pages as duplicates?

Thank you for any insights I am probably missing here!

fablau

Thank you Peter, I got your ticket reply.

That makes perfect sense, and it matches what Dr. Peter pointed out on a different thread where I was discussing this issue further:

http://www.seomoz.org/q/why-seomoz-bot-consider-these-as-duplicate-pages

I was simply confused by your report.

Thank you again for your help. I hope you will improve your report interface to avoid this kind of confusion in the future.

Best,

Fabrizio

Peterli

Hi there,

Thanks for reaching out to us. I replied to you in a support ticket, but I wanted to share the answer with everyone since I think it is relevant to this discussion.

I looked into your campaign, and it seems that this is happening because of where your canonical tags are pointing. You can see the duplicate pages by clicking on the number to the right side of the link. These pages are considered duplicates of each other because their canonical tags point to different URLs. For example:

http://www.virtualsheetmusic.com/score/PatrickCollectionFlPf.html?tab=mp3 (Duplicate 1) is considered a duplicate of
http://www.virtualsheetmusic.com/score/PatrickCollectionVcPf.html?tab=mp3 (Duplicate 2)

because the canonical tag for the first page is CANON1 (http://screencast.com/t/tqvDZrLsyz8D) while the canonical for the second URL is CANON2 (http://screencast.com/t/FOguPJmK0).

Since the canonical tags point to different pages, our system assumes that CANON1 and CANON2 are likely to be duplicates themselves.

Here is how our system interprets duplicate content vs. rel canonical:

Assuming A, B, C, and D are all duplicates,

If A references B as canonical, then A and B are not considered duplicates
If A and B both reference C as canonical, then A and B are not considered duplicates of each other
If A references C as canonical (and B does not), then A and B are considered duplicates
If A references C as canonical and B references D, then A and B are considered duplicates

The examples you've provided fall under the fourth case listed above.
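
For anyone who wants to check their own pages against the four cases above, here is a minimal Python sketch. It is not SEOmoz's actual implementation, only one way to reproduce the four stated outcomes. The function names and the `canonical` mapping are hypothetical, and the sketch assumes that a page without a canonical tag counts as its own canonical and that the pages being compared have already been judged to have duplicate content.

```python
from typing import Dict, Optional


def effective_canonical(url: str, canonical: Dict[str, Optional[str]]) -> str:
    """Return the URL a page declares as canonical, or the page itself if none."""
    target = canonical.get(url)
    return target if target else url


def reported_as_duplicates(a: str, b: str, canonical: Dict[str, Optional[str]]) -> bool:
    """Assume a and b already have duplicate content.

    They are reported as duplicates unless both resolve to the same canonical URL.
    """
    return effective_canonical(a, canonical) != effective_canonical(b, canonical)


# The four cases from the list above (A, B, C, D all have duplicate content).
case_1 = reported_as_duplicates("A", "B", {"A": "B"})            # A -> B
case_2 = reported_as_duplicates("A", "B", {"A": "C", "B": "C"})  # A -> C, B -> C
case_3 = reported_as_duplicates("A", "B", {"A": "C"})            # A -> C, B has none
case_4 = reported_as_duplicates("A", "B", {"A": "C", "B": "D"})  # A -> C, B -> D

print(case_1, case_2, case_3, case_4)  # prints: False False True True
```

In this sketch the two `?tab=mp3` pages behave like case 4: each has a canonical tag, but the tags point to different URLs, so the pair is still reported.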

Hope that helps,

Best,

Peter
SEOmoz Help Team.

fablau

Thinking about this further, I don't see how these pages can be considered near duplicates, since their content is quite different:

http://www.virtualsheetmusic.com/score/PatrickCollectionFlPf.html?tab=mp3

http://www.virtualsheetmusic.com/score/PatrickCollectionFlPf.html?tab=pdf

Thoughts??!!

fablau

Can nobody tell me why SEOmoz ignores my canonical tag definitions? According to some comments on the following thread:

http://www.seomoz.org/blog/visualizing-duplicate-web-pages

It should actually ignore pages with a canonical tag and NOT mark them as duplicates, but in my experience (as explained above), that has not been the case.

fablau

OK, thank you, now I get the point. Then here is my next question: is there a way to tell the SEOmoz bot to ignore duplicate pages that have a defined canonical tag? If not, the SEOmoz duplicate page report is useless for me. I am not interested in hearing about duplicate pages for which I have already defined a canonical tag.

Thanks!

Highland

Canonical lets you pick which of the duplicates will be indexed. But Google still has to crawl the other pages when they could be crawling other parts of your site. It's an opportunity cost. If you can accept slower crawls, you can ignore the issue.

fablau

I am sorry, but I don't understand your point. If two pages are similar, we can use the canonical tag to "consolidate" them and avoid duplicate issues. Am I right? If not, what are canonical tags for?

Highland

While I agree that SEOmoz should better categorize duplicates that have a canonical tag, the reason they still report them as duplicates is crawl budget. Remember, Google still has to crawl these duplicate pages when it could be crawling something else instead. Canonical only helps by letting you pick which duplicate gets indexed. It's better to not have duplicate content at all than to have canonicalized duplicates.
