Why does the SEOmoz bot see duplicate pages even though I am using the canonical tag?
-
Hello,
Today the SEOmoz bot found the following pages on my website and marked them as "duplicate content":
http://www.virtualsheetmusic.com/score/PatrickCollectionFlPf.html?tab=mp3
http://www.virtualsheetmusic.com/score/PatrickCollectionFlPf.html?tab=pdf
I am wondering why, considering that both of those pages carry a canonical tag pointing to the main product page below:
http://www.virtualsheetmusic.com/score/PatrickCollectionFlPf.html
Shouldn't the SEOmoz bot follow the canonical directive and not report those two pages as duplicates?
Thank you for any insights I am probably missing here!
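For anyone who wants to check which canonical URL a page actually declares before comparing it against a crawl report, here is a minimal Python sketch using only the standard library. The `SAMPLE_HTML` markup is a hypothetical stand-in written for illustration, not copied from the live pages, and `CanonicalFinder` / `find_canonical` are names made up for this example.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Only <link> tags matter, and only the first canonical is kept.
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens, in any letter case.
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.canonical = attr.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the declared canonical URL, or None if the page has none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical markup for the "?tab=mp3" variant (an assumption, not the real page source).
SAMPLE_HTML = """
<html><head>
<title>Example product page</title>
<link rel="canonical" href="http://www.virtualsheetmusic.com/score/PatrickCollectionFlPf.html" />
</head><body></body></html>
"""

print(find_canonical(SAMPLE_HTML))
```

Running the same function over the HTML of each flagged URL shows whether the variants really point to the same canonical page or to different ones.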
-
Thank you Peter, I got your ticket reply.
That makes perfect sense, and it matches what Dr. Peter pointed out on a different thread where I was discussing this issue further:
http://www.seomoz.org/q/why-seomoz-bot-consider-these-as-duplicate-pages
I was confused by your report.
Thank you again for your help. I hope you will improve your report interface to avoid this kind of confusion in the future.
Best,
Fabrizio
-
Hi there,
Thanks for reaching out to us. I replied to you in a support ticket, but I wanted to share the answer with everyone since I think it is relevant to this discussion.
I looked into your campaign, and it seems that this is happening because of where your canonical tags are pointing. You can see the duplicate pages by clicking on the number to the right side of the link. These pages are considered duplicates because their canonical tags point to different URLs. For example:
http://www.virtualsheetmusic.com/score/PatrickCollectionFlPf.html?tab=mp3 (Duplicate 1) is considered a duplicate of
http://www.virtualsheetmusic.com/score/PatrickCollectionVcPf.html?tab=mp3 (Duplicate 2) because the canonical tag for the first page is CANON1 (http://screencast.com/t/tqvDZrLsyz8D) while the canonical for the second URL is CANON2 (http://screencast.com/t/FOguPJmK0).
Since the canonical tags point to different pages, it is assumed that CANON1 and CANON2 are likely to be duplicates themselves.
Here is how our system interprets duplicate content vs. rel canonical:
Assuming A, B, C, and D are all duplicates,
If A references B as the canonical, then they are not considered duplicates
If A and B both reference C as canonical, A and B are not considered duplicates of each other
If A references C as canonical, A and B are considered duplicates
If A references C as canonical, B references D, then A and B are considered duplicates
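The four cases above can be sketched in Python as follows. This is only an illustration of the rules as listed, not the crawler's actual code: the function name `flagged_as_duplicates` and the `canonical` mapping are hypothetical, and the sketch assumes that in the third case B declares no canonical tag at all.

```python
from typing import Dict, Optional


def flagged_as_duplicates(a: str, b: str, canonical: Dict[str, Optional[str]]) -> bool:
    """Decide whether two near-identical pages a and b would be reported as duplicates.

    `canonical` maps a page to the URL its rel=canonical tag points to
    (a missing key or None means the page declares no canonical).
    """
    canon_a = canonical.get(a)
    canon_b = canonical.get(b)

    # Case 1: one page names the other as its canonical.
    if canon_a == b or canon_b == a:
        return False

    # Case 2: both pages name the same third page as canonical.
    if canon_a is not None and canon_a == canon_b:
        return False

    # Cases 3 and 4: only one page declares a canonical (assumed reading of case 3),
    # or the two pages point to different canonicals.
    return True


print(flagged_as_duplicates("A", "B", {"A": "B"}))            # case 1
print(flagged_as_duplicates("A", "B", {"A": "C", "B": "C"}))  # case 2
print(flagged_as_duplicates("A", "B", {"A": "C"}))            # case 3
print(flagged_as_duplicates("A", "B", {"A": "C", "B": "D"}))  # case 4
```

Under this reading, two pages escape the report only when their canonical tags resolve to the same URL or when one points directly at the other.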
The examples you've provided actually fall into the fourth example I've listed above.
Hope that helps,
Best,
Peter
SEOmoz Help Team.
-
Thinking about this further, I don't see how these pages can be considered near-duplicates, since their content is quite different:
http://www.virtualsheetmusic.com/score/PatrickCollectionFlPf.html?tab=mp3
http://www.virtualsheetmusic.com/score/PatrickCollectionFlPf.html?tab=pdf
Thoughts??!!
-
Can nobody tell me why SEOmoz ignores my canonical tag definitions? According to some comments on the following thread:
http://www.seomoz.org/blog/visualizing-duplicate-web-pages
It should actually ignore pages with a canonical tag and NOT mark them as duplicate, but in my experience (as explained above), that's not been the case.
-
OK, thank you, now I get the point. Then here is my next question: is there a way to tell the SEOmoz bot to ignore duplicate pages that have a defined canonical tag? If not, the SEOmoz duplicate page report is useless for me. I am not interested in hearing about duplicate pages for which I have already defined a canonical tag.
Thanks!
-
Canonical lets you pick which of the duplicates will be indexed. But Google still has to crawl the other pages when they could be crawling other parts of your site. It's an opportunity cost. If you can accept slower crawls, you can ignore the issue.
-
I am sorry, but I don't understand your point. If two pages are similar, we can use the canonical tag to "consolidate" them and avoid duplicate issues. Am I right? Or what are canonical tags for?
-
While I agree that SEOmoz should better categorize duplicates that carry a canonical tag, the reason they still report them as duplicates is crawl budget. Remember, Google still has to crawl these duplicate pages when it could be crawling something else instead. Canonical only helps by letting you pick which duplicate gets indexed. It's better to have no duplicate content than to have canonicalized duplicates.