Why does the SEOmoz bot see duplicate pages even though I am using the canonical tag?
-
Hello here,
Today the SEOmoz bot flagged the following pages on my website as "duplicate content":
http://www.virtualsheetmusic.com/score/PatrickCollectionFlPf.html?tab=mp3
http://www.virtualsheetmusic.com/score/PatrickCollectionFlPf.html?tab=pdf
I am wondering why, considering that both of those pages use a canonical tag pointing to the main product page below:
http://www.virtualsheetmusic.com/score/PatrickCollectionFlPf.html
Shouldn't the SEOmoz bot follow the canonical directive and not report those two pages as duplicates?
Thank you for any insights I am probably missing here!
-
Thank you Peter, I got your ticket reply.
That makes perfect sense, as Dr. Peter also pointed out on a different thread where I was discussing this issue further:
http://www.seomoz.org/q/why-seomoz-bot-consider-these-as-duplicate-pages
I was confused by your report.
Thank you again for your help, and I hope you will improve your report interface to avoid this kind of confusion in the future.
Best,
Fabrizio
-
Hi there,
Thanks for reaching out to us. I replied to you in a support ticket, but I wanted to share it with everyone since I think it might be relevant to this discussion.
I looked into your campaign, and it seems this is happening because of where your canonical tags are pointing. You can see the duplicate pages by clicking on the number to the right of the link. These pages are considered duplicates because their canonical tags point to different URLs. For example:
http://www.virtualsheetmusic.com/score/PatrickCollectionFlPf.html?tab=mp3 (Duplicate 1) is considered a duplicate of
http://www.virtualsheetmusic.com/score/PatrickCollectionVcPf.html?tab=mp3 (Duplicate 2) because the canonical tag for the first page is CANON1 (http://screencast.com/t/tqvDZrLsyz8D) while the canonical for the second URL is CANON2 (http://screencast.com/t/FOguPJmK0).
Since the canonical tags point to different pages, it is assumed that CANON1 and CANON2 are likely to be duplicates themselves.
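To check where two pages' canonical tags actually point, something along these lines could be used. This is only a rough sketch built on Python's standard `html.parser`; the class and function names, the sample markup, and the example.com URLs are all hypothetical and are not taken from the site discussed here or from our crawler.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalParser(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Only <link> tags matter, and only the first canonical is kept.
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr_map.get("href")


def extract_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in the HTML, or None if absent."""
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonical


# Hypothetical markup standing in for two near-identical pages.
PAGE_1 = '<html><head><link rel="canonical" href="http://www.example.com/score/flute.html"></head></html>'
PAGE_2 = '<html><head><link rel="canonical" href="http://www.example.com/score/cello.html"></head></html>'

canon_1 = extract_canonical(PAGE_1)
canon_2 = extract_canonical(PAGE_2)
same_target = canon_1 == canon_2  # different targets in this sample

print(canon_1, canon_2, same_target)
```

If `same_target` is false for two pages whose content is nearly identical, that is the situation described above.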
Here is how our system interprets duplicate content vs. rel canonical:
Assuming A, B, C, and D are all duplicates,
If A references B as the canonical, then they are not considered duplicates
If A and B both reference C as canonical, A and B are not considered duplicates of each other
If A references C as canonical (and B does not), A and B are considered duplicates
If A references C as canonical, B references D, then A and B are considered duplicates
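One way to read the four rules above: treat each page's "effective" canonical as its declared canonical, or the page itself when none is declared, and flag two duplicate-content pages only when those effective targets differ. The sketch below is an illustration of that reading, not the actual crawler code; the function name and the `canonicals` mapping are assumptions made for the example.

```python
from typing import Dict, Optional


def reported_as_duplicates(
    a: str, b: str, canonicals: Dict[str, Optional[str]]
) -> bool:
    """Assume a and b already have duplicate content; decide whether to report them.

    canonicals maps a page to the URL its canonical tag references.
    A page missing from the mapping (or mapped to None) has no canonical tag.
    """

    def target(url: str) -> str:
        # Effective canonical: the declared one, else the page itself.
        return canonicals.get(url) or url

    return target(a) != target(b)


# The four cases from the list above, in order.
rule_1 = reported_as_duplicates("A", "B", {"A": "B"})            # False
rule_2 = reported_as_duplicates("A", "B", {"A": "C", "B": "C"})  # False
rule_3 = reported_as_duplicates("A", "B", {"A": "C"})            # True
rule_4 = reported_as_duplicates("A", "B", {"A": "C", "B": "D"})  # True

print(rule_1, rule_2, rule_3, rule_4)
```

Under this reading, two pages whose canonical tags point at different URLs always land in the last case.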
The examples you've provided actually fall into the fourth case I've listed above.
Hope that helps,
Best,
Peter
SEOmoz Help Team
-
Thinking about this further, I don't see how these pages can be considered near duplicates, since their content is quite different:
http://www.virtualsheetmusic.com/score/PatrickCollectionFlPf.html?tab=mp3
http://www.virtualsheetmusic.com/score/PatrickCollectionFlPf.html?tab=pdf
Thoughts??!!
-
Can nobody tell me why SEOmoz ignores my canonical tag definitions? According to some comments on the following thread:
http://www.seomoz.org/blog/visualizing-duplicate-web-pages
It should actually ignore pages with a canonical tag and NOT mark them as duplicate, but in my experience (as explained above), that's not been the case.
-
OK, thank you, now I get the point. Then here is my next question: is there a way to tell the SEOmoz bot to ignore duplicate pages that have a canonical tag defined? If not, the SEOmoz duplicate page report is useless for me. I am not interested in hearing about duplicate pages for which I have already defined a canonical tag.
Thanks!
-
Canonical lets you pick which of the duplicates will be indexed. But Google still has to crawl the other pages when they could be crawling other parts of your site. It's an opportunity cost. If you can accept slower crawls, you can ignore the issue.
-
I am sorry, but I don't understand your point. If two pages are similar, we can use the canonical tag to "consolidate" them and avoid duplicate issues. Am I right? Or what are canonical tags for?
-
While I agree that SEOmoz should better categorize duplicates that already have a canonical tag, the reason they still report them as duplicates is crawl budget. Remember, Google still has to crawl these duplicate pages when it could be crawling something else instead. Canonical only helps by letting you pick which of the duplicates gets indexed. It's better not to have duplicate content at all than to have canonicalized duplicates.