Why does the SEOmoz bot see duplicate pages even though I am using the canonical tag?
-
Hello here,
today SEOmoz bot found and marked as "duplicate content" the following pages on my website:
http://www.virtualsheetmusic.com/score/PatrickCollectionFlPf.html?tab=mp3
http://www.virtualsheetmusic.com/score/PatrickCollectionFlPf.html?tab=pdf
I am wondering why, given that both of those pages have a canonical tag pointing to the main product page below:
http://www.virtualsheetmusic.com/score/PatrickCollectionFlPf.html
Shouldn't the SEOmoz bot follow the canonical directive and not report those two pages as duplicates?
Thank you for any insights I am probably missing here!
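For anyone checking the same thing on their own pages, here is a minimal sketch of reading the canonical URL a page declares, using only Python's standard library. The `CanonicalFinder` class, the `find_canonical` helper, and the sample markup are illustrative assumptions, not the actual source of the pages above.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Only <link> elements matter, and only the first canonical is kept.
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values, or be missing entirely.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical markup for the ?tab=mp3 page; the real page source may differ.
sample = """<html><head>
<link rel="canonical" href="http://www.virtualsheetmusic.com/score/PatrickCollectionFlPf.html">
</head><body></body></html>"""

print(find_canonical(sample))
```

Running this against each URL in the report shows which target every page actually resolves to, which is the detail the duplicate report depends on.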
-
Thank you Peter, I got your ticket reply.
That makes perfect sense, and Dr. Peter made the same point on a different thread where I was discussing this issue further:
http://www.seomoz.org/q/why-seomoz-bot-consider-these-as-duplicate-pages
I was confused by your report.
Thank you again for your help. I hope you will improve the report interface to avoid this kind of confusion in the future.
Best,
Fabrizio
-
Hi there,
Thanks for reaching out to us. I replied to you in a support ticket, but I wanted to share it with everyone since I think it might be relevant to this discussion.
I looked into your campaign, and it seems this is happening because of where your canonical tags are pointing. You can see the duplicate pages by clicking on the number to the right side of the link. These pages are considered duplicates because their canonical tags point to different URLs. For example:
http://www.virtualsheetmusic.com/score/PatrickCollectionFlPf.html?tab=mp3 (Duplicate 1) is considered a duplicate of
http://www.virtualsheetmusic.com/score/PatrickCollectionVcPf.html?tab=mp3 (Duplicate 2) because the canonical tag for the first page is CANON1 (http://screencast.com/t/tqvDZrLsyz8D) while the canonical for the second URL is CANON2 (http://screencast.com/t/FOguPJmK0).
Since the canonical tags point to different pages, it is assumed that CANON1 and CANON2 are likely to be duplicates themselves.
Here is how our system interprets duplicate content vs. rel canonical:
Assuming A, B, C, and D are all duplicates,
If A references B as the canonical, then they are not considered duplicates
If A and B both reference C as canonical, A and B are not considered duplicates of each other
If A references C as canonical, A and B are considered duplicates
If A references C as canonical, B references D, then A and B are considered duplicates
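The four rules above can be sketched in a few lines of Python. This is only one reading of the rules as stated here, not the crawler's actual code; the `resolve` and `reported_as_duplicates` names are hypothetical, and treating a page without a canonical tag as its own canonical target is an assumption made to cover the third rule.

```python
def resolve(url, canonical):
    """Return the URL a page's canonical tag points to.

    Assumption: a page with no canonical tag counts as its own target.
    """
    return canonical.get(url, url)


def reported_as_duplicates(a, b, canonical):
    """Decide whether two pages with duplicate content are still reported.

    canonical maps a page to the URL named in its canonical tag; pages
    without a tag are simply absent from the mapping.
    """
    # Same resolved target: the pair is not reported (rules 1 and 2).
    # Different resolved targets: the pair is reported (rules 3 and 4).
    return resolve(a, canonical) != resolve(b, canonical)


# Rule 1: A references B as canonical.
print(reported_as_duplicates("A", "B", {"A": "B"}))
# Rule 2: A and B both reference C.
print(reported_as_duplicates("A", "B", {"A": "C", "B": "C"}))
# Rule 3: only A references C.
print(reported_as_duplicates("A", "B", {"A": "C"}))
# Rule 4: A references C, B references D.
print(reported_as_duplicates("A", "B", {"A": "C", "B": "D"}))
```

Under this reading, a pair is flagged whenever the two pages do not end up at the same canonical URL.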
The examples you've provided actually fall into the fourth case I've listed above.
Hope that helps,
Best,
Peter
SEOmoz Help Team
-
Thinking about this further, I don't see how these pages can be considered near duplicates, since their content is quite different:
http://www.virtualsheetmusic.com/score/PatrickCollectionFlPf.html?tab=mp3
http://www.virtualsheetmusic.com/score/PatrickCollectionFlPf.html?tab=pdf
Thoughts??!!
-
Can nobody tell me why SEOmoz ignores my canonical tag definitions? According to some comments on the following thread:
http://www.seomoz.org/blog/visualizing-duplicate-web-pages
It should actually ignore pages with a canonical tag and NOT mark them as duplicates, but in my experience (as explained above), that has not been the case.
-
OK, thank you, now I get the point. Here is my next question: is there a way to tell the SEOmoz bot to ignore duplicate pages with a defined canonical tag? If not, the SEOmoz duplicate page report is useless to me. I am not interested in knowing about duplicate pages for which I have already defined a canonical tag.
Thanks!
-
Canonical lets you pick which of the duplicates will be indexed. But Google still has to crawl the other pages when they could be crawling other parts of your site. It's an opportunity cost. If you can accept slower crawls, you can ignore the issue.
-
I am sorry, but I don't understand your point. If two pages are similar, we can use the canonical tag to "consolidate" them and avoid duplicate issues. Am I right? Or what are canonical tags for?
-
While I agree that SEOmoz should better categorize duplicates that have a canonical tag, the reason they still report them as duplicates is crawl budget. Remember, Google still has to crawl these duplicate pages when it could be crawling something else instead. Canonical only helps by letting you pick which duplicate gets indexed. It's better to have no duplicate content than to have canonicalized duplicates.