Why does SEOmoz bot see duplicate pages despite I am using the canonical tag?
-
Hello here,
today SEOmoz bot found and marked as "duplicate content" the following pages on my website:
http://www.virtualsheetmusic.com/score/PatrickCollectionFlPf.html?tab=mp3
http://www.virtualsheetmusic.com/score/PatrickCollectionFlPf.html?tab=pdf
And I am wondering why considering the fact I am using on both those pages a canonical tag pointing to the main product page below:
http://www.virtualsheetmusic.com/score/PatrickCollectionFlPf.html
Shouldn't SEOmoz bot follow the canonical directive and not report those two pages as duplicate?
Thank you for any insights I am probably missing here!
-
Thank you Peter, I got your ticket reply.
That makes perfect sense, and as Dr. Peter pointed out on a different thread:
http://www.seomoz.org/q/why-seomoz-bot-consider-these-as-duplicate-pages
I was discussing this issue further, I was confused by your report.
Thank you again for your help and I hope you will improve your report interface to avoid such confusion related issues in the future.
Best,
Fabrizio
-
Hi there,
Thanks for reaching out to us, I replied to you in a support ticket, but I just wanted to share it everyone since I think it might be relevant to this discussion.
I looked into your campaign and it seems that this is happening because of where your canonical tags are pointing, you can see the duplicate pages by clicking on the number to the right side of the link. These pages are considered duplicates because their canonical tags point to different URLs. For example:
http://www.virtualsheetmusic.com/score/PatrickCollectionFlPf.html?tab=mp3(Duplicate 1) is considered a duplicate of
http://www.virtualsheetmusic.com/score/PatrickCollectionVcPf.html?tab=mp3 (Duplicate 2)because the canonical tag for the first page is CANON1(http://screencast.com/t/tqvDZrLsyz8D) while the canonical for the second URL is CANON2 (http://screencast.com/t/FOguPJmK0).
Since the canonical tags point to different pages it is assumed that CANON1 and CANON2 are likely to be duplicates themselves.
Here is how our system interprets duplicate content vs. rel canonical:
Assuming A, B, C, and D are all duplicates,
If A references B as the canonical, then they are not considered duplicates
If A and B both reference C as canonical, A and B are not considered duplicates of each other
If A references C as a canonical, A and B are considered duplicated
If A references C as canonical, B references D, then A and B are considered duplicates
The examples you've provided actually fall into the fourth example I've listed above.Hope that helps,
Best,
Peter
SEOmoz Help Team. -
Thinking furthermore, I don't see how these pages can be considered nearly duplicate since their content is quite different:
http://www.virtualsheetmusic.com/score/PatrickCollectionFlPf.html?tab=mp3
http://www.virtualsheetmusic.com/score/PatrickCollectionFlPf.html?tab=pdf
Thoughts??!!
-
Nobody can tell me why SEOmoz ignore my canonical tag definitions? According to some comments on the following thread:
http://www.seomoz.org/blog/visualizing-duplicate-web-pages
It should actually ignore pages with a canonical tag and NOT mark them as duplicate, but in my experience (as explained above), that's not been the case.
-
Ok, thank you, now I get the point... then here is my next question: is there a way to tell SEOmoz bot to ignore duplicate page with a defined canonical tag? If not, the SEOmoz duplicate page report is useless for me. I am not interested to know about duplicate page for which I have already defined a canonical tag for.
Thanks!
-
Canonical lets you pick which of the duplicates will be indexed. But Google still has to crawl the other pages when they could be crawling other parts of your site. It's an opportunity cost. If you can accept slower crawls, you can ignore the issue.
-
I am sorry, but I don't understand your point. If two pages are similar, we can use the canonical tag to "consolidate" them and avoid duplicate issues. Am I right? Or what are canonical tags for?
-
While I agree that SEOMOZ should better categorize duplicates that are canonical, the reason they still tell you it's duplicate is crawl budget. Remember, Google still has to crawl these duplicate pages and they could be crawling something else instead. Canonical only helps by letting you pick which duplicate content gets indexed. It's better to not have duplicate content than to have canonical duplicates.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Do I need to use a trailing slash to homepage in canonical and hreflang?
Currently I have a 301 redirect from
Intermediate & Advanced SEO | | lcourse
https://www.mysite.com/
to
https://www.mysite.com And in my canonical and hreflang and also insite links I use consistently https://www.mysite.com without trailing slash. Is this OK? Or do I need to add a trailing slash?0 -
Duplicate Pages #!
Hi guys, Currently have duplicate pages accross a website e.g. https://archierose.com.au/shop/cart**#!** https://archierose.com.au/shop/cart The only difference is the URL 1 has a hashtag and exclamation tag. Everything else is the same. We were thinking of adding rel canonical tags on the #! versions of the page to the correct URLs. But Google doens't seem to be indexing the #! versions anyway. Does anyone know why this is the case? If Google is not indexing them, is there any point adding rel canonical tags? Cheers, Chris https://archierose.com.au/shop/cart#!
Intermediate & Advanced SEO | | jayoliverwright0 -
Parameter Strings & Duplicate Page Content
I'm managing a site that has thousands of pages due to all of the dynamic parameter strings that are being generated. It's a real estate listing site that allows people to create a listing, and is generating lots of new listings everyday. The Moz crawl report is continually flagging A LOT (25k+) of the site pages for duplicate content due to all of these parameter string URLs. Example: sitename.com/listings & sitename.com/listings/?addr=street name Do I really need to do anything about those pages? I have researched the topic quite a bit, but can't seem to find anything too concrete as to what the best course of action is. My original thinking was to add the rel=canonical tag to each of the main URLs that have parameters attached. I have also read that you can bypass that by telling Google what parameters to ignore in Webmaster tools. We want these listings to show up in search results, though, so I don't know if either of these options is ideal, since each would cause the listing pages (pages with parameter strings) to stop being indexed, right? Which is why I'm wondering if doing nothing at all will hurt the site? I should also mention that I originally recommend the rel=canonical option to the web developer, who has pushed back in saying that "search engines ignore parameter strings." Naturally, he doesn't want the extra work load of setting up the canonical tags, which I can understand, but I want to make sure I'm both giving him the most feasible option for implementation as well as the best option to fix the issues.
Intermediate & Advanced SEO | | garrettkite0 -
Using Canonical URL to poin to an external page
I was wondering if I can use a canonical URL that points to a page residing on external site? So a page like:
Intermediate & Advanced SEO | | llamb
www.site1.com/whatever.html will have a canonical link in its header to www.site2.com/whatever.html. Thanks.0 -
Penalized for Duplicate Page Content?
I have some high priority notices regarding duplicate page content on my website www.3000doorhangers.com Most of the pages listed here are on our sample pages: http://www.3000doorhangers.com/home/door-hanger-pricing/door-hanger-design-samples/ On the left side of our page you can go through the different categories. Most of the category pages have similar text. We mainly just changed the industry on each page. Is this something that google would penalize us for? Should I go through all the pages and use completely unique text for each page? Any suggestions would be helpful Thanks! Andrea
Intermediate & Advanced SEO | | JimDirectMailCoach0 -
Canonical use when dynamically placing items on "all products" page
Hi all, We're trying to get our canonical situation straightened out. We have a section of our site with 100 product pages in it (in our case a city with hotels that we've reviewed), and we have a single page where we list them all out--an "all products" page called "all.html." However, because we have 100 and that's a lot for a user to see at once, we plan to first show only 50 on "all.html." When the user scrolls down to the bottom, we use AJAX to place another 50 on the page (these come from another page called "more.html" and are placed onto "all.html"). So, as you scroll down from the front end, you see "all.html" with 100 listings. We have other listings pages that are sorted and filtered subsets of this list with little or no unique content. Thus, we want to place a canonical on those pages. Question: Should the canonical point to "all.html"? Would spiders get confused, because they see that all.html is only half the listings? Is it dangerous to dynamically place content on a page that's used as a canonical? Is this a non-issue? Thanks, Tom
Intermediate & Advanced SEO | | TomNYC0 -
Does rel=canonical fix duplicate page titles?
I implemented rel=canonical on our pages which helped a lot, but my latest Moz crawl is still showing lots of duplicate page titles (2,000+). There are other ways to get to this page (depending on what feature you clicked, it will have a different URL) but will have the same page title. Does having rel=canonical in place fix the duplicate page title problem, or do I need to change something else? I was under the impression that the canonical tag would address this by telling the crawler which URL was the URL and the crawler would only use that one for the page title.
Intermediate & Advanced SEO | | askotzko0 -
Why duplicate content for same page?
Hi, My SEOMOZ crawl diagnostic warn me about duplicate content. However, to me the content is not duplicated. For instance it would give me something like: (URLs/Internal Links/External Links/Page Authority/Linking Root Domains) http://www.nuxeo.com/en/about/contact?utm_source=enews&utm_medium=email&utm_campaign=enews20110516 /1/1/31/2 http://www.nuxeo.com/en/about/contact?utm_source=enews&utm_medium=email&utm_campaign=enews20110711 0/0/1/0 http://www.nuxeo.com/en/about/contact?utm_source=enews&utm_medium=email&utm_campaign=enews20110811 0/0/1/0 http://www.nuxeo.com/en/about/contact?utm_source=enews&utm_medium=email&utm_campaign=enews20110911 0/0/1/0 Why is this seen as duplicate content when it is only URL with campaign tracking codes to the same content? Do I need to clean this?Thanks for answer
Intermediate & Advanced SEO | | nuxeo0