Why does Crawl Diagnostics report this as duplicate content?
-
Hi guys,
We've been addressing a duplicate content problem on our site over the past few weeks. We've gradually implemented rel canonical tags in various parts of our ecommerce store and have been observing the effects by tracking changes in both SEOmoz and Webmaster Tools.
Although our duplicate content errors are definitely decreasing, I can't help but wonder why some URLs are still being flagged with duplicate content by our SEOmoz crawler.
Here's an example, taken directly from our Crawl Diagnostics Report:
URL with 4 Duplicate Content errors:
/safety-lights.html
Duplicate content URLs:
/safety-lights.html?cat=78&price=-100
/safety-lights.html?cat=78&dir=desc&order=position
/safety-lights.html?cat=78
/safety-lights.html?manufacturer=514
What I don't understand is that all of the URLs with URL parameters have a rel canonical tag pointing to the 'real' URL:
/safety-lights.html
So why is the SEOmoz crawler still flagging this as duplicate content?
-
So glad I could help get this figured out! Sometimes it just takes another set of eyes.
-Chiaryn
-
Good catch Chiaryn! Totally didn't see this.
Essentially, two URLs end up displaying the same content: one is the URL that Google picks up from our XML sitemap, and the other is a dynamic URL with filtering parameters, built on the category URL one level higher.
The canonical tags were set up so that each points to its own base category. In this case those base categories are different URLs, even though the content is the same.
We will address this.
Thanks!
-
Hi there,
I looked into your campaign and it seems that this is happening because of where your canonical tags are pointing. These pages are considered duplicates because their canonical tags point to different URLs. For example, accessories/lights.html?cat=78&price=-100 is considered a duplicate of accessories/lights/safety-lights.html?manufacturer=514 because the canonical tag for the first page is accessories/lights.html while the canonical for the second URL is accessories/lights/safety-lights.html.
Since the canonical tags point to different pages, it is assumed that accessories/lights.html and accessories/lights/safety-lights.html are likely to be duplicates themselves.
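One way to check where each page's canonical tag actually points is to pull it out of the page source. The following is only a rough sketch using Python's standard `html.parser`; the class name `CanonicalFinder`, the helper `find_canonical`, and the sample markup are made up for illustration, and only the URL paths come from this thread.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag seen."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Only <link> tags matter, and only the first canonical is kept.
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated values, or be missing.
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels:
            self.canonical = attr.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in an HTML string, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical markup for the first page mentioned above.
page = '<html><head><link rel="canonical" href="accessories/lights.html"></head></html>'
print(find_canonical(page))  # accessories/lights.html
```

Running this over each flagged URL's source and comparing the results would show whether two pages with the same content resolve to the same canonical or to different ones.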
Here is how our system interprets duplicate content vs. rel canonical:
Assuming A, B, C, and D are all duplicates,
- If A references B as the canonical, then they are not considered duplicates
- If A and B both reference C as canonical, A and B are not considered duplicates of each other
- If A references C as canonical, A and B are considered duplicates
- If A references C as canonical, B references D, then A and B are considered duplicates
The examples you've provided actually fall into the fourth example I've listed above.
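To make the four cases concrete, here is a minimal Python sketch of the rules as listed above. It is not the crawler's actual code: the function name `flagged_as_duplicates` and the dictionary layout are hypothetical, and it assumes that a page with no canonical tag counts as its own canonical (which is how the third case reads, with B carrying no tag).

```python
from typing import Dict, Optional


def flagged_as_duplicates(a: str, b: str, canonical: Dict[str, Optional[str]]) -> bool:
    """Apply the four rules to two pages already known to share the same content.

    `canonical` maps a page URL to its rel canonical target; a missing
    entry or None means the page has no canonical tag.
    Hypothetical helper, not the crawler's real implementation.
    """
    # Assumption: a page without a canonical tag stands for itself.
    target_a = canonical.get(a) or a
    target_b = canonical.get(b) or b
    # The pair is flagged only when the two canonical targets differ.
    return target_a != target_b


# The two URLs from this thread and the canonicals they currently declare.
canonical = {
    "accessories/lights.html?cat=78&price=-100": "accessories/lights.html",
    "accessories/lights/safety-lights.html?manufacturer=514": "accessories/lights/safety-lights.html",
}
print(flagged_as_duplicates(
    "accessories/lights.html?cat=78&price=-100",
    "accessories/lights/safety-lights.html?manufacturer=514",
    canonical,
))  # True
```

Under the same assumptions, pointing both parameterized URLs at one shared canonical (the second case in the list) makes the function return False.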
I hope this clears things up. Please let me know if you have any other questions.
-Chiaryn
-
Does seem a little odd. Could you post the domain so we can have a more detailed look?
Thanks
Iain - Reload Media