Craw Diagnostics Questions

niallfred

SEO Moz is reporting that I have 50+ pages with a duplicate content issue based on this URL: http://www. f r e d aldous.co.uk/art-shop/art-supplies/art-canvas.html?manufacturer=178

But I have included this tag in the source: rel="canonical" href="http://www.f r e daldous.co.uk/art-shop/art-supplies/art-canvas.html"/>

(I have purposefully added white space to the URLs in this message as I'm not sure about the rules for posting links here)

I though this "canonical" tag prevented the duplicate content being indexed?

is the reporting by SEOMoz wrong or being over cautious?

Cyrus-Shepard

Hi Niall,

This isn't a case of the canonical tag being properly applied, but a case where two or more pages are so similar in code that they are setting off the SEOmoz duplicate content flags.

First of all, those pages look different to us humans. But the SEOmoz web app uses a similarity threshold of 95% of the html code. This takes everything on the page, both hidden and visible into account.

In this case, it's counting all of the navigation and sidebar as well, which is significant. What's left of the unique content - the part that matters, makes up less than 5% of the code.

Here's a tool you can use to check the similarity: http://www.duplicatecontent.net/

I ran the pages through a couple of tools which showed 98% HTML similarity. And 99% text similarity.

For perspective, take a look at Google's cached versions of one of these pages. This is how googlebot sees the page: http://webcache.googleusercontent.com/search?q=cache:mdybPKIjOxUJ:www.fredaldous.co.uk/craft-shop/general-crafts.html+http://www.fredaldous.co.uk/craft-shop/general-crafts.html&hl=en&gl=us&strip=1

That, as we say, is a lot of links!

Since Panda, when I see a site with this many navigation links, I usually advise them to restructure their site architecture into more of a Pyramid shape, so that you reduce the overall navigation on each page.

Hope this helps! Best of luck with your SEO.

niallfred

It claims that this is one of the duplicate URLS:

http://www.f r e daldous.co.uk/photo-gift/design-led-gifts.html?manufacturer=436

Now I am confused as page is no where near duplicate content of the URL I posted 1st.

Can anyone explain this?

SEOExecutive20

Helo Niall,

It seems that you have inserted the rel="canonical" href= in the correct spot. I think the software is giving you the potentials which is always a bonus precaution. I really don't want to make a premature determination without knowing which 50 pages are showing up as duplicate. A deeper look will allow me to give you a more accurate response.

Welcome to the Q&A Forum

Browse the forum for helpful insights and fresh discussions about all things SEO.

Craw Diagnostics Questions

Got a burning SEO question?

Browse Questions

Explore more categories

Related Questions

Unsolved Question about a Screaming Frog crawling issue

Launching large content project - date-stamp question

Subdomain question for law firm in Indiana, Michigan, and New Mexico.

Question about construction of our sitemap URL in robots.txt file

Crawl Diagnostics Report 500 erorr

Windows IIS 7 Redirect Question

Title Element Too Long Question

Robots.txt question