Why is Google reporting a big increase in duplicate content after a canonicalization update?

Towelsrus

Our web hosting company recently applied an update to our site that should have fixed URL canonicalization. Webmaster Tools had been reporting duplicate content on pages that had a query string on the end.

After the update there has been a massive jump: Webmaster Tools now reports over 800 pages of duplicate content, up from about 100 before the update, and it is reporting some very odd pages (see attached image).

They claim they have implemented canonicalization in line with Google Panda & Penguin, but surely something is not right here, and it's going to cause us a big problem with traffic.

Can anyone shed any light on the situation?

Duplicate%20Content.jpg

Towelsrus

Hi All,

I finally got to the bottom of the problem: they have not applied canonicalization across the whole site, only to certain pages, which is not what I understood them to have done when they implemented the update a few weeks back.

So they are preparing a hot fix, as part of a service pack for our site, which will rectify this issue and apply canonicalization to all pages that contain query strings. This should clear the problem up once and for all.
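For anyone wanting to verify a fix like this page by page, here is a minimal sketch of checking whether a page's HTML declares a canonical URL. It assumes you already have the page source as a string (fetching is left out), and the names `CanonicalFinder` and `find_canonical` are made up for illustration.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> element seen."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr_map = {name: (value or "") for name, value in attrs}
        # rel may hold several space-separated tokens, so split before testing.
        if "canonical" in attr_map.get("rel", "").lower().split():
            self.canonical = attr_map.get("href") or None


def find_canonical(html_text: str) -> Optional[str]:
    """Return the canonical URL declared in html_text, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonical


# Hypothetical page sources, used only to exercise the function.
page_with_tag = (
    '<html><head><link rel="canonical" '
    'href="http://www.towelsrus.co.uk/towels/baby-towels/prodlist_ct493.htm" />'
    "</head><body></body></html>"
)
page_without_tag = "<html><head><title>Baby towels</title></head></html>"

print(find_canonical(page_with_tag))
print(find_canonical(page_without_tag))
```

Running this over the source of each query-string URL reported in Webmaster Tools would show which pages the update actually reached.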

Thank you both for your input, a great help.

SanketPatel

Hi Deb, I have a nice blog post from the SEOmoz blog for you, written by Lindsey, in which she explains this very clearly.

http://www.seomoz.org/blog/serious-robotstxt-misuse-high-impact-solutions

In this post, check the digg.com example. Digg.com has blocked "submit" in robots.txt, but Google has still indexed those URLs. Check the screenshot in the blog post. Hope this helps.

Debdulal

_Those URLs will be crawled by Google, but will not be indexed. That being said, there will be no more duplicate content issue. I hope I have made myself clear here._

SanketPatel

Deb, even if you block those URLs in robots.txt, Google is still going to index them, because they are linked internally from the website. The best way is to add a canonical tag, so that you also get the internal linking benefits.

SanketPatel

Fraser,

As of now they have not implemented canonicalization on your website. Even after canonicalization is implemented you will still see duplication errors in your Webmaster Tools account, but they will not harm your ranking, because canonicalization helps Google select which page, out of multiple versions of a similar page, should be displayed in the SERP. In the above example, the first URL is the original and the second URL has parameters appended, so your preferred version should be the first one. After canonicalization is properly implemented you will only see the URLs that you have submitted in your sitemap via Google Webmaster Tools.
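To make the "preferred version" idea concrete, here is a rough sketch of deriving the canonical URL for a parameterised page and the link element that would go in its `<head>`. It assumes that the query parameters (`dir`, `size`) only change presentation and never the content, which is my assumption and not something confirmed for every page on the site; the function names are illustrative.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Drop the query string and fragment, keeping scheme, host and path."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def canonical_link_tag(url: str) -> str:
    """Build the <link rel="canonical"> element for the page served at url."""
    href = escape(canonical_url(url), quote=True)
    return '<link rel="canonical" href="{}" />'.format(href)


original = "http://www.towelsrus.co.uk/towels/baby-towels/prodlist_ct493.htm"
with_params = original + "?dir=1&size=100"

# Both versions of the page should point at the same preferred URL.
print(canonical_link_tag(original))
print(canonical_link_tag(with_params))
```

Both calls produce the same tag, which is the point: every variant of the page tells Google that the URL without parameters is the one to show.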

As for the two webmaster codes, I don't think two separate accounts need to be set up; you can give them view or admin access from your own webmaster account.

Debdulal

Either you will have to block these pages via Google Webmaster Tools using URL parameters, or else you need to block them via the robots.txt file, like this:

To block this URL: http://www.towelsrus.co.uk/towels/baby-towels/prodlist_ct493.htm?dir=1&size=100

You need to use this rule in the robots.txt file: Disallow: /*.htm?dir=
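A small sketch of why the wildcard matters in a rule like this. Google documents that Disallow patterns are matched from the start of the path, with `*` standing for any run of characters and a trailing `$` anchoring the end. The matcher below is my simplified approximation of that behaviour, not Google's implementation, and `rule_matches` is a made-up helper name.

```python
import re
from urllib.parse import urlsplit


def rule_matches(rule: str, url: str) -> bool:
    """Return True if a Disallow pattern matches the URL's path plus query."""
    parts = urlsplit(url)
    target = parts.path + ("?" + parts.query if parts.query else "")
    # A trailing '$' anchors the pattern to the end of the URL.
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    # '*' matches any run of characters; everything else is literal.
    regex = ".*".join(re.escape(piece) for piece in body.split("*"))
    if anchored:
        regex += "$"
    # re.match only matches at the start, like robots.txt prefix matching.
    return re.match(regex, target) is not None


clean = "http://www.towelsrus.co.uk/towels/baby-towels/prodlist_ct493.htm"
dup = clean + "?dir=1&size=100"

print(rule_matches("/.htm?dir=", dup))     # no wildcard: path must start with "/.htm?dir="
print(rule_matches("/*.htm?dir=", dup))    # wildcard covers the directories and file name
print(rule_matches("/*.htm?dir=", clean))  # the clean URL stays crawlable
```

Without the `*`, the rule only covers URLs whose path literally begins with `/.htm?dir=`, so it would leave the parameterised product pages untouched.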

Towelsrus

Hi,

Here are a couple of examples for you.

The duplication issue is showing up because of URLs of the following type:

http://www.towelsrus.co.uk/towels/baby-towels/prodlist_ct493.htm

http://www.towelsrus.co.uk/towels/baby-towels/prodlist_ct493.htm?dir=1&size=100


Towelsrus

Yes, www.towelsrus.co.uk

The canonical URL updates were supposed to have been implemented some weeks back.

I have asked why there are 2 Webmaster Tools codes; I expect one is my account and the other is theirs, to monitor things at their end.

Query string parameters have been set up, but I am unsure whether they are configured correctly, as this is all a bit new to me and I am in their hands to deal with this really.

The URLs without query strings are submitted to Webmaster Tools via sitemaps, and they are the URLs we want indexed.

Debdulal

Can you please share the URL and some example pages where the problem of duplicate content is appearing?

SanketPatel

Hi Fraser,

Are you talking about towelsrus.co.uk? I didn't find a canonical tag in the source of any page on your website. Are they sure about the implementation, or will they implement it in the future? One more interesting point: why are there two webmaster codes in your website's page source? Below are those two webmaster codes:

<meta name="google-site-verification" content="BJ6cDrRRB2iS4fMx2zkZTouKTPTpECs2tw-3OAvIgh4" />

<meta name="google-site-verification" content="SjaHRLJh00aeQY9xJ81lorL_07UXcCDFgDFgG8lBqCk" />

Have you blocked query string parameters under "URL parameters" in Google Webmaster Tools?

The duplication issue is showing up because of URLs of the following type:

http://www.towelsrus.co.uk/towels/baby-towels/prodlist_ct493.htm

http://www.towelsrus.co.uk/towels/baby-towels/prodlist_ct493.htm?dir=1&size=100

No canonical tag was found on the above URLs either.
