Why is Google Reporting big increase in duplicate content after Canonicalization update?
-
Our web hosting company recently applied a update to our site that should have rectified Canonicalized URLs. Webmaster tools had been reporting duplicate content on pages that had a query string on the end.
After the update there has been a massive jump in Webmaster tools reporting now over 800 pages of duplicate content, Up from about 100 prior to the update plus it reporting some very odd pages (see attached image)
They claim they have implement Canonicalization in line with Google Panda & Penguin, but surely something is not right here and it's going to cause us a big problem with traffic.
Can anyone shed any light on the situation???
-
Hi All,
I finally got to the bottom of the problem and it is that they have not applied canonicalization across the site, only to certain pages which is not my understanding when they implemented the update a few weeks back.
So they are preparing a hot fix as part of a service pack to our site which will rectify this issue and apply canonicalization to all pages that contain query strings. This should clear that problem up once and for all.
Thank you both for your input, a great help.
-
Hi Deb... I have nice blogpost from seomoz blog for you written by Lindsey in which she has explained it very nicely about it.
http://www.seomoz.org/blog/serious-robotstxt-misuse-high-impact-solutions
In this post check the example of digg.com. Digg.com has blocked "submit" in robots.txt but still Google has indexed URLs. Check screenshot in the Blog post. Hope this help.
-
_Those URLs will be crawled by Google, but will not be Indexed. And that being said, there will be no more duplicate content issue. I hope I have made myself clear over here. _
-
Deb, even if you block those URLs in Robots.txt, Google will going to index those URLs because those URLs are interlink with website. The best way is to put canonical tag so that you will get inter linking benefits as well.
-
Fraser,
Till now they have not implemented Canonicalization in your website. After Canonicalization implementation also you will duplication errors in your webmaster account but it will not harm your ranking. Because Canonicalization helps Google in selecting the page from multiple version of similar page that has to displayed in SERP. In above example, First URL is the original URL but the second URL has some parameters in URLs so your preferred version of URL should be first one. After proper Canonicalization implementation you will only see URLs that you have submitted in your sitemap via Google Webmaster Tool.
And about two webmaster codes, I don't think we have setup two separate accounts, you can provide view or admin access from your webmaster account to them.
-
Either you will have to block these pages via Google Webmaster Tools by Using URL parameter or else you need to block them via robots.txt file like this –
To block this URL: http://www.towelsrus.co.uk/towels/baby-towels/prodlist_ct493.htm?dir=1&size=100
You need to use this tag in robots.txt file – Disallow: /.htm?dir=
-
Hi,
Here are a couple of examples for you.
Duplication issue is showing because of below type of URLs:
http://www.towelsrus.co.uk/towels/baby-towels/prodlist_ct493.htm
http://www.towelsrus.co.uk/towels/baby-towels/prodlist_ct493.htm?dir=1&size=100 ```
-
The Canonical URL updates were supposed to have been implement some weeks back.
I have asked why there are 2 webmaster tools codes, I expect this is my account plus they have one to monitor things there end.
Query string parameters have been setup, but I am unsure if they are configured correctly as this is all a bit new to me and i am in there hands to deal with this really.
The URLs without query strings are submitted to Webmaster tools via site maps and they are the URLs we want indexed.
-
Can you please share the URL and some example pages where the problem of duplicate content is appearing?
-
Hi Fraser,
Are you talking about towelsrus.co.uk ? I didn't find any canonical tag in any source page of your website. Are they sure about implementation ? or they will implement it in future. And one more interesting point, why there are two webmaster code in your website's source page. Below are those to webmaster codes:
<meta name="<a class="attribute-value">google-site-verification</a>" content="<a class="attribute-value">BJ6cDrRRB2iS4fMx2zkZTouKTPTpECs2tw-3OAvIgh4</a>" />
<meta name="<a class="attribute-value">google-site-verification</a>" content="<a class="attribute-value">SjaHRLJh00aeQY9xJ81lorL_07UXcCDFgDFgG8lBqCk</a>" />
Have you blocked querystring parameters in "URL parameters" in Google webmaster
Tools ?
Duplication issue is showing because of below type of URLs:
http://www.towelsrus.co.uk/towels/baby-towels/prodlist_ct493.htm
http://www.towelsrus.co.uk/towels/baby-towels/prodlist_ct493.htm?dir=1&size=100
No canonical tag found on above URLs as well.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Content Strategy/Duplicate Content Issue, rel=canonical question
Hi Mozzers: We have a client who regularly pays to have high-quality content produced for their company blog. When I say 'high quality' I mean 1000 - 2000 word posts written to a technical audience by a lawyer. We recently found out that, prior to the content going on their blog, they're shipping it off to two syndication sites, both of which slap rel=canonical on them. By the time the content makes it to the blog, it has probably appeared in two other places. What are some thoughts about how 'awful' a practice this is? Of course, I'm arguing to them that the ranking of the content on their blog is bound to be suffering and that, at least, they should post to their own site first and, if at all, only post to other sites several weeks out. Does anyone have deeper thinking about this?
Intermediate & Advanced SEO | | Daaveey0 -
Duplicate content on recruitment website
Hi everyone, It seems that Panda 4.2 has hit some industries more than others. I just started working on a website, that has no manual action, but the organic traffic has dropped massively in the last few months. Their external linking profile seems to be fine, but I suspect usability issues, especially the duplication may be the reason. The website is a recruitment website in a specific industry only. However, they posts jobs for their clients, that can be very similar, and in the same time they can have 20 jobs with the same title and very similar job descriptions. The website currently have over 200 pages with potential duplicate content. Additionally, these jobs get posted on job portals, with the same content (Happens automatically through a feed). The questions here are: How bad would this be for the website usability, and would it be the reason the traffic went down? Is this the affect of Panda 4.2 that is still rolling What can be done to resolve these issues? Thank you in advance.
Intermediate & Advanced SEO | | iQi0 -
Penalized for Similar, But Not Duplicate, Content?
I have multiple product landing pages that feature very similar, but not duplicate, content and am wondering if this would affect my rankings in a negative way. The main reason for the similar content is three-fold: Continuity of site structure across different products Similar, or the same, product add-ons or support options (resulting in exactly the same additional tabs of content) The product itself is very similar with 3-4 key differences. Three examples of these similar pages are here - although I do have different meta-data and keyword optimization through the pages. http://www.1099pro.com/prod1099pro.asp http://www.1099pro.com/prod1099proEnt.asp http://www.1099pro.com/prodW2pro.asp
Intermediate & Advanced SEO | | Stew2220 -
Duplicate content across internation urls
We have a large site with 1,000+ pages of content to launch in the UK. Much of this content is already being used on a .nz url which is going to stay. Do you see this as an issue or do you thin Google will take localised factoring into consideration. We could add a link from the NZ pages to the UK. We cant noindex the pages as this is not an option. Thanks
Intermediate & Advanced SEO | | jazavide0 -
Duplicate Content / 301 redirect Ariticle issue
Hello, We've got some articles floating around on our site nlpca(dot)com like this article: http://www.nlpca.com/what-is-dynamic-spin-release.html that's is not linked to from anywhere else. The article exists how it's supposed to be here: http://www.dynamicspinrelease.com/what-is-dsr/ (our other website) Would it be safe in eyes of both google's algorithm (as much as you know) and with Panda to just 301 redirect from http://www.nlpca.com/what-is-dynamic-spin-release.html to http://www.dynamicspinrelease.com/what-is-dsr/ or would no-indexing be better? Thank you!
Intermediate & Advanced SEO | | BobGW0 -
Cross-Domain Canonical and duplicate content
Hi Mozfans! I'm working on seo for one of my new clients and it's a job site (i call the site: Site A).
Intermediate & Advanced SEO | | MaartenvandenBos
The thing is that the client has about 3 sites with the same Jobs on it. I'm pointing a duplicate content problem, only the thing is the jobs on the other sites must stay there. So the client doesn't want to remove them. There is a other (non ranking) reason why. Can i solve the duplicate content problem with a cross-domain canonical?
The client wants to rank well with the site i'm working on (Site A). Thanks! Rand did a whiteboard friday about Cross-Domain Canonical
http://www.seomoz.org/blog/cross-domain-canonical-the-new-301-whiteboard-friday0 -
I have a duplicate content problem
The website guy that made the website for my business Premier Martial Arts Austin disappeared and didn't set up that www. was to begin each URL, so I now have a duplicate content problem and don't want to be penalized for it. I tried to show in Webmaster tools the preferred setup but can't get it to OK that I'm the website owner. Any idea as what to do?
Intermediate & Advanced SEO | | OhYeahSteve0