Why is Google Reporting big increase in duplicate content after Canonicalization update?
-
Our web hosting company recently applied a update to our site that should have rectified Canonicalized URLs. Webmaster tools had been reporting duplicate content on pages that had a query string on the end.
After the update there has been a massive jump in Webmaster tools reporting now over 800 pages of duplicate content, Up from about 100 prior to the update plus it reporting some very odd pages (see attached image)
They claim they have implement Canonicalization in line with Google Panda & Penguin, but surely something is not right here and it's going to cause us a big problem with traffic.
Can anyone shed any light on the situation???
-
Hi All,
I finally got to the bottom of the problem and it is that they have not applied canonicalization across the site, only to certain pages which is not my understanding when they implemented the update a few weeks back.
So they are preparing a hot fix as part of a service pack to our site which will rectify this issue and apply canonicalization to all pages that contain query strings. This should clear that problem up once and for all.
Thank you both for your input, a great help.
-
Hi Deb... I have nice blogpost from seomoz blog for you written by Lindsey in which she has explained it very nicely about it.
http://www.seomoz.org/blog/serious-robotstxt-misuse-high-impact-solutions
In this post check the example of digg.com. Digg.com has blocked "submit" in robots.txt but still Google has indexed URLs. Check screenshot in the Blog post. Hope this help.
-
_Those URLs will be crawled by Google, but will not be Indexed. And that being said, there will be no more duplicate content issue. I hope I have made myself clear over here. _
-
Deb, even if you block those URLs in Robots.txt, Google will going to index those URLs because those URLs are interlink with website. The best way is to put canonical tag so that you will get inter linking benefits as well.
-
Fraser,
Till now they have not implemented Canonicalization in your website. After Canonicalization implementation also you will duplication errors in your webmaster account but it will not harm your ranking. Because Canonicalization helps Google in selecting the page from multiple version of similar page that has to displayed in SERP. In above example, First URL is the original URL but the second URL has some parameters in URLs so your preferred version of URL should be first one. After proper Canonicalization implementation you will only see URLs that you have submitted in your sitemap via Google Webmaster Tool.
And about two webmaster codes, I don't think we have setup two separate accounts, you can provide view or admin access from your webmaster account to them.
-
Either you will have to block these pages via Google Webmaster Tools by Using URL parameter or else you need to block them via robots.txt file like this –
To block this URL: http://www.towelsrus.co.uk/towels/baby-towels/prodlist_ct493.htm?dir=1&size=100
You need to use this tag in robots.txt file – Disallow: /.htm?dir=
-
Hi,
Here are a couple of examples for you.
Duplication issue is showing because of below type of URLs:
http://www.towelsrus.co.uk/towels/baby-towels/prodlist_ct493.htm
http://www.towelsrus.co.uk/towels/baby-towels/prodlist_ct493.htm?dir=1&size=100 ```
-
The Canonical URL updates were supposed to have been implement some weeks back.
I have asked why there are 2 webmaster tools codes, I expect this is my account plus they have one to monitor things there end.
Query string parameters have been setup, but I am unsure if they are configured correctly as this is all a bit new to me and i am in there hands to deal with this really.
The URLs without query strings are submitted to Webmaster tools via site maps and they are the URLs we want indexed.
-
Can you please share the URL and some example pages where the problem of duplicate content is appearing?
-
Hi Fraser,
Are you talking about towelsrus.co.uk ? I didn't find any canonical tag in any source page of your website. Are they sure about implementation ? or they will implement it in future. And one more interesting point, why there are two webmaster code in your website's source page. Below are those to webmaster codes:
<meta name="<a class="attribute-value">google-site-verification</a>" content="<a class="attribute-value">BJ6cDrRRB2iS4fMx2zkZTouKTPTpECs2tw-3OAvIgh4</a>" />
<meta name="<a class="attribute-value">google-site-verification</a>" content="<a class="attribute-value">SjaHRLJh00aeQY9xJ81lorL_07UXcCDFgDFgG8lBqCk</a>" />
Have you blocked querystring parameters in "URL parameters" in Google webmaster
Tools ?
Duplication issue is showing because of below type of URLs:
http://www.towelsrus.co.uk/towels/baby-towels/prodlist_ct493.htm
http://www.towelsrus.co.uk/towels/baby-towels/prodlist_ct493.htm?dir=1&size=100
No canonical tag found on above URLs as well.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Upper and lower case URLS coming up as duplicate content
Hey guys and gals, I'm having a frustrating time with an issue. Our site has around 10 pages that are coming up as duplicate content/ duplicate title. I'm not sure what I can do to fix this. I was going to attempt to 301 direct the upper case to lower but I'm worried how this will affect our SEO. can anyone offer some insight on what I should be doing? Update: What I'm trying to figure out is what I should do for our URL's. For example, when I run an audit I'm getting two different pages: aaa.com/BusinessAgreement.com and also aaa.com/businessagreement.com. We don't have two pages but for some reason, Google thinks we do.
Intermediate & Advanced SEO | | davidmac1 -
Content update on 24hr schedule
Hello! I have a website with over 1300 landings pages for specific products. These individual pages update on a 24hr cycle through out API. Our API pulls reviews/ratings from other sources and then writes/updates that content onto the page. Is that 'bad"? Can that be viewed as spammy or dangerous in the eyes of google? (My first thought is no, its fine) Is there such a thing as "too much content". For example if we are adding roughly 20 articles to our site a week, is that ok? (I know news websites add much more than that on a daily basis but I just figured I would ask) On that note, would it be better to stagger our posting? For example 20 articles each week for a total of 80 articles, or 80 articles once a month? (I feel like trickle posting is probably preferable but I figured I would ask.) Is there any negatives to the process of an API writing/updating content? Should we have 800+ words of static content on each page? Thank you all mozzers!
Intermediate & Advanced SEO | | HashtagHustler0 -
Search console, duplicate content and Moz
Hi, Working on a site that has duplicate content in the following manner: http://domain.com/content
Intermediate & Advanced SEO | | paulneuteboom
http://www.domain.com/content Question: would telling search console to treat one of them as the primary site also stop Moz from seeing this as duplicate content? Thanks in advance, Best, Paul. http0 -
Duplicate Content Dilemma for Category and Brand Pages
Hi, I have a online shop with categories such as: Trousers Shirts Shoes etc. But now I'm having a problem with further development.
Intermediate & Advanced SEO | | soralsokal
I'd like to introduce brand pages. In this case I would create new categories for Brand 1, Brand 2, etc... The text on categories and brand pages would be unique. But there will be an overlap in products. How do I deal with this from a duplicate content perspective? I'm appreciate your suggestions. Best, Robin0 -
I'm updating content that is out of date. What is the best way to handle if I want to keep old content as well?
So here is the situation. I'm working on a site that offers "Best Of" Top 10 list type content. They have a list that ranks very well but is out of date. They'd like to create a new list for 2014, but have the old list exist. Ideally the new list would replace the old list in search results. Here's what I'm thinking, but let me know if you think theres a better way to handle this: Put a "View New List" banner on the old page Make sure all internal links point to the new page Rel=canonical tag on the old list pointing to the new list Does this seem like a reasonable way to handle this?
Intermediate & Advanced SEO | | jim_shook0 -
Duplicate on page content - Product descriptions - Should I Meta NOINDEX?
Hi, Our e-commerce store has a lot of product descriptions duplicated - Some of them are default manufacturer descriptions, some are descriptions because the colour of the product varies - so essentially the same product, just different colour. It is going to take a lot of man hours to get the unique content in place - would a Meta No INDEX on the dupe pages be ok for the moment and then I can lift that once we have unique content in place? I can't 301 or canonicalize these pages, as they are actually individual products in their own right, just dupe descriptions. Thanks, Ben
Intermediate & Advanced SEO | | bjs20101 -
Dynamic 301's causing duplicate content
Hi, wonder if anyone can help? We have just changed our site which was hosted on IIS and the page url's were like this ( example.co.uk/Default.aspx?pagename=About-Us ). The new page url is example.co.uk/About-Us/ and is using Apache. The 301's our developer told us to use was in this format: RewriteCond %{REQUEST_URI} ^/Default.aspx$
Intermediate & Advanced SEO | | GoGroup51
RewriteCond %{QUERY_STRING} ^pagename=About-Us$
RewriteRule ^(.*)$ http://www.domain.co.uk/About-Us/ [R=301,L] This seemed to work from a 301 point of view; however it also seemed to allow both of the below URL's to give the same page! example.co.uk/About-Us/?pagename=About-Us example.co.uk/About-Us/ Webmaster Tools has now picked up on this and is seeing it a duplicate content. Can anyone help why it would be doing this please. I'm not totally clued up and our host/ developer cant understand it too. Many Thanks0 -
Duplicate Content on Wordpress b/c of Pagination
On my recent crawl, there were a great many duplicate content penalties. The site is http://dailyfantasybaseball.org. The issue is: There's only one post per page. Therefore, because of wordpress's (or genesis's) pagination, a page gets created for every post, thereby leaving basically every piece of content i write as a duplicate. I feel like the engines should be smart enough to figure out what's going on, but if not, I will get hammered. What should I do moving forward? Thanks!
Intermediate & Advanced SEO | | Byron_W0