Duplicate Content Mystery
-
Hi Moz community!
I have an ongoing duplicate content mystery, and I'm hoping someone here can answer my question.
We have an ecommerce site with a variety of product pages and category pages. There are rel canonicals in place, along with parameters in GWT, and there are also URL rewrites.
Here are some scenarios; maybe you can give some insight into what exactly is going on and how to fix it.
All the duplicates look to be coming from category pages specifically.
For example:
This link re-writes:
To:
http://www.incipio.com/cases/tablet-cases/amazon-kindle-cases-sleeves.html
The rel canonical tag looks like this:
<link rel="canonical" href="http://www.incipio.com/cases/tablet-cases/amazon-kindle-cases-sleeves.html" />
The CONTENT is different, but the URLs are the same. It treats the product category view as the same page as the all-products view, even though there is a canonical in there telling it which one is the original. Some of the flagged pages don’t have anything to do with each other.
Take a look:
Link identified as duplicate:
Link this is a duplicate of:
http://www.incipio.com/cases/macbook-cases/macbook-pro-13in-cases.html
Any idea as to what could be happening here?
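For anyone who wants to double-check which canonical each flagged URL actually serves, here is a minimal Python sketch using only the standard library. It assumes you have already saved the page source yourself; the `sample_html` string, the `CanonicalFinder` class, and the `find_canonicals` helper are made up for illustration and are not part of any Moz or Google tool.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel can hold several space-separated values, and may be missing.
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def find_canonicals(html_text):
    """Return every canonical URL declared in the given HTML source."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonicals


# Hypothetical page source, standing in for the HTML saved from one category page.
sample_html = """
<html><head>
<title>Amazon Kindle Cases</title>
<link rel="canonical" href="http://www.incipio.com/cases/tablet-cases/amazon-kindle-cases-sleeves.html" />
</head><body></body></html>
"""

canonicals = find_canonicals(sample_html)
print(canonicals)
```

If two URLs with different content come back with the same canonical, or one page comes back with more than one canonical, that would be worth raising with the developer.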
-
Hi Ishwar,
If you haven't done so yet, it would be best to create your own post. Many people pop in here to help others, and when they see this topic marked as answered they may not look at it. Creating your own post will get the most attention.
-
Hi Nicole,
Okay, so the reason I said it appears something is improperly installed is that a page should, in general, have 1 head tag, 1 title tag, 1 body tag, and 1 document type declaration. Your page has the normal ones you'd expect to see, plus another set.
In the code I posted above you have an iframe, which is basically a tag that says to display information from a different source. In this case it is Google, which is fine, but it should not contain another set of head, title, and body tags along with a document type declaration. Google would never do that. This, along with my years of experience looking at and installing add-ons, leads me to believe that something was installed incorrectly or, at the very least, not coded correctly.
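A quick way to check the "one head, one title, one body, one doctype" rule on a saved copy of the page source is to count the opening tags. The Python sketch below is only a rough check under stated assumptions: it counts raw text matches, so a tag mentioned inside a comment or script would also be counted, and `sample_html` is a made-up stand-in, not the real page.

```python
import re

# Opening-tag patterns for the parts of a page that should normally appear once.
# The \b keeps <head from also matching <header.
STRUCTURAL_PATTERNS = {
    "doctype": r"<!doctype\b",
    "html": r"<html\b",
    "head": r"<head\b",
    "title": r"<title\b",
    "body": r"<body\b",
}


def count_structural_tags(html_text):
    """Count how many times each structural opening tag appears in the source."""
    return {
        name: len(re.findall(pattern, html_text, flags=re.IGNORECASE))
        for name, pattern in STRUCTURAL_PATTERNS.items()
    }


def repeated_tags(html_text):
    """Return the names of structural tags that appear more than once."""
    counts = count_structural_tags(html_text)
    return sorted(name for name, total in counts.items() if total > 1)


# Hypothetical source: a normal page with a second full document nested in an iframe.
sample_html = """<!DOCTYPE html>
<html><head><title>Category page</title>
<iframe id="relay"><!DOCTYPE html><html><head><title></title></head><body></body></html></iframe>
</head><body><header>Shop</header></body></html>
"""

print(count_structural_tags(sample_html))
print(repeated_tags(sample_html))
```

An empty list from `repeated_tags` means each structural tag appears at most once; anything else is a pointer to where a second set is being injected.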
As to the misconfiguration issue, I would look first at how the URL rewrites are being done, as there is no valid reason the first link you posted should rewrite to a URL and serve different content than what is supposed to be there. That tells me the rewrites are being handled incorrectly.
I hope that helps a little,
Don
-
Hello Moz Community!
I am also getting Duplicate Tag Content errors, for example:
http://www.earnmoneywithgoogleadsense.com/tag/blog-post/
http://www.earnmoneywithgoogleadsense.com/tag/effective-blog-post/
The pages are the same. I have 100+ errors on my website, so how can I remove them? Do you have any tutorial on this?
Can I change the canonical URL for all of them at once, or do I need to set it one by one?
-
Hi Donford,
Thanks so much for getting back to me. Great answer! I'd like some clarification here. I did not configure this and if I'm going to talk to the developer, I'd like to have more knowledge to speak to it.
Could you please clarify what you mean when you say:
- It looks like something is installed and configured improperly.
- You have 2 head tags on the page that shows up from the redirect.
- This is actually inside the first head tag complete with a body tag and another doc declaration.
I looked at the example you sent, but I'm not sure what I'm looking at. If you could explain those bullet points in more detail, it would greatly help.
You're the best!
Thanks,
Nicole
-
It looks like something is installed and configured improperly.
You have 2 head tags on the page that shows up from the redirect.
This is actually inside the first head tag complete with a body tag and another doc declaration.
<iframe id="oauth2relay579972146" name="oauth2relay579972146" src="https://accounts.google.com/o/oauth2/postmessageRelay?parent=http%3A%2F%2Fwww.incipio.com#rpctoken=728288212&forcesecure=1" style="width: 1px; height: 1px; position: absolute; top: -100px;" tabindex="-1">
<html><head><title></title>
<meta content="text/html; charset=utf-8" http-equiv="content-type">
<meta content="IE=edge" http-equiv="X-UA-Compatible">
<meta content="width=device-width, initial-scale=1, minimum-scale=1, maximum-scale=1, user-scalable=0" name="viewport">
<script src="https://apis.google.com/js/api.js" type="text/javascript" gapi_processed="true"></script>
<script src="https://oauth.googleusercontent.com/gadgets/js/core:rpc:shindig.random:shindig.sha1.js?c=2" type="text/javascript"></script>
<script src="https://ssl.gstatic.com/accounts/o/3417060037-postmessagerelay.js"></script>
</head><body></body></html>
</iframe>
That looks like an installation issue.
-
Now, the misconfiguration issue would be why the URL rewrites to a page but serves up different content.
-
And lastly, I think even if you fix those issues you're still going to get duplicate content warnings, because you have very thin content on your pages.
-
Example: Page 1 http://www.incipio.com/cases/tablet-cases/amazon-kindle-cases-sleeves/amazon-kindle-fire-hd-6-cases.html
-
Example: Page 2 http://www.incipio.com/cases/tablet-cases/amazon-kindle-cases-sleeves/amazon-kindle-fire-hd-7-cases.html
-
On those 2 pages there is a 1-character difference: 6 instead of 7. All the other content (header & footer) is the same apart from that 1 character. Then, if you go to the actual product pages, you have the exact same issue: the same description to the letter, except for the one number. Yep, you're going to have a duplicate content problem.
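To put a rough number on how close two pages like that are, here is a small Python sketch using `difflib` from the standard library. The two description strings are invented stand-ins for the page copy, and this is not how Moz or Google actually score duplicates; it is just a simple character-level similarity ratio between 0 and 1.

```python
from difflib import SequenceMatcher


def similarity(text_a, text_b):
    """Return a 0..1 ratio of how much of the two texts matches."""
    return SequenceMatcher(None, text_a, text_b).ratio()


# Hypothetical page copy that differs only by the model number.
page_6 = "Shop Amazon Kindle Fire HD 6 cases and sleeves. Free shipping on all orders."
page_7 = "Shop Amazon Kindle Fire HD 7 cases and sleeves. Free shipping on all orders."

score = similarity(page_6, page_7)
print(f"similarity: {score:.3f}")
```

A score very close to 1 for two different URLs is the kind of pair a crawler is likely to flag, whatever the canonical says.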
-
This is something that all e-commerce stores face. You honestly need to write unique content for each and every product you sell. Don't copy & paste stuff from another site like Amazon or the manufacturer's site; write your own content.
-
In summation, I would recheck any modules/add-ons/plug-ins you installed, as one appears to be incorrect. If that doesn't fix the rewrite issue, have a developer who is familiar with your ecommerce platform look at it. Lastly, you've got to have unique content.
-
Maybe not the best news but I hope it helps
-
Don
Edit: put in bullet points to try and make the post look a little better. These forums don't take kindly to adding code blocks.