Duplicate Content Mystery
-
Hi Moz community!
I have an ongoing duplicate content mystery and I'm hoping someone here can help.
We have an e-commerce site with a variety of product pages and category pages. There are rel canonicals in place, along with URL parameter settings in GWT, and there are also URL rewrites.
Here are some scenarios; maybe you can give some insight into exactly what's going on and how to fix it.
All the duplicates look to be coming from category pages specifically.
For example:
This link re-writes:To:
http://www.incipio.com/cases/tablet-cases/amazon-kindle-cases-sleeves.html
The rel canonical tag looks like this:
<link rel="canonical" href="http://www.incipio.com/cases/tablet-cases/amazon-kindle-cases-sleeves.html" />
The CONTENT is different, but the URLs are the same. The report thinks the product category view is the same as the all products view, even though there is a canonical in there telling it which one is the original. Some of the flagged pages don't have anything to do with each other.
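One way to confirm which canonical a given page actually declares is to parse its HTML source and list every rel canonical link. The following is only a minimal sketch using Python's standard library; the class name, function name, and sample page are hypothetical and not taken from the site itself.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel can hold several space-separated values, so split before checking.
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def find_canonicals(html_source):
    """Return all canonical URLs declared in the given HTML source."""
    finder = CanonicalFinder()
    finder.feed(html_source)
    return finder.canonicals


# Hypothetical page source, for illustration only.
sample_page = """
<html><head>
<title>Kindle Cases</title>
<link rel="canonical" href="http://www.incipio.com/cases/tablet-cases/amazon-kindle-cases-sleeves.html" />
</head><body></body></html>
"""

canonicals = find_canonicals(sample_page)
print(canonicals)
```

If two URLs that serve different content both come back with the same canonical, or a page comes back with more than one, that is worth raising with the developer.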
Take a look:
Link identified as duplicate:
Link this is a duplicate of:
http://www.incipio.com/cases/macbook-cases/macbook-pro-13in-cases.html
Any idea as to what could be happening here?
-
Hi Ishwar,
If you have not done so yet, it would be best to create your own post. Many people pop in here to help others, and when they see this topic marked as answered they may not look at it. Creating your own post will get the most attention.
-
Hi Nicole,
Okay, so the reason I said it appears something is improperly installed is that a page should in general have 1 head tag, 1 title tag, 1 body tag and 1 document type declaration. Your page has the normal ones you'd expect to see, plus another set.
In the code I posted above you have an iframe, which is basically a tag that says to display information from a different source. In this case the source is Google, which is fine, but it should not contain another set of head, title, and body tags along with a document type declaration. Google would never do that. This, along with my years of experience looking at and installing add-ons, leads me to believe that something was installed incorrectly, or at the very least not coded correctly.
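A quick way to spot this kind of repeated structure is to count the opening tags in the page source. This is only a rough sketch, assuming you have saved the page source to a string; the function names and sample markup are hypothetical, and it is a plain text count rather than a real HTML parse, so a count above 1 is a prompt to look closer, not proof of a fault.

```python
import re

# Tags that a page is normally expected to contain once.
STRUCTURAL_TAGS = ("html", "head", "title", "body")


def count_structural_tags(html_source):
    """Count opening structural tags and doctype declarations in raw HTML text."""
    counts = {}
    for tag in STRUCTURAL_TAGS:
        # Match "<head>" or "<head ...>" but not "<header>".
        pattern = r"<" + tag + r"(?=[\s>])"
        counts[tag] = len(re.findall(pattern, html_source, flags=re.IGNORECASE))
    counts["doctype"] = len(re.findall(r"<!doctype", html_source, flags=re.IGNORECASE))
    return counts


def repeated_tags(html_source):
    """Return only the tags that appear more than once."""
    return {tag: n for tag, n in count_structural_tags(html_source).items() if n > 1}


# Hypothetical markup with a second document nested inside the first head.
sample = (
    "<!DOCTYPE html><html><head><title>Store</title>"
    "<iframe><!DOCTYPE html><html><head><title></title></head>"
    "<body></body></html></iframe>"
    "</head><body><header>Menu</header></body></html>"
)

print(repeated_tags(sample))
```

An empty result means each of these tags appears at most once in the text you fed in.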
As to the misconfiguration issue, I would look first at how the URL rewrites are being done, as there is no valid reason the first link you posted should rewrite to a URL and serve different content than what is supposed to be there. That tells me the rewrites are being handled incorrectly.
I hope that helps a little,
Don
-
Hello Moz Community!
I am also getting duplicate content errors on tag pages, for example:
http://www.earnmoneywithgoogleadsense.com/tag/blog-post/
http://www.earnmoneywithgoogleadsense.com/tag/effective-blog-post/
The pages are the same. I have 100+ errors on the website, so how can I remove them? Do you have any tutorial on this?
Can I change the canonical URLs all at once, or do I need to set them one by one?
-
Hi Donford,
Thanks so much for getting back to me. Great answer! I'd like some clarification here. I did not configure this and if I'm going to talk to the developer, I'd like to have more knowledge to speak to it.
Could you please clarify what you mean when you say:
- It looks like something is installed and configured improperly.
- You have 2 head tags on the page that shows up from the redirect.
- This is actually inside the first head tag complete with a body tag and another doc declaration.
I looked at the example you sent, but I'm not sure what I'm looking at. If you could explain those bullet points in more detail, it would greatly help.
You're the best!
Thanks,
Nicole
-
It looks like something is installed and configured improperly.
You have 2 head tags on the page that shows up from the redirect.
This is actually inside the first head tag complete with a body tag and another doc declaration.
<iframe id="oauth2relay579972146" name="oauth2relay579972146" src="https://accounts.google.com/o/oauth2/postmessageRelay?parent=http%3A%2F%2Fwww.incipio.com#rpctoken=728288212&forcesecure=1" style="width: 1px; height: 1px; position: absolute; top: -100px;" tabindex="-1">
<html><head><title></title>
<meta content="text/html; charset=utf-8" http-equiv="content-type">
<meta content="IE=edge" http-equiv="X-UA-Compatible">
<meta content="width=device-width, initial-scale=1, minimum-scale=1, maximum-scale=1, user-scalable=0" name="viewport">
<script src="https://apis.google.com/js/api.js" type="text/javascript" gapi_processed="true"></script>
<script src="https://oauth.googleusercontent.com/gadgets/js/core:rpc:shindig.random:shindig.sha1.js?c=2" type="text/javascript"></script>
<script src="https://ssl.gstatic.com/accounts/o/3417060037-postmessagerelay.js"></script>
</head><body></body></html>
</iframe>
That looks like an installation issue.
-
Now, the misconfiguration issue would be the reason the URL rewrites to one page but serves up different content.
-
And lastly, I think even if you fix those issues you're still going to get duplicate content warnings, because the content on these pages is very thin.
-
Example: Page 1 http://www.incipio.com/cases/tablet-cases/amazon-kindle-cases-sleeves/amazon-kindle-fire-hd-6-cases.html
-
Example: Page 2 http://www.incipio.com/cases/tablet-cases/amazon-kindle-cases-sleeves/amazon-kindle-fire-hd-7-cases.html
-
On those 2 pages there is a 1 character difference: 6 instead of 7. All the other content, including the header and footer, is identical. Then if you go to the actual product pages you have the exact same issue: the same description to the letter, except for the one number. Yep, you're going to have a duplicate content problem.
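To put a rough number on how alike two pages are, you can compare their text with a similarity ratio. This is a minimal sketch, not how any crawler actually scores duplicates; the two descriptions are made-up stand-ins for the real page text, and any cut-off you pick for "too similar" is your own assumption.

```python
from difflib import SequenceMatcher


def similarity(text_a, text_b):
    """Return a ratio from 0.0 (nothing in common) to 1.0 (identical)."""
    # autojunk=False keeps the comparison exact on long page text.
    return SequenceMatcher(None, text_a, text_b, autojunk=False).ratio()


# Hypothetical descriptions that differ by a single character.
page_6 = "Shop protective cases and sleeves for the Amazon Kindle Fire HD 6."
page_7 = "Shop protective cases and sleeves for the Amazon Kindle Fire HD 7."

score = similarity(page_6, page_7)
print(f"similarity: {score:.2f}")
```

Pages that score very close to 1.0 against each other are the ones most in need of their own unique copy.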
-
This is something that all e-commerce stores face. You honestly need to write unique content for each and every product you sell. Don't copy and paste from another site like Amazon or the manufacturer's site; write your own content.
-
In summation, I would recheck any modules/add-ons/plug-ins you installed, as one appears to be incorrect. If that doesn't fix the rewrite issue, have a developer who is familiar with your e-commerce platform look at it. Lastly, you have got to have unique content.
-
Maybe not the best news but I hope it helps
-
Don
Edit: put this in bullet points to try to make the post look a little better. These forums don't take kindly to code blocks.