Which duplicate content should I remove?
-
I have duplicate content and am trying to figure out which URL to remove. What should I take into consideration?
Authority?
How close to the root the page is?
How clear the path is?
Would appreciate your help! Thanks!
-
Hi,
Let me back up a step. When you say remove duplicate content, do you mean delete the page? For example, pageA.html and pageB.html have the same content. In your example, are you asking do you delete pageA.html or delete pageB.html? I wouldn't delete either page as both pages likely have traffic, links, etc.
Instead of deleting, you want to pick which page is the canonical version of that page. More simply, when somebody Googles the terms that page is targeting, do you want pageA.html to show up or do you pageB.html to show up? If you decide you want pageA.html to show up, then you need to redirect pageB.html to pageA.html. That way, if somebody goes to pageB.html they end up on pageA.html. If you delete pageB.html and somebody goes there, they'll end up on an error page.
Now, on where question...in terms of deciding what version is canonical, I would suggest you look at authority first and foremost. If the domain authority in OSE is higher for pageA.html than it is for pageB.html, keeping pageA.html makes sense. Things like URL structure are only mildly important, especially when making decisions like this.
Hope that helps. Thanks,
Matthew
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
International SEO and duplicate content: what should I do when hreflangs are not enough?
Hi, A follow up question from another one I had a couple of months ago: It has been almost 2 months now that my hreflangs are in place. Google recognises them well and GSC is cleaned (no hreflang errors). Though I've seen some positive changes, I'm quite far from sorting that duplicate content issue completely and some entire sub-folders remain hidden from the SERP.
Intermediate & Advanced SEO | | GhillC
I believe it happens for two reasons: 1. Fully mirrored content - as per the link to my previous question above, some parts of the site I'm working on are 100% similar. Quite a "gravity issue" here as there is nothing I can do to fix the site architecture nor to get bespoke content in place. 2. Sub-folders "authority". I'm guessing that Google prefers sub-folders over others due to their legacy traffic/history. Meaning that even with hreflangs in place, the older sub-folder would rank over the right one because Google believes it provides better results to its users. Two questions from these reasons:
1. Is the latter correct? Am I guessing correctly re "sub-folders" authority (if such thing exists) or am I simply wrong? 2. Can I solve this using canonical tags?
Instead of trying to fix and "promote" hidden sub-folders, I'm thinking to actually reinforce the results I'm getting from stronger sub-folders.
I.e: if a user based in belgium is Googling something relating to my site, the site.com/fr/ subfolder shows up instead of the site.com/be/fr/ sub-sub-folder.
Or if someone is based in Belgium using Dutch, he would get site.com/nl/ results instead of the site.com/be/nl/ sub-sub-folder. Therefore, I could canonicalise /be/fr/ to /fr/ and do something similar for that second one. I'd prefer traffic coming to the right part of the site for tracking and analytic reasons. However, instead of trying to move mountain by changing Google's behaviour (if ever I could do this?), I'm thinking to encourage the current flow (also because it's not completely wrong as it brings traffic to pages featuring the correct language no matter what). That second question is the main reason why I'm looking out for MoZ's community advice: am I going to damage the site badly by using canonical tags that way? Thank you so much!
G0 -
[E-commerce] Duplicate content due to color variations (canonical/indexing)
Hello, We currently have a lot of color variations on multiple products with almost the same content. Even with our canonicals being set, Moz's crawling tool seems to flag them as duplicate content. What we have done so far: Choosing the best-selling color variation (our "master product") Adding a rel="canonical" to every variation (with our "master product" as the canonical URL) In my opinion, it should be enough to address this issue. However, being given the fact that it's flagged as duplicate by Moz, I was wondering if there is something else we should do? Should we add a "noindex,follow" to our child products and "index,follow" to our master product? (sounds to me like such a heavy change) Thank you in advance
Intermediate & Advanced SEO | | EasyLounge0 -
Duplicate Content and Titles
Hi Mozzers, I saw a considerable amount of duplicate content and page titles on our clients website. We are just implementing a fix in the CMS to make sure that these are all fixed. What changes do you think I could see in terms of rankings?
Intermediate & Advanced SEO | | KarlBantleman0 -
Artist Bios on Multiple Pages: Duplicate Content or not?
I am currently working on an eComm site for a company that sells art prints. On each print's page, there is a bio about the artist followed by a couple of paragraphs about the print. My concern is that some artists have hundreds of prints on this site, and the bio is reprinted on every page,which makes sense from a usability standpoint, but I am concerned that it will trigger a duplicate content penalty from Google. Some people are trying to convince me that Google won't penalize for this content, since the intent is not to game the SERPs. However, I'm not confident that this isn't being penalized already, or that it won't be in the near future. Because it is just a section of text that is duplicated, but the rest of the text on each page is original, I can't use the rel=canonical tag. I've thought about putting each artist bio into a graphic, but that is a huge undertaking, and not the most elegant solution. Could I put the bio on a separate page with only the artist's info and then place that data on each print page using an <iframe>and then put a noindex,nofollow in the robots.txt file?</p> <p>Is there a better solution? Is this effort even necessary?</p> <p>Thoughts?</p></iframe>
Intermediate & Advanced SEO | | sbaylor0 -
PDF for link building - avoiding duplicate content
Hello, We've got an article that we're turning into a PDF. Both the article and the PDF will be on our site. This PDF is a good, thorough piece of content on how to choose a product. We're going to strip out all of the links to our in the article and create this PDF so that it will be good for people to reference and even print. Then we're going to do link building through outreach since people will find the article and PDF useful. My question is, how do I use rel="canonical" to make sure that the article and PDF aren't duplicate content? Thanks.
Intermediate & Advanced SEO | | BobGW0 -
Finding Duplicate Content Spanning more than one Site?
Hi forum, SEOMoz's crawler identifies duplicate content within your own site, which is great. How can I compare my site to another site to see if they share "duplicate content?" Thanks!
Intermediate & Advanced SEO | | Travis-W0 -
Duplicate Content Warning For Pages That Do Not Exist
Hi Guys I am hoping someone can help me out here. I have had a new site built with a unique theme and using wordpress as the CMS. Everything was going fine but after checking webmaster tools today I noticed something that I just cannot get my head around. Basically I am getting warnings of Duplicate page warnings on a couple of things. 1 of which i think i can understand but do not know how to get the warning to go. Firstly I get this warning of duplicate meta desciption url 1: / url 2: /about/who-we-are I understand this as the who-we-are page is set as the homepage through the wordpress reading settings. But is there a way to make the dup meta description warning disappear The second one I am getting is the following: /services/57/ /services/ Both urls lead to the same place although I have never created the services/57/ page the services/57/ page does not show on the xml sitemap but Google obviously see it because it is a warning in webmaster tools. If I press edit on services/57/ page it just goes to edit the /services/ page/ is there a way I can remove the /57/ page safely or a method to ensure Google at least does not see this. Probably a silly question but I cannot find a real comprehensive answer to sorting this. Thanks in advance
Intermediate & Advanced SEO | | southcoasthost0 -
Duplicate Content/ Indexing Question
I have a real estate Wordpress site that uses an IDX provider to add real estate listings to my site. A new page is created as a new property comes to market and then the page is deleted when the property is sold. I like the functionality of the service but it creates a significant amount of 404's and I'm also concerned about duplicate content because anyone else using the same service here in Las Vegas will have 1000's of the exact same property pages that I do. Any thoughts on this and is there a way that I can have the search engines only index the core 20 pages of my site and ignore future property pages? Your advice is greatly appreciated. See link for example http://www.mylvcondosales.com/mandarin-las-vegas/
Intermediate & Advanced SEO | | AnthonyLasVegas0