Identifying Duplicate Content
-
Hi, I'm looking for tools (besides Copyscape or Grammarly) that can scan a list of URLs (e.g. 100 pages) and find duplicate content quickly.
Specifically, I'm after small batches of duplicate content; see the attached image for an example.
Does anyone have any suggestions?
Cheers.
-
I'm going to recommend Screaming Frog here. Run a scan of your site and then filter it by duplicate title tags, duplicate meta descriptions, and (my favorite) word count. Usually I don't need to go any further than duplicate title tags.
There's also www.siteliner.com. I've used that regularly and it has been tremendously helpful for pages that have duplicate content in the body but not in the META.
Finally, Google Search Console. Go to Search Appearance and click on HTML Improvements. You can also find all your duplicate title tags there, which should help you identify duplicate content easily.
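If you'd rather script a quick check yourself for small duplicated passages in the body, here is a minimal sketch. It is not how Screaming Frog or Siteliner work internally; it simply breaks each page's text into overlapping 8-word chunks ("shingles") and reports which pairs of URLs share any. The function names `shingles` and `shared_passages`, the sample URLs, and the sample text are all hypothetical, and the sketch assumes you have already fetched the visible text of each page.

```python
import re
from collections import defaultdict
from itertools import combinations


def shingles(text, size=8):
    """Return the set of overlapping word n-grams (shingles) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}


def shared_passages(pages, size=8):
    """Map each pair of URLs to the shingles they have in common.

    pages: dict of URL -> visible page text (assumed to be fetched already).
    """
    index = defaultdict(set)  # shingle -> URLs that contain it
    for url, text in pages.items():
        for shingle in shingles(text, size):
            index[shingle].add(url)

    overlaps = defaultdict(set)  # (url_a, url_b) -> shingles found on both
    for shingle, urls in index.items():
        for pair in combinations(sorted(urls), 2):
            overlaps[pair].add(shingle)
    return dict(overlaps)


# Hypothetical sample data standing in for crawled page text.
pages = {
    "https://example.com/a": "Our widgets ship worldwide. Every order is packed "
                             "by hand and sent within two working days of payment.",
    "https://example.com/b": "Blue widgets are popular. Every order is packed "
                             "by hand and sent within two working days of payment.",
    "https://example.com/c": "Contact the team for a quote on custom sizes.",
}

for (url_a, url_b), common in shared_passages(pages).items():
    print(url_a, url_b, len(common), "shared 8-word shingles")
```

A smaller `size` catches shorter repeated snippets but also flags more common phrases, so it is worth tuning against a few pages you already know are duplicated.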
-
Exactly what I was looking for!
Thank you.
-
Hi Jay! Great question here.
First of all, kudos to you for looking to kill duplicate content with fire. As a marketer but foremost a writer, I am all about great writing and not doing this duplicated/spun stuff to try to rank. It won't convert anyways.
I put out a call to my followers on Twitter and one of them recommended https://www.killduplicate.com/en. I haven't personally used it, but give it a shot! It comes highly recommended.
Hope that's helpful!
John
Related Questions
-
If I have two brands and I market one in English (BrandA.com) and one in Spanish (BrandB.com), and the websites are identical but in different languages, would that have a negative impact on SEO due to duplicate content?
I have a client who wants a website in Spanish and one in English. Typically we would use a multi-language plugin for a single site (brandA.com/en or /es), but this client markets to their Spanish-speaking constituents under a different brand. So I am wondering: if we have BrandA.com in English and the exact same content in Spanish at BrandB.com, will there be negative SEO implications, and will search engines treat it as duplicate content?
Intermediate & Advanced SEO | Designworks-SJ
-
Shall we add engaging and useful FAQ content in all our pages or rather not because of duplication and reduction of unique content?
We are considering adding answers to the 9 most frequently asked questions at the end of all our 1500 product pages. These questions and answers will be 90% identical for all our products; personalizing them more is not an option, and not really necessary, since most questions relate to the process of reserving the product. We are convinced this will increase user engagement with the page and time on page, and it will be genuinely useful for visitors, as most of them will not visit the separate FAQ page. It will also add more related keywords/topics to the page.
On the downside, it will reduce the percentage of unique content per page and add duplication. Any thoughts on whether, in terms of Google rankings, we should go ahead, and whether the engagement benefits may outweigh the downside of duplicated content?
Intermediate & Advanced SEO | lcourse
-
Geo-Targeted Sub-Domains & Duplicate Content/Canonical
For background, the sub-domain structure here is inherited and committed to due to tech restrictions with some of our platforms. The brand I work with is splitting out their global site into regional sub-sites (not too relevant, but this is in order to display seasonal product in different hemispheres and to link to stores specific to the region). All sub-domains except EU will be geo-targeted to their relevant country.
Regions and sub-domains for reference:
AU - Australia
CA - Canada
CH - Switzerland
EU - All Euro zone countries
NZ - New Zealand
US - United States
This will be done with WordPress multisite. The setup allows us to publish content on one 'master' sub-site and then decide which other sub-sites to 'broadcast' to. Some content is specific to a sub-domain/region, so there is no issue with duplicates and we can set the sub-site version as canonical. However, some content will appear on all sub-domains:
au.example.com/awesome-content/
nz.example.com/awesome-content/
Now, the first question is: since these domains are geo-targeted, should I just have them all canonical to the version on that sub-domain? eg
Or should I still signal the duplicate content with one canonical version?
Essentially, the top-level example.com exists as a site only for publishing purposes: if a user lands on the top-level example.com/awesome-content/ they are given a pop-up to select a region and are redirected to the relevant sub-domain version. So I'm also unsure whether I want that content indexed at all. I could make the top-level example.com versions of all content the canonical that all others point to, eg. and rely on geo-targeting to have the right links show in the right search locations.
I hope that's kind of clear? Obviously I find it confusing and therefore hard to relay! Any feedback at all gratefully received.
Cheers, Steve
Intermediate & Advanced SEO | SteveHoney
-
Duplicate content on URL trailing slash
Hello,
Some time ago, we accidentally made changes to our site which modified the way URLs in links are generated. All at once, trailing slashes were added to many URLs (only in links). Links that used to point to
example.com/webpage.html
were now linking to
example.com/webpage.html/
URLs in the XML sitemap remained unchanged (no trailing slash). We started noticing duplicate content (because our site renders the same page with or without the trailing slash). We corrected the problematic PHP URL function so that now all links on the site point to a URL without a trailing slash. However, Google had time to index these pages. Is implementing 301 redirects required in this case?
Intermediate & Advanced SEO | yacpro13
-
Duplicate Multi-site Content, Duplicate URLs
We have 2 ecommerce sites that are 95% identical. Both sites carry the same 2000 products, and for the most part, have the identical product descriptions. They both have a lot of branded search, and a considerable amount of domain authority. We are in the process of changing out product descriptions so that they are unique. Certain categories of products rank better on one site than another. When we've deployed unique product descriptions on both sites, we've been able to get some double listings on Page 1 of the SERPs. The categories on the sites have different names, and our URL structure is www.domain.com/category-name/sub-category-name/product-name.cfm. So even though the product names are the same, the URLs are different including the category names. We are in the process of flattening our URL structures, eliminating the category and subcategory names from the product URLs: www.domain.com/product-name.cfm. The upshot is that the product URLs will be the same. Is that going to cause us any ranking issues?
Intermediate & Advanced SEO | AMHC
-
Duplicate Content... Really?
Hi all,
My site is www.actronics.eu
Moz reports virtually every product page as duplicate content, flagged as HIGH PRIORITY! I know why: Moz classes a page as duplicate if >95% of the content/code is similar. There's very little I can do about this because, although our products are different, the content is very similar, apart from a few part numbers and the vehicle make/model. Here's an example:
http://www.actronics.eu/en/shop/audi-a4-8d-b5-1994-2000-abs-ecu-en/bosch-5-3
http://www.actronics.eu/en/shop/bmw-3-series-e36-1990-1998-abs-ecu-en/ate-34-51
Now, multiply this by ~2,000 products x 7 different languages and you'll see we have a big dupe content issue (according to Moz's Crawl Diagnostics report). I say "according to Moz..." as I do not know if this is actually an issue for Google. 90% of our product pages rank, albeit some much better than others.
So what is the solution? We're not trying to deceive Google in any way, so it would seem unfair to be hit with a dupe content penalty; this is a legit dilemma where our products differ by as little as a part number.
One ugly solution would be to remove the header / sidebar / footer on our product pages, as I've demonstrated here - http://woodberry.me.uk/test-page2-minimal-v2.html - since this removes A LOT of page bloat (code) and would bring the page difference down to 80% duplicate. (This is the tool I'm using for checking: http://www.webconfs.com/similar-page-checker.php)
Other "prettier" solutions would be greatly appreciated. I look forward to hearing your thoughts.
Thanks,
Woody 🙂
Intermediate & Advanced SEO | seowoody
-
Duplicate Content From Indexing of non- File Extension Page
Google has somehow indexed a page of mine without the .html extension. It indexed www.samplepage.com/page, so I am showing duplicate content because Google also sees www.samplepage.com/page.html. How can I force Google or Bing or whoever to only index and see the page including the .html extension? I know people are saying not to use the file extension on pages, but I want to, so please anybody... HELP!!!
Intermediate & Advanced SEO | WebbyNabler
-
Is this duplicate content something to be concerned about?
On the 20th February a site I work on took a nose-dive for the main terms I target. Unfortunately I can't provide the URL for this site. All links have been developed organically, so I have ruled this out as something which could've had an impact.
During the past 4 months I've cleaned up all WMT errors and applied appropriate redirects wherever applicable. During this process I noticed that mydomainname.net contained identical content to the main mydomainname.com site. Upon discovering this problem I 301 redirected all .net content to the main .com site. Nothing has changed in terms of rankings since doing this about 3 months ago.
I also found paragraphs of duplicate content on other sites (competitors in different countries). Although entire pages haven't been copied, there is still enough content to highlight similarities. As this content was written from scratch and Google would've seen this within its crawl and index process, I wanted to get people's thoughts as to whether this is something I should be concerned about.
Many thanks in advance.
Intermediate & Advanced SEO | bfrl