Identifying Duplicate Content
-
Hi looking for tools (beside Copyscape or Grammarly) which can scan a list of URLs (e.g. 100 pages) and find duplicate content quite quickly.
Specifically, small batches of duplicate content, see attached image as an example.
Does anyone have any suggestions?
Cheers.
-
I'm going to recommend Screaming Frog here. Run a scan of your site and then filter it by duplicate title tags, duplicate meta descriptions, and (my favorite) word count. Usually I don't need to go any further than duplicate title tags.
There's also www.siteliner.com. I've used that regularly and it has been tremendously helpful for pages that have duplicate content in the body but not in the META.
Finally, Google Search Console. Go to Search Appearance and click on HTML Improvements. You can also find all your duplicate title tags there, which should help you identify duplicate content easily.
-
Exactly what i was looking for!
Thankyou.
-
Hi Jay! Great question here.
First of all, kudos to you for looking to kill duplicate content with fire. As a marketer but foremost a writer, I am all about great writing and not doing this duplicated/spun stuff to try to rank. It won't convert anyways.
I put out a call to my followers on Twitter and one of them recommended https://www.killduplicate.com/en. I haven't personally used it, but give it a shot! It comes highly recommended.
Hope that's helpful!
John
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Http vs. https - duplicate content
Hi I have recently come across a new issue on our site, where https & http titles are showing as duplicate. I read https://moz.com/community/q/duplicate-content-and-http-and-https however, am wondering as https is now a ranking factor, blocked this can't be a good thing? We aren't in a position to roll out https everywhere, so what would be the best thing to do next? I thought about implementing canonicals? Thank you
Intermediate & Advanced SEO | | BeckyKey0 -
Penalties for duplicate content
Hello!We have a website with various city tours and activities listed on a single page (http://vaiduokliai.lt/). The list changes accordingly depending on filtering (birthday in Vilnius, bachelor party in Kaunas, etc.). The URL doesn't change. Content changes dynamically. We need to make URL visible for each category, then optimize it for different keywords (for example city tours in Vilnius for a list of tours and activities in Vilnius with appropriate URL /tours-in-Vilnius).The problem is that activities overlap very often in different categories, so there will be a lot of duplicate content on different pages. In such case, how severe penalty could be for duplicate content?
Intermediate & Advanced SEO | | jpuzakov0 -
Medical / Health Content Authority - Content Mix Question
Greetings, I have an interesting challenge for you. Well, I suppose "interesting" is an understatement, but here goes. Our company is a women's health site. However, over the years our content mix has grown to nearly 50/50 between unique health / medical content and general lifestyle/DIY/well being content (non-health). Basically, there is a "great divide" between health and non-health content. As you can imagine, this has put a serious damper on gaining ground with our medical / health organic traffic. It's my understanding that Google does not see us as an authority site with regard to medical / health content since we "have two faces" in the eyes of Google. My recommendation is to create a new domain and separate the content entirely so that one domain is focused exclusively on health / medical while the other focuses on general lifestyle/DIY/well being. Because health / medical pages undergo an additional level of scrutiny per Google - YMYL pages - it seems to me the only way to make serious ground in this hyper-competitive vertical is to be laser targeted with our health/medical content. I see no other way. Am I thinking clearly here, or have I totally gone insane? Thanks in advance for any reply. Kind regards, Eric
Intermediate & Advanced SEO | | Eric_Lifescript0 -
How can a website have multiple pages of duplicate content - still rank?
Can you have a website with multiple pages of the exact same copy, (being different locations of a franchise business), and still be able to rank for each individual franchise? Is that possible?
Intermediate & Advanced SEO | | OhYeahSteve0 -
How should I manage duplicate content caused by a guided navigation for my e-commerce site?
I am working with a company which uses Endeca to power the guided navigation for our e-commerce site. I am concerned that the duplicate content generated by having the same products served under numerous refinement levels is damaging the sites ability to rank well, and was hoping the Moz community could help me understand how much of an impact this type of duplicate content could be having. I also would love to know if there are any best practices for how to manage this type of navigation. Should I nofollow all of the URLs which have more than 1 refinement used on a category, or should I allow the search engines to go deeper than that to preserve the long tail? Any help would be appreciated. Thank you.
Intermediate & Advanced SEO | | FireMountainGems0 -
HTTPS Duplicate Content?
I just recieved a error notification because our website is both http and https. http://www.quicklearn.com & https://www.quicklearn.com. My tech tells me that this isn't actually a problem? Is that true? If not, how can I address the duplicate content issue?
Intermediate & Advanced SEO | | QuickLearnTraining0 -
Accepting RSS feeds. Does it = duplicate content?
Hi everyone, for a few years now I've allowed school clients to pipe their news RSS feed to their public accounts on my site. The result is a daily display of the most recent news happening on their campuses that my site visitors can browse. We don't republish the entire news item; just the headline, and the first 150 characters of their article along with a Read more link for folks to click if they want the full story over on the school's site. Each item has it's own permanent URL on my site. I'm wondering if this is a wise practice. Does this fall into the territory of duplicate content even though we're essentially providing a teaser for the school? What do you think?
Intermediate & Advanced SEO | | peterdbaron0 -
Load balancing - duplicate content?
Our site switches between www1 and www2 depending on the server load, so (the way I understand it at least) we have two versions of the site. My question is whether the search engines will consider this as duplicate content, and if so, what sort of impact can this have on our SEO efforts? I don't think we've been penalised, (we're still ranking) but our rankings probably aren't as strong as they should be. The SERPs show a mixture of www1 and www2 content when I do a branded search. Also, when I try to use any SEO tools that involve a site crawl I usually encounter problems. Any help is much appreciated!
Intermediate & Advanced SEO | | ChrisHillfd0