Is there a tool to compare duplicate content that isn't live on the web?
-
Is there a tool that can give me the percentage of duplicate content when comparing two pieces of content that are not live on the web? Something like Copyscape, but for content that may not be indexed by Copyscape or is not live on the web?
Does Word or any other program allow you to do this?
-
I'm going through some of the older questions, and wondering if you found a solution to your problem, or if you're still looking for some advice. Thanks!
-
I've never seen a percentage-similarity option in Word, but you can merge and compare two documents to see the differences. I don't think it will do enough in your case: it's most helpful when two documents are in the same order and you want to spot the differences between them (like a draft proposal and a final proposal).
-
Hi Bozzie,
I use WinMerge (open source software) to compare individual files/folders containing text or code.
Also, a quick search for [find similar files] on Google turned up numerous programs that will let you find similar files on your hard drive.
Best regards,
Guillaume Voyer
-
I haven't tested this, but apparently Google Docs can compare and highlight the differences between two documents - perhaps this is close enough?
-
Can't you make your own private index in Copyscape and compare content against just that?
If you're comparing a lot of pages one-to-one, though, I guess that would be tedious.
The compare and merge feature in Word? I suspect it's not really going to work the way you want, though.
Yeah, a private Copyscape index if it's only a few pieces.
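If none of the tools above gives a percentage, a small script can produce a rough one for two pieces of offline text. The following is only a minimal sketch using Python's standard `difflib` module; the function names (`tokenize`, `similarity_percent`, `shingle_overlap_percent`) and the three-word shingle size are illustrative assumptions, not anything Copyscape, Word, or WinMerge actually uses, and the numbers will not match what those tools report.

```python
import re
from difflib import SequenceMatcher


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def similarity_percent(a: str, b: str) -> float:
    """Order-sensitive similarity of two texts, as a percentage (0-100).

    Compares the two word sequences with difflib.SequenceMatcher.
    """
    matcher = SequenceMatcher(None, tokenize(a), tokenize(b), autojunk=False)
    return 100.0 * matcher.ratio()


def shingle_overlap_percent(a: str, b: str, size: int = 3) -> float:
    """Jaccard overlap of word n-grams ("shingles"), as a percentage.

    Hypothetical helper: less sensitive to reordered paragraphs than
    similarity_percent, because it only looks at which word runs are shared.
    """
    def shingles(text: str) -> set[tuple[str, ...]]:
        words = tokenize(text)
        return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

    set_a, set_b = shingles(a), shingles(b)
    union = set_a | set_b
    if not union:
        # Both texts are shorter than one shingle; nothing to compare.
        return 0.0
    return 100.0 * len(set_a & set_b) / len(union)


if __name__ == "__main__":
    draft = "Our widgets are handmade in small batches."
    final = "Our widgets are handmade in very small batches."
    print(f"sequence similarity: {similarity_percent(draft, final):.1f}%")
    print(f"shingle overlap:     {shingle_overlap_percent(draft, final):.1f}%")
```

To compare two files instead of inline strings, the texts could be loaded with `pathlib.Path("a.txt").read_text()` and passed to either function. The sequence score suits a draft versus a final version of the same page; the shingle score is the better guess when the same passages may appear in a different order.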
Related Questions
-
Multiple Countries, Same Language: Receiving Duplicate Page & Content Errors
Hello! I have a site that serves three English-speaking countries, and is using subfolders for each country version:
United Kingdom: https://site.com/uk/
Canada: https://site.com/ca/
United States & other English-speaking countries: https://site.com/en/
The site displayed is dependent on where the user is located, and users can also change the country version by using a drop-down flag navigation element in the navigation bar. If a user switches versions using the flag, the first URL of the new language version includes a language parameter in the URL, like: https://site.com/uk/blog?language=en-gb
In the Moz crawl diagnostics report, this site is getting dinged for lots of duplicate content because the crawler is finding both versions of each country's site, with and without the language parameter. However, the site has rel="canonical" tags set up on both URL versions and none of the URLs containing the "?language=" parameter are getting indexed.
So...my questions:
1. Are the Duplicate Title and Content errors found by the Moz crawl diagnostic really an issue?
2. If they are, how can I best clean this up?
Additional notes: the site currently has no sitemaps (XML or HTML), and is not yet using the hreflang tag. I intend to create sitemaps for each country version, like:
.com/en/sitemap.xml
.com/ca/sitemap.xml
.com/uk/sitemap.xml
I thought about putting a 'nofollow' tag on the flag navigation element, but since no sitemaps are in place I didn't want to accidentally cut off crawler access to alternate versions. Thanks for your help!
Moz Pro | Allie_Williams
-
How can I deal with tag page duplicate issues
The Moz crawler reported some duplicate issues. Many of them have to do with tags.
Each tag has a link, and as some articles are under several tags, these come up as duplicate content. I read Dr Peter's piece on canonical stuff, but it's not clear to me whether any of it is the solution. Perhaps the solution lies somewhere else? Maybe I need to block the robots from these URLs (but that seems counter-SEO-productive). Thanks
Kovacs
Moz Pro | IamKovacs
-
Keyword rankings tool is not working properly
My website http://www.logobite.com/ is in 29th position for the keyword "logo inspiration", but your keyword rankings tool is not showing it 😞 Why?
Moz Pro | logobite
-
"Does not respond to web requests" error
When trying to set up a new campaign I get the following message:
"Roger has detected a problem: We have detected that the domain www.chicagofinancialadvisers.com does not respond to web requests. Using this domain, we will be unable to crawl your site or present accurate SERP information."
Can someone please tell me what I need to do on my site to make this work? I haven't seen this before and have done many other campaigns. Thanks a lot!
Moz Pro | bshanahan
-
Canonical URLs and Duplicate Page Content
My website (a doctor directory) is getting a lot of duplicate page content and duplicate page title warnings from SEOmoz. The pages getting the warnings are doctors' profiles, each of which can be accessed at three different URLs. The problem is that this should be handled by the canonical tag on the pages. In the example below, all three URLs open the same page:
https://www.arzttermine.de/arzt/dr-sara-danesh/
https://www.arzttermine.de/arzt/dr-sara-danesh/gkv
https://www.arzttermine.de/arzt/dr-sara-danesh/pkv
Here's our canonical tag (on line 34):
<link rel="canonical" href="http://www.arzttermine.de/arzt/dr-sara-danesh" />
So why is SEOmoz crawling the page? We are getting hundreds of errors from this, and yet Google doesn't have any of the duplicate URLs indexed...
Moz Pro | thomashillard
-
Crawl reports URLs with duplicate content, but it's not the case
Hi guys!
Some hours ago I received my crawl report. I noticed several records of URLs with duplicate content, so I opened those URLs one by one.
None of those URLs actually had duplicate content, but I have a concern: the website is a product showcase, and many articles are just images with links behind them. Many of those articles use the same images, so maybe that's why the SEOmoz crawler raises the duplicate content flag. I wonder if Google has a problem with that too. See for yourself what it looks like:
http://by.vg/NJ97y
http://by.vg/BQypE
Those two URLs are flagged as duplicates. Please mind the language (Greek) and try to focus on the URLs and content. PS: my example is simplified just for the purpose of my question.
| URLs with Duplicate Page Content (up to 5) |
Moz Pro | MakMour
-
Confused on www vs non-www
Hey everyone... I'm really new to the SEO world and have learned tons each day. When I joined SEOmoz I went to my host and set up a 301 redirect to have frogfanreport.com go to www.frogfanreport.com. After a couple of days I noticed that Rogerbot had only crawled 1 page on www.frogfanreport.com. I looked into the community posts to try to find an answer. Then I went in, took the 301 redirect off, and set up a new campaign just for frogfanreport.com. It has now crawled over 300 pages. I'm not sure what I need to do, or whether I just did not set up the 301 redirect correctly.
Looking at the link stats, the root domain stats are obviously the same. The subdomain stats are where there is a big difference:
www: ext f links 1, total ext links 5, total links 5, f root domain 1, total linking root domain 4
non-www: ext f links 76, total ext links 109, total links 7.962, f root domain 11, total link root domain 19
I am guessing that I should go back in and put in a 301 redirect from www to non-www? Is this going to affect Rogerbot going in? Or did I just not set it up correctly?
zach
Moz Pro | TCUFrogFanReport
-
Duplicate Content and Titles in SEOmoz reports
I've had to rename some of the pages on my site and also move them to different locations. I placed a rel="canonical" on the old page pointing to the new one. The reports on my PRO Dashboard are telling me that I have Duplicate Content and Page Title errors. Do the SEOmoz automated reports take the rel="canonical" link into consideration, or do I need to remove these pages and do a 301 redirect from the old to the new page?
Moz Pro | TRICORSystems