Tools to scan entire site for duplicate content?
-
HI guys,
Just wondering if anyone knows of any tools to scan a site for duplicate content (with other sites on the web).
Looking to quickly identify product pages containing duplicate content/duplicate product descriptions for E-commerce based websites.
I know copy scape can which can check up to 10,000 pages in a single operation with Batch Search. But just wondering if there is anything else on the market i should consider looking at?
Cheers,
Chris
-
Hi Jay,
Have you tried any of these : http://www.copyscape.com/originality-check/ or http://www.siteliner.com/ . Hope it helps
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Http vs. https - duplicate content
Hi I have recently come across a new issue on our site, where https & http titles are showing as duplicate. I read https://moz.com/community/q/duplicate-content-and-http-and-https however, am wondering as https is now a ranking factor, blocked this can't be a good thing? We aren't in a position to roll out https everywhere, so what would be the best thing to do next? I thought about implementing canonicals? Thank you
Intermediate & Advanced SEO | | BeckyKey0 -
Multiply domains and duplicate content confusion
I've just found out that a client has multiple domains which are being indexed by google and so leading me to worry that they will be penalised for duplicate content. Wondered if anyone could confirm a) are we likely to be penalised? and b) what should we do about it? (i'm thinking just 301 redirect each domain to the main www.clientdomain.com...?). Actual domain = www.clientdomain.com But these also exist: www.hostmastr.clientdomain.com www.pop.clientdomain.com www.subscribers.clientdomain.com www.www2.clientdomain.com www.wwwww.clientdomain.com ps I have NO idea how/why all these domains exist I really appreciate any expertise on this issue, many thanks!
Intermediate & Advanced SEO | | bisibee10 -
Duplicate content within sections of a page but not full page duplicate content
Hi, I am working on a website redesign and the client offers several services and within those services some elements of the services crossover with one another. For example, they offer a service called Modelling and when you click onto that page several elements that build up that service are featured, so in this case 'mentoring'. Now mentoring is common to other services therefore will feature on other service pages. The page will feature a mixture of unique content to that service and small sections of duplicate content and I'm not sure how to treat this. One thing we have come up with is take the user through to a unique page to host all the content however some features do not warrant a page being created for this. Another idea is to have the feature pop up with inline content. Any thoughts/experience on this would be much appreciated.
Intermediate & Advanced SEO | | J_Sinclair0 -
Same content pages in different versions of Google - is it duplicate>
Here's my issue I have the same page twice for content but on different url for the country, for example: www.example.com/gb/page/ and www.example.com/us/page So one for USA and one for Great Britain. Or it could be a subdomain gb. or us. etc. Now is it duplicate content is US version indexes the page and UK indexes other page (same content different url), the UK search engine will only see the UK page and the US the us page, different urls but same content. Is this bad for the panda update? or does this get away with it? People suggest it is ok and good for localised search for an international website - im not so sure. Really appreciate advice.
Intermediate & Advanced SEO | | pauledwards0 -
About robots.txt for resolve Duplicate content
I have a trouble with Duplicate content and title, i try to many way to resolve them but because of the web code so i am still in problem. I decide to use robots.txt to block contents that are duplicate. The first Question: How do i use command in robots.txt to block all of URL like this: http://vietnamfoodtour.com/foodcourses/Cooking-School/
Intermediate & Advanced SEO | | magician
http://vietnamfoodtour.com/foodcourses/Cooking-Class/ ....... User-agent: * Disallow: /foodcourses ( Is that right? ) And the parameter URL: h
ttp://vietnamfoodtour.com/?mod=vietnamfood&page=2
http://vietnamfoodtour.com/?mod=vietnamfood&page=3
http://vietnamfoodtour.com/?mod=vietnamfood&page=4 User-agent: * Disallow: /?mod=vietnamfood ( Is that right? i have folder contain module, could i use: disallow:/module/*) The 2nd question is: Which is the priority " robots.txt" or " meta robot"? If i use robots.txt to block URL, but in that URL my meta robot is "index, follow"0 -
What is the best way to allow content to be used on other sites for syndication without taking the chance of duplicate content filters
Cookstr appears to be syndicating content to shape.com and mensfitness.com a) They integrate their data into partner sites with an attribution back to their site and skinned it with the partners look. b) they link the image back to their image hosted on cookstr c) The page does not have microformats or as much data as their own page does so their own page is better SEO. Is this the best strategy or is there something better they could be doing to safely allow others to use our content, we don't want to share the content if we're going to get hit for a duplicate content filter or have another site out rank us with our own data. Thanks for your help in advance! their original content page: http://www.cookstr.com/recipes/sauteacuteed-escarole-with-pancetta their syndicated content pages: http://www.shape.com/healthy-eating/healthy-recipes/recipe/sauteacuteed-escarole-with-pancetta
Intermediate & Advanced SEO | | irvingw
http://www.mensfitness.com/nutrition/healthy-recipes/recipe/sauteacuteed-escarole-with-pancetta0 -
Duplicate content for images
On SEOmoz I am getting duplicate errors for my onsite report. Unfortunately it does not specify what that content is... We are getting these errors for our photo gallery and i am assuming that the reason is some of the photos are listed in multiple categories. Can this be the problem? what else can it be? how can we resolve these issues?
Intermediate & Advanced SEO | | SEODinosaur0 -
How do I fix the error duplicate page content and duplicate page title?
On my site www.millsheating.co.uk I have the error message as per the question title. The conflict is coming from these two pages which are effectively the same page: www.millsheating.co.uk www.millsheating.co.uk/index I have added a htaccess file to the root folder as I thought (hoped) it would fix the problem but I doesn't appear to have done so. this is the content of the htaccess file: Options +FollowSymLinks RewriteEngine On RewriteCond %{HTTP_HOST} ^millsheating.co.uk RewriteRule (.*) http://www.millsheating.co.uk/$1 [R=301,L] RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html\ HTTP/ RewriteRule ^index\.html$ http://www.millsheating.co.uk/ [R=301,L] AddType x-mapp-php5 .php
Intermediate & Advanced SEO | | JasonHegarty0