Finding Duplicate Content Spanning more than one Site?
-
Hi forum, SEOMoz's crawler identifies duplicate content within your own site, which is great. How can I compare my site to another site to see if they share "duplicate content?" Thanks!
-
The Alert thing is great! I use it when we write new content (along with CopyScape after a week or so) just so I can make sure I'm outranking it. lol
-
Yes. I totally agree with Darin. There isn't a duplicate content penalty, per se, and the tools he listed are quite good suggestions as well.
-
IMHO, even if the HTML is different you could have duplicate content if the H1 or paragraph text is substantially similar. However, is this automatically penalized? No. Syndication of content can be quite prevalent on the Web. For example the AP breaks a news story and posts it online and it is subsequently picked up by the New York Times and Wall Street Journal. Wherever the content appeared first, particularly if it has a canonical tag in place, that source will be credited with having the original content. The other sites aren't going to be penalized, but they aren't going to benefit from it either.
Similar things happen on large e-commerce sites all the time. For example, 100's of e-commerce stores sell lightbulbs. Those descriptions are most certainly "substantially similar." It'd be kind of strange if they weren't. They aren't penalized for that.
I hope this is helpful! It is always good to set up a Google Alert for any great pieces of content you do write, just so you can be aware of who might be copying your stuff! (Tynt.com can also be very useful for this).
Good luck!
Dana
-
Just for the record there isn't any "Duplicate Content Penalty" so don't worry to much about this. Duplicate content on a site is not grounds for action on that site unless it appears that the intent of the duplicate content is to be deceptive and manipulate search engine results.
However, to answer your question I use copyscape to do this but you have to insert a URL and not just lines at a time.
Here are some other ones I've heard good things about:
I agree with Dana on the Google thing too. Like she said, "Just be sure to put quotes around your snippet."
-
This helps, thanks Dana. Is the actual paragraph content the main source of a duplicate content penalty? For example, what if the pages share different metadata and the HTML is entirely different except for the H1 text and paragraph content?
-
Hi Zora,
This best way to do this is to grab a random section of text from the page and go to Google, then paste that section of text in the search bar inside "quotes." For example, from your question above, I could search:
"SEOMoz's crawler identifies duplicate content within your own site, which is great. How can I compare my site"
you will see that the result in Google is a result to this page (once it's been indexed, which hasn't happened quite yet) - Just be sure to put quotes around your snippet.
Hope that helps!
Dana
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Duplicate content with URLs
Hi all, Do you think that is possible to have duplicate content issues because we provide a unique image with 5 different URLs ? In the HTML code pages, just one URL is provide. It's enough for that Google don't see the other URLs or not ? Example, in this article : http://www.parismatch.com/People/Kim-Kardashian-sa-securite-n-a-pas-de-prix-1092112 The same image is available on: http://cdn-parismatch.ladmedia.fr/var/news/storage/images/paris-match/people/kim-kardashian-sa-securite-n-a-pas-de-prix-1092112/15629236-1-fre-FR/Kim-Kardashian-sa-securite-n-a-pas-de-prix.jpg http://resize-parismatch.ladmedia.fr/img/var/news/storage/images/paris-match/people/kim-kardashian-sa-securite-n-a-pas-de-prix-1092112/15629236-1-fre-FR/Kim-Kardashian-sa-securite-n-a-pas-de-prix.jpg http://resize1-parismatch.ladmedia.fr/img/var/news/storage/images/paris-match/people/kim-kardashian-sa-securite-n-a-pas-de-prix-1092112/15629236-1-fre-FR/Kim-Kardashian-sa-securite-n-a-pas-de-prix.jpg http://resize2-parismatch.ladmedia.fr/img/var/news/storage/images/paris-match/people/kim-kardashian-sa-securite-n-a-pas-de-prix-1092112/15629236-1-fre-FR/Kim-Kardashian-sa-securite-n-a-pas-de-prix.jpg http://resize3-parismatch.ladmedia.fr/img/var/news/storage/images/paris-match/people/kim-kardashian-sa-securite-n-a-pas-de-prix-1092112/15629236-1-fre-FR/Kim-Kardashian-sa-securite-n-a-pas-de-prix.jpg Thank you very much for your help. Julien
Intermediate & Advanced SEO | | Julien.Ferras0 -
Duplicate Content
Hi, So I have my great content (that contains a link to our site) that I want to distribute to high quality relevant sites in my niche as part of a link building campaign. Can I distribute this to lots of sites? The reason I ask is that those sites will then have duplicate content to all the other sites I distribute the content to won;t they? I this duplication bad for them and\or us? Thanks
Intermediate & Advanced SEO | | Studio330 -
Same content pages in different versions of Google - is it duplicate>
Here's my issue I have the same page twice for content but on different url for the country, for example: www.example.com/gb/page/ and www.example.com/us/page So one for USA and one for Great Britain. Or it could be a subdomain gb. or us. etc. Now is it duplicate content is US version indexes the page and UK indexes other page (same content different url), the UK search engine will only see the UK page and the US the us page, different urls but same content. Is this bad for the panda update? or does this get away with it? People suggest it is ok and good for localised search for an international website - im not so sure. Really appreciate advice.
Intermediate & Advanced SEO | | pauledwards0 -
Best-of-the-web content in steep competition, ecommerce site
Hello, I'm helping my client write a long, comprehensive, best-of-the-web piece of content. It's a boring ecommerce niche, but on the informational side the top 10 competitors for the most linked to topic are all big players with huge domain authority. There's not a lot of links in the industry, should I try to top all the big industries through better content (somehow), pictures, illustrations, slideshows with audio, and by being more thorough than these very good competitors? Or should I go for something that's less linked to (maybe 1/5 as much people linking to it) but easier? or both? We're on a short timeline of 3 and 1/2 months until we need traffic and our budget is not huge
Intermediate & Advanced SEO | | BobGW1 -
Issue with duplicate content in blog
I have blog where all the pages r get indexed, with rich content in it. But In blogs tag and category url are also get indexed. i have just added my blog in seomoz pro, and i have checked my Crawl Diagnostics Summary in that its showing me that some of your blog content are same. For Example: www.abcdef.com/watches/cool-watches-of-2012/ these url is already get indexed, but i have asigned some tag and catgeory fo these url also which have also get indexed with the same content. so how shall i stop search engines to do not crawl these tag and categories pages. if i have more no - follow tags in my blog does it gives negative impact to search engines, any alternate way to tell search engines to stop crawling these category and tag pages.
Intermediate & Advanced SEO | | sumit600 -
Duplicate Content / 301 redirect Ariticle issue
Hello, We've got some articles floating around on our site nlpca(dot)com like this article: http://www.nlpca.com/what-is-dynamic-spin-release.html that's is not linked to from anywhere else. The article exists how it's supposed to be here: http://www.dynamicspinrelease.com/what-is-dsr/ (our other website) Would it be safe in eyes of both google's algorithm (as much as you know) and with Panda to just 301 redirect from http://www.nlpca.com/what-is-dynamic-spin-release.html to http://www.dynamicspinrelease.com/what-is-dsr/ or would no-indexing be better? Thank you!
Intermediate & Advanced SEO | | BobGW0 -
Fixing Duplicate Content Errors
SEOMOZ Pro is showing some duplicate content errors and wondered the best way to fix them other than re-writing the content. Should I just remove the pages found or should I set up permanent re-directs through to the home page in case there is any link value or visitors on these duplicate pages? Thanks.
Intermediate & Advanced SEO | | benners0 -
Diagnosing duplicate content issues
We recently made some updates to our site, one of which involved launching a bunch of new pages. Shortly afterwards we saw a significant drop in organic traffic. Some of the new pages list similar content as previously existed on our site, but in different orders. So our question is, what's the best way to diagnose whether this was the cause of our ranking drop? My current thought is to block the new directories via robots.txt for a couple days and see if traffic improves. Is this a good approach? Any other suggestions?
Intermediate & Advanced SEO | | jamesti0