Finding Duplicate Content Spanning More Than One Site?
-
Hi forum, SEOMoz's crawler identifies duplicate content within your own site, which is great. How can I compare my site to another site to see if they share "duplicate content?" Thanks!
-
The Alert thing is great! I use it when we write new content (along with CopyScape after a week or so) just so I can make sure I'm outranking it. lol
-
Yes. I totally agree with Darin. There isn't a duplicate content penalty, per se, and the tools he listed are quite good suggestions as well.
-
IMHO, even if the HTML is different, you could have duplicate content if the H1 or paragraph text is substantially similar. However, is this automatically penalized? No. Syndication of content is quite prevalent on the Web. For example, the AP breaks a news story and posts it online, and it is subsequently picked up by the New York Times and Wall Street Journal. Wherever the content appeared first, particularly if it has a canonical tag in place, that source will be credited with having the original content. The other sites aren't going to be penalized, but they aren't going to benefit from it either.
Similar things happen on large e-commerce sites all the time. For example, hundreds of e-commerce stores sell lightbulbs. Those descriptions are most certainly "substantially similar." It'd be kind of strange if they weren't. They aren't penalized for that.
I hope this is helpful! It is always good to set up a Google Alert for any great pieces of content you do write, just so you can be aware of who might be copying your stuff! (Tynt.com can also be very useful for this).
Good luck!
Dana
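
To make the "HTML is different but the text is substantially similar" point concrete, here is a rough Python sketch that ignores markup and compares only the words on two pages. It is not how Google measures duplication; the three-word shingle size and the function names are my own assumptions for illustration.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and ignores tags and attributes.

    Note: this naive version also collects text inside <script> and
    <style>, so strip those first when working with real pages.
    """

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def visible_words(html):
    """Return the lowercased words found in the text nodes of an HTML string."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks).lower().split()


def shingles(words, size=3):
    """Return the set of overlapping word runs ("shingles") of the given size."""
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def text_overlap(html_a, html_b, size=3):
    """Jaccard similarity of the two pages' shingle sets, from 0.0 to 1.0."""
    a = shingles(visible_words(html_a), size)
    b = shingles(visible_words(html_b), size)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical product pages: different markup, identical copy.
page_a = "<h1>Energy saving light bulbs</h1><p>Long life LED bulb for home use</p>"
page_b = ('<div class="card"><h2>Energy saving light bulbs</h2>'
          "<span>Long life LED bulb for home use</span></div>")
print(text_overlap(page_a, page_b))
```

A score near 1.0 means the copy is essentially the same regardless of the HTML around it; where to draw the line for "substantially similar" is a judgment call.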
-
Just for the record, there isn't any "Duplicate Content Penalty," so don't worry too much about this. Duplicate content on a site is not grounds for action on that site unless it appears that the intent of the duplicate content is to be deceptive and manipulate search engine results.
However, to answer your question: I use Copyscape to do this, but you have to enter a URL rather than just a few lines of text at a time.
Here are some other ones I've heard good things about:
I agree with Dana on the Google thing too. Like she said, "Just be sure to put quotes around your snippet."
-
This helps, thanks Dana. Is the actual paragraph content the main source of a duplicate content penalty? For example, what if the pages share different metadata and the HTML is entirely different except for the H1 text and paragraph content?
-
Hi Zora,
The best way to do this is to grab a random section of text from the page, go to Google, and paste that section of text into the search bar inside "quotes." For example, from your question above, I could search:
"SEOMoz's crawler identifies duplicate content within your own site, which is great. How can I compare my site"
You will see that the result in Google is this page (once it's been indexed, which hasn't happened quite yet). Just be sure to put quotes around your snippet.
Hope that helps!
Dana
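
If you want to check a lot of snippets this way, a small Python sketch like the one below can build the quoted search URL for you. The function name and the 20-word cap are my own assumptions, not anything Google requires; the only thing it relies on is the usual `https://www.google.com/search?q=` query format.

```python
from urllib.parse import quote_plus


def exact_phrase_search_url(snippet, max_words=20):
    """Build a Google search URL that looks for the snippet as an exact phrase.

    max_words is an arbitrary cap (an assumption) to keep the query short.
    """
    words = snippet.split()[:max_words]
    # Drop inner double quotes so they don't end the exact-phrase match early.
    phrase = " ".join(words).replace('"', "")
    return "https://www.google.com/search?q=" + quote_plus(f'"{phrase}"')


snippet = ("SEOMoz's crawler identifies duplicate content within your own site, "
           "which is great. How can I compare my site")
print(exact_phrase_search_url(snippet))
```

Open the printed URL in a browser and see which pages come back for the exact phrase.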