Finding Duplicate Content Spanning More Than One Site?
-
Hi forum, SEOMoz's crawler identifies duplicate content within your own site, which is great. How can I compare my site to another site to see if they share "duplicate content?" Thanks!
-
The Alert thing is great! I use it when we write new content (along with CopyScape after a week or so) just so I can make sure I'm outranking it. lol
-
Yes. I totally agree with Darin. There isn't a duplicate content penalty, per se, and the tools he listed are quite good suggestions as well.
-
IMHO, even if the HTML is different you could have duplicate content if the H1 or paragraph text is substantially similar. However, is this automatically penalized? No. Syndication of content can be quite prevalent on the Web. For example, the AP breaks a news story and posts it online, and it is subsequently picked up by the New York Times and Wall Street Journal. Whichever source published the content first, particularly if it has a canonical tag in place, will be credited with having the original content. The other sites aren't going to be penalized, but they aren't going to benefit from it either.
Similar things happen on large e-commerce sites all the time. For example, hundreds of e-commerce stores sell lightbulbs. Those descriptions are most certainly "substantially similar." It'd be kind of strange if they weren't. They aren't penalized for that.
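If you want a rough number for how "substantially similar" two pages are once the markup is ignored, here is a minimal sketch in Python. It strips the tags, splits the visible text into overlapping 5-word runs (shingles), and reports the share of runs the two pages have in common. The function names and the shingle size of 5 are my own assumptions for illustration; this is not how Google or any Moz tool measures duplication, just a quick way to compare two pages you already have the HTML for.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def visible_words(html):
    """Return the lowercase words of the page text, ignoring all markup."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return re.findall(r"[a-z0-9']+", " ".join(parser.parts).lower())


def shingles(words, size=5):
    """Return the set of overlapping word runs of the given size."""
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def text_overlap(html_a, html_b, size=5):
    """Jaccard similarity of the two pages' shingle sets, from 0.0 to 1.0."""
    a = shingles(visible_words(html_a), size)
    b = shingles(visible_words(html_b), size)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)
```

Two pages with entirely different HTML but the same H1 and paragraph text score 1.0 here, which matches the point above: the markup does not matter, the text does.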
I hope this is helpful! It is always good to set up a Google Alert for any great pieces of content you do write, just so you can be aware of who might be copying your stuff! (Tynt.com can also be very useful for this).
Good luck!
Dana
-
Just for the record, there isn't any "Duplicate Content Penalty," so don't worry too much about this. Duplicate content on a site is not grounds for action on that site unless it appears that the intent of the duplicate content is to be deceptive and manipulate search engine results.
However, to answer your question, I use Copyscape to do this, but you have to enter a URL rather than pasting in lines of text.
Here are some other ones I've heard good things about:
I agree with Dana on the Google thing too. Like she said, "Just be sure to put quotes around your snippet."
-
This helps, thanks Dana. Is the actual paragraph content the main source of a duplicate content penalty? For example, what if the pages share different metadata and the HTML is entirely different except for the H1 text and paragraph content?
-
Hi Zora,
The best way to do this is to grab a random section of text from the page, go to Google, and paste that section of text into the search bar inside "quotes." For example, from your question above, I could search:
"SEOMoz's crawler identifies duplicate content within your own site, which is great. How can I compare my site"
You will see that the result in Google is this page (once it's been indexed, which hasn't happened quite yet). Just be sure to put quotes around your snippet.
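If you have a lot of snippets to check, the quoted search can be scripted. Here is a minimal sketch in Python that only builds the search link for you to open in a browser. It assumes the usual `https://www.google.com/search?q=` URL form, and the function name `exact_match_search_url` is made up for illustration.

```python
from urllib.parse import quote_plus


def exact_match_search_url(snippet):
    """Build a Google search URL for an exact-phrase (quoted) search."""
    # Collapse whitespace so line breaks copied from the page don't split the phrase.
    phrase = " ".join(snippet.split())
    # Drop double quotes inside the snippet; they would end the phrase early.
    phrase = phrase.replace('"', "")
    # Wrap the phrase in quotes, then URL-encode it for the query string.
    return "https://www.google.com/search?q=" + quote_plus('"' + phrase + '"')


if __name__ == "__main__":
    print(exact_match_search_url(
        "SEOMoz's crawler identifies duplicate content within your own site"
    ))
```

Keeping the snippet to a sentence or so, as in the example search above, is a reasonable default, since very long phrases may be cut off by the search engine.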
Hope that helps!
Dana
Related Questions
-
New site migration (multiple sites into one + new domain)
Hi, I have read so many very helpful guides and experiences from you guys that will greatly help me, but I have a few questions please. Our company has 3 sites, the main site and 2 sites for different product ranges: BrandProductName.com (main site - DA = 22, ranking well for product name), Productname2.com (DA = 10, ranking very well for product name and little competition), BrandProductName3.com (DA = 10, poor ranking). We wish to bring all the sites into one, with categories for the 3 different products. The main site is an e-commerce site whereas the other 2 are not (currently). On top of this, as the main domain has one of the product names in it, they wish to change the domain to be just Brandname.com. So the plan is to combine sites 2 and 3 into site 1 and change that domain name. As you can imagine, this is going to be quite a job. I am fairly happy with the steps required (having read all the guides and migrated many sites in the past), but with the added domain name change this is a little daunting. So my questions are: Should I merge the 3 sites into 1 and then change the domain at a later point? Should I change the domain of the main site first and then merge sites 2 and 3 in later? Should I just do it all together? Or, based on the data I have provided, do you disagree with the plan, and what would you recommend? We are not in a massive rush to complete all of this, so we have the time to plan and execute this when we are fully ready. Any help / advice would be greatly appreciated. Thanks all
Intermediate & Advanced SEO | | csimmo0 -
Duplicate Content with URL Parameters
Moz is picking up a large quantity of duplicate content, which consists mainly of URLs with parameters like ,pricehigh & ,pricelow etc. (for page sorting). Google has indexed a large number of the pages (not sure how many), and I'm not sure how many of them are ranking for search terms we need. I have added the parameters into Google Webmaster Tools and set them to 'Let Google decide'; however, Google still sees it as duplicate content. Is it a problem that we need to address? Or could it do more harm than good in trying to fix it? Has anyone had any experience? Thanks
Intermediate & Advanced SEO | | seoman100 -
Trying to advise on what seems to be a duplicate content penalty
So a friend of a friend was referred to me a few weeks ago as his Google traffic fell off a cliff. I told him I'd take a look at it and see what I could find, and here's the situation I encountered. I'm a bit stumped at this point, so I figured I'd toss this out to the Moz crowd and see if anyone sees something I'm missing. The site in question is www.finishlinewheels.com. In mid-June, according to the site's Webmaster Tools, impressions went from around 20,000 per day down to 1,000. Interestingly, some of their major historic keywords like "stock rims" had basically disappeared while some secondary keywords hadn't budged. The owner submitted a reconsideration request and was told he hadn't received a manual penalty. I figured it was the result of either an automated filter/penalty from bad links, a horribly slow server, or possibly a duplicate content issue. I ran the backlinks on OSE and Majestic and pulled the links from Webmaster Tools. While there aren't a lot of spectacular links, there also doesn't seem to be anything that stands out as terribly dangerous. Lots of links from automotive forums and the like - low authority and such, but in the grand scheme of things their links seem relevant and reasonable. I checked the site's speed in analytics and WMT as well as some external tools and everything checked out as plenty fast enough. So that wasn't the issue either. I tossed the home page into Copyscape and I found the site brandwheelsandtires.com - which had completely ripped the site - it was thousands of the same pages with every element copied, including the phone number and contact info. Furthering my suspicions, the Internet Archive shows its first appearance was mid-May, shortly before his site took the nose dive (still visible at http://web.archive.org/web/20130517041513/http://brandwheelsandtires.com). THIS, I figured, was the problem.
Particularly when I started doing exact-match searches for text on the finishlinewheels.com home page like "welcome to finish line wheels" and it was nowhere to be found. I figured the site had to be sandboxed. I contacted the owner and asked if this was his and he said it wasn't. So I gave him the contact info and he contacted the site owner and told them it had to come down, and the owner apparently complied because it was gone the next day. He also filed a DMCA complaint with Google and they responded after the site was gone and said they didn't see the site in question (seriously, the guys at Google don't know how to look at their own cache?). I then had the site owner send them a list of cached URLs for this site and since then Google has said nothing. I figure at this point it's just a matter of Google running its course. I suggested he revise the home page content and build some new quality links, but I'm still a little stumped as to how/why this happened. If it was seen as duplicate content, how did this site with no links and zero authority manage to knock out a site that had been around for 7 years and ranked well for hundreds of terms? I get that it doesn't have a ton of authority, but this other site had none. I'm doing this pro bono at this point, but I feel bad for this guy as he's losing a lot of money at the moment, so any other eyeballs that see something that I don't would be very welcome. Thanks Mozzers!
Intermediate & Advanced SEO | | NetvantageMarketing2 -
How do I geo-target continents & avoid duplicate content?
Hi everyone, We have a website which will have content tailored for a few locations:
USA: www.site.com
Europe EN: www.site.com/eu
Canada FR: www.site.com/fr-ca
Link hreflang and the GWT option are designed for countries. I expect a fair amount of duplicate content; the only differences will be in product selection and prices. What are my options to tell Google that it should serve www.site.com/eu in Europe instead of www.site.com? We are not targeting a particular country on that continent. Thanks!
Intermediate & Advanced SEO | | AxialDev0 -
Best-of-the-web content in steep competition, ecommerce site
Hello, I'm helping my client write a long, comprehensive, best-of-the-web piece of content. It's a boring ecommerce niche, but on the informational side the top 10 competitors for the most linked-to topic are all big players with huge domain authority. There aren't a lot of links in the industry. Should I try to top all the big industries through better content (somehow), pictures, illustrations, slideshows with audio, and by being more thorough than these very good competitors? Or should I go for something that's less linked to (maybe 1/5 as many people linking to it) but easier? Or both? We're on a short timeline of 3 and 1/2 months until we need traffic, and our budget is not huge.
Intermediate & Advanced SEO | | BobGW1 -
Wordpress Duplicate Content
We have recently moved our company's blog to WordPress on a subdomain (we utilize the Yoast SEO plugin). We are now experiencing an ever-growing volume of crawl errors (nearly 300 4xx now) for pages that do not exist to begin with. I believe it may have something to do with having the blog on a subdomain and/or our Yoast SEO plugin's indexation archives (author, category, etc.) --- we currently have subpages of archives and taxonomies, and category archives in use. I'm not as familiar with WordPress and the Yoast SEO plugin as I am with other CMSs, so any help in this matter would be greatly appreciated. I can PM further info if necessary. Thank you for the help in advance.
Intermediate & Advanced SEO | | BethA0 -
Will this pose Duplicate Content?
I have a domain, let's say abcshoesonlinestore.com, and its inside pages are ranking very well, such as the affiliate page, knowledgebase page and other pages. HOWEVER, I would like to change my home page and product page to a shorter URL, abcshoes.com, and keep the inside pages like www.abashoesonlinestore.com/affiliate or www.abcshoesonlinestore.com/knowledgebase as they are - will this pose duplicate content? This is my plan: the home page and product page will be on www.abcshoes.com, and when people click www.abcshoes.com/affiliate it will 301 redirect to abcshoesonlinestore.com/affiliate. HOWEVER, if someone types abcshoesonlinestore.com or abcshoesonlinestore.com/product, it will redirect to abcshoes.com or its product page. I want to use a 302 instead of a 301 here (ASSUMING the home page or product page has a manual penalty or anything bad, we want to leave it behind and start fresh; this is JUST an assumption, because I read some posts saying a 301 will carry anything bad over to the new site too). The reason I do not want to 301 all of abcshoesonlinestore.com to abcshoes.com is that many of those pages rank in the top 3 in GOOGLE (I worry we will lose this ranking, since it brings traffic for us). Is this a good idea, a bad idea, is there a better idea, or should I just try it and see the outcome 🙂 - my only concern is that abcshoesonlinestore.com and abcshoes.com will pose as duplicate content if I do not use a 301 - or can I use Google Webmaster Tools to remove the home page and product page for abcshoesonlinestore.com; can we tell Google that? PS: the home page and product page will have newly revised content and minor design changes, but the inside pages will keep the same design. Please give me some advice.
Intermediate & Advanced SEO | | owen20110 -
Duplicate Content
Hi everyone, I have a TLD in the UK with a .co.uk and also the same site in Ireland (.ie). The only differences are the prices and different banners maybe. The .ie site pulls all of the content from the .co.uk domain. Is this classed as content duplication? I've had problems in the past in which Google struggles to index the website. At the moment the site appears completely fine in the UK SERPs but for Ireland I just have the Title and domain appearing in the SERPs, with no extended title or description because of the confusion I caused Google last time. Does anybody know a fix for this? Thanks
Intermediate & Advanced SEO | | royb0