Best tools for identifying internal duplicate content
-
Hello again Mozzers! Other than the Moz tool, are there any other tools out there for identifying internal duplicate content? Thanks, Luke
-
-
Great article link! Thank you!
-
Thanks Jorge - Not sure how I'd survive without Screaming Frog - haven't gotten around to Xenu Linksleuth yet, but must give it a go sometime soon! Although I use copyscape to check for external duplication, hadn't realised I could use it to check for duplicate text within a website, so I'm v grateful for that pointer Luke
-
Thanks James - good advice!
-
Huge thanks for the advice and that brilliant article Anthony :-)!
-
Luke
Apart from the tools mentioned above, I use copyscape premium to identify duplicate text (in the body of the page), I also find these tools very useful:
Xenu Linksleuth: very good for finding duplicate tags in your page's headers (title, description), and for many other tasks that require crawling your site. And the tool is free!
Screaming Frog: Another web crawler and very good tool for finding duplicate tags. It is a paid tool (about 77 GBP per year) but has a couple of features that Xenu does not have.
Cheers
Jorge
-
I use the Moz crawler to crawl my entire site and export it to an excel spreadsheet to navigate, its one of the first columns on your report
http://pro.moz.com/tools/crawl-test
Although i agree with Anthony and think its a very good idea to track any duplicate mentions from Google's perspective in webmaster tools
-
Duplicate content is going to be on your website. The key is to keep it out of Google's index. That is why using Google tools is very important. I find that using Google Webmaster Tools (duplicate page titles) and most importantly Google search as the best way to identify these problems.
This article is absolutely fantastic and there is a section titled "Tools for Finding & Diagnosing Duplicate Content" that explains exactly how to use Google and Google Webmaster Tools to find your duplicate content.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Product descriptions & Duplicate Content: between fears and reality
Hello everybody, I've been reading quite a lot recently about this topic and I would like to have your opinion about the following conclusion: ecommerce websites should have their own product descriptions if they can manage it (it will be beneficial for their SERPs rankings) but the ones who cannot won't be penalized by having the same product descriptions (or part of the same descriptions) IF it is only a "small" part of their content (user reviews, similar products, etc). What I mean is that among the signals that Google uses to guess which sites should be penalized or not, there is the ratio "quantity of duplicate content VS quantity of content in the page" : having 5-10 % of a page text corresponding to duplicate content might not be harmed while a page which has 50-75 % of a content page duplicated from an other site... what do you think? Can the "internal" duplicated content (for example 3 pages about the same product which is having 3 diferent colors -> 1 page per product color) be considered as "bad" as the "external" duplicated content (same product description on diferent sites) ? Thanks in advance for your opinions!
Intermediate & Advanced SEO | | Kuantokusta0 -
If it's not in Webmaster Tools, is it Duplicate Title
I am showing a lot of errors in my SEOmoz reports for duplicate content and duplicate titles, many of which appear to be related to capitalization vs non-capitalization in the URL. Case in point, if a URL contains a lower character, such as: http://www.gallerydirect.com/art/product/allyson-krowitz/distinct-microstructure-i as opposed to the same URL having an upper character in the structure: http://www.gallerydirect.com/art/product/allyson-krowitz/distinct-microstructure-I I am finding that some of the internal links on the site use the former structure and other links use the latter structure. These show as duplicate title/content in the SEOmoz reports, but they don't appear as duplicate titles in Webmaster Tools. My question is, should I try to work with our developers to create a script to change all of the content with cap letters in the destination links internally on the site, or is this a non-issue since it doesn't appear in Webmaster Tools?
Intermediate & Advanced SEO | | sbaylor0 -
Proper Hosting Setup to Avoid Subfolders & Duplicate Content
I've noticed with hosting multiple websites on a single account you end up having your main site in the root public_html folder, but when you create subfolders for new website it actually creates a duplicate website: eg. http://kohnmeat.com/ is being hosted on laubeau.com's server. So you end up with a duplicate website: http://laubeau.com/kohn/ Anyone know the best way to prevent this from happening? (i.e. canonical? 301? robots.txt?) Also, maybe a specific 'how-to' if you're feeling generous 🙂
Intermediate & Advanced SEO | | ATMOSMarketing560 -
How to Remove Joomla Canonical and Duplicate Page Content
I've attempted to follow advice from the Q&A section. Currently on the site www.cherrycreekspine.com, I've edited the .htaccess file to help with 301s - all pages redirect to www.cherrycreekspine.com. Secondly, I'd added the canonical statement in the header of the web pages. I have cut the Duplicate Page Content in half ... now I have a remaining 40 pages to fix up. This is my practice site to try and understand what SEOmoz can do for me. I've looked at some of your videos on Youtube ... I feel like I'm scrambling around to the Q&A and the internet to understand this product. I'm reading the beginners guide.... any other resources would be helpful.
Intermediate & Advanced SEO | | deskstudio0 -
Duplicate Content in News Section
Our clients site is in the hunting niche. According to webmaster tools there are over 32,000 indexed pages. In the new section that are 300-400 news posts where over the course of a about 5 years they manually copied relevant Press Releases from different state natural resources websites (ex. http://gfp.sd.gov/news/default.aspx). This content is relevant to the site visitors but it is not unique. We have since begun posting unique new posts but I am wondering if anything should be done with these old news posts that aren't unique? Should I use the rel="canonical tag or noindex tag for each of these pages? Or do you have another suggestion?
Intermediate & Advanced SEO | | rise10 -
Duplicate content, website authority and affiliates
We've got a dilemma at the moment with the content we supply to an affiliate. We currently supply the affiliate with our product database which includes everything about a product including the price, title, description and images. The affiliate then lists the products on their website and provides a Commission Junction link back to our ecommerce store which tracks any purchases with the affiliate getting a commission based on any sales via a cookie. This has been very successful for us in terms of sales but we've noticed a significant dip over the past year in ranking whilst the affiliate has achieved a peak...all eyes are pointing towards the Panda update. Whenever I type one of our 'uniquely written' product descriptions into Google, the affiliate website appears higher than ours suggesting Google has ranked them the authority. My question is, without writing unique content for the affiliate and changing the commission junction link. What would be the best option to be recognised as the authority of the content which we wrote in the first place? It always appears on our website first but Google seems to position the affiliate higher than us in the SERPS after a few weeks. The commission junction link is written like this: http://www.anrdoezrs.net/click-1428744-10475505?sid=shopp&url=http://www.outdoormegastore.co.uk/vango-calisto-600xl-tent.html
Intermediate & Advanced SEO | | gavinhoman0 -
Wich is the best way to manage dup content in a intenational Portal?
We have a portal wich is only in spain and we started to internazionalized it to Argentina, Mexico and Colombia. Before we had a .com domain with content only for spain and now that domain is going to be global. so.. .com contains all the content and you can filter for country .es contains spanish content .com.ar contanis argenitian content Every thing is ok but the problem is that there is a content (online courses) that is in every country. What we thougt to do is: -online contect url canonical to .com domain -Geo content url canonical to .es, .com.ar domain (depending on the geo) Filters besidese .com and .es can give similar resoults we do not use canonical url or we will follow the rule above (if there is geo in .com filter then canonical to geo domain and if the filter is (online courses) then canonical to .com domain) What do you think about that? Thank you in advance.
Intermediate & Advanced SEO | | ofuente0 -
Duplicate content ramifications for country TLDs
We have a .com site here in the US that is ranking well for targeted phrases. The client is expanding its sales force into India and South Africa. They want to duplicate the site entirely, twice. Once for each country. I'm not well-versed in international SEO. Will this cause a duplicate content filter? Would google.co.in and google.co.za look at google.com's index for duplication? Thanks. Long time lurker, first time question poster.
Intermediate & Advanced SEO | | Alter_Imaging0