Best tools for identifying internal duplicate content
-
Hello again Mozzers! Other than the Moz tool, are there any other tools out there for identifying internal duplicate content? Thanks, Luke
-
-
Great article link! Thank you!
-
Thanks Jorge - Not sure how I'd survive without Screaming Frog - haven't gotten around to Xenu Linksleuth yet, but must give it a go sometime soon! Although I use copyscape to check for external duplication, hadn't realised I could use it to check for duplicate text within a website, so I'm v grateful for that pointer
Luke
-
Thanks James - good advice!
-
Huge thanks for the advice and that brilliant article Anthony :-)!
-
Luke
Apart from the tools mentioned above, I use copyscape premium to identify duplicate text (in the body of the page), I also find these tools very useful:
Xenu Linksleuth: very good for finding duplicate tags in your page's headers (title, description), and for many other tasks that require crawling your site. And the tool is free!
Screaming Frog: Another web crawler and very good tool for finding duplicate tags. It is a paid tool (about 77 GBP per year) but has a couple of features that Xenu does not have.
Cheers
Jorge
-
I use the Moz crawler to crawl my entire site and export it to an excel spreadsheet to navigate, its one of the first columns on your report
http://pro.moz.com/tools/crawl-test
Although i agree with Anthony and think its a very good idea to track any duplicate mentions from Google's perspective in webmaster tools
-
Duplicate content is going to be on your website. The key is to keep it out of Google's index. That is why using Google tools is very important. I find that using Google Webmaster Tools (duplicate page titles) and most importantly Google search as the best way to identify these problems.
This article is absolutely fantastic and there is a section titled "Tools for Finding & Diagnosing Duplicate Content" that explains exactly how to use Google and Google Webmaster Tools to find your duplicate content.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Questions about duplicate photo content?
I know that Google is a mystery, so I am not sure if there are answers to these questions, but I'm going to ask anyway! I recently realized that Google is not happy with duplicate photo content. I'm a photographer and have sold many photos in the past (but retained the rights for) that I am now using on my site. My recent revelations means that I'm now taking down all of these photos. So I've been reverse image searching all of my photos to see if I let anyone else use it first, and in the course of this I found out that there are many of my photos being used by other sites on the web. So my questions are: With photos that I used first and others have stolen, If I edit these photos (to add copyright info) and then re-upload them, will the sites that are using these images then get credit for using the original image first? If I have a photo on another one of my own sites and I take it down, can I safely use that photo on my main site, or will Google retain the knowledge that it's been used somewhere else first? If I sold a photo and it's being used on another site, can I safely use a different photo from the same series that is almost exactly the same? I am unclear what data from the photo Google is matching, and if they can tell the difference between photos that were taken a few seconds apart.
Intermediate & Advanced SEO | | Lina5000 -
Help with Best Content Posting Approach - WordPress site
I have a word document that i would like to add to my wordpress site as a page. The document has a large detailed flow chart of a complex legal process. (about 20+ boxes in the flow chart). I do not want to add it as an image because i want search engines to read/index the information in the flow chart. any suggestions to post this detailed flow chart on a WP page in the best SEO manner? Thanks.
Intermediate & Advanced SEO | | CamiloSC0 -
Duplicate Content: Organic vs Local SEO
Does Google treat them differently? I found something interesting just now and decided to post it up http://www.daviddischler.com/is-duplicate-content-treated-differently-when-local-seo-comes-into-play/
Intermediate & Advanced SEO | | daviddischler0 -
Artist Bios on Multiple Pages: Duplicate Content or not?
I am currently working on an eComm site for a company that sells art prints. On each print's page, there is a bio about the artist followed by a couple of paragraphs about the print. My concern is that some artists have hundreds of prints on this site, and the bio is reprinted on every page,which makes sense from a usability standpoint, but I am concerned that it will trigger a duplicate content penalty from Google. Some people are trying to convince me that Google won't penalize for this content, since the intent is not to game the SERPs. However, I'm not confident that this isn't being penalized already, or that it won't be in the near future. Because it is just a section of text that is duplicated, but the rest of the text on each page is original, I can't use the rel=canonical tag. I've thought about putting each artist bio into a graphic, but that is a huge undertaking, and not the most elegant solution. Could I put the bio on a separate page with only the artist's info and then place that data on each print page using an <iframe>and then put a noindex,nofollow in the robots.txt file?</p> <p>Is there a better solution? Is this effort even necessary?</p> <p>Thoughts?</p></iframe>
Intermediate & Advanced SEO | | sbaylor0 -
Duplicate content throughout multiple URLs dilemma
We have a website with lots of categories and there are problems that some subcategories have identical content on them. So, is it enough to just add different text on those problematic subcategories or we need to use "canonical" tag to main category. Same dilemma is with our search system and duplicate content. For example, "/category/sports" URL would have similar to identical content with "/search/sports" and "/search/sports-fitness/" URLs. Ranking factors is important for all different categories and subcategories. Ranking factors is also important for search individual keywords. So, the question is, how to make them somehow unique/different to rank on all those pages well? Would love to hear advices how it can be solved using different methods and how it would affect our rankings. When we actually need to use "canonical" tag and when 301 redirect is better. Thanks!
Intermediate & Advanced SEO | | versliukai0 -
Can videos be considered duplicate content?
I have a page that ranks 5 and to get a rich snippet I'm thinking of adding a relevant video to the page. Thing is, the video is already on another page which ranks for this keyword... but only at position 20. As it happens the page the video is on is the more important page for other keywords, so I won't remove it. Will having the same video on two pages be considered a duplicate?
Intermediate & Advanced SEO | | Brocberry0 -
How do I fix the error duplicate page content and duplicate page title?
On my site www.millsheating.co.uk I have the error message as per the question title. The conflict is coming from these two pages which are effectively the same page: www.millsheating.co.uk www.millsheating.co.uk/index I have added a htaccess file to the root folder as I thought (hoped) it would fix the problem but I doesn't appear to have done so. this is the content of the htaccess file: Options +FollowSymLinks RewriteEngine On RewriteCond %{HTTP_HOST} ^millsheating.co.uk RewriteRule (.*) http://www.millsheating.co.uk/$1 [R=301,L] RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html\ HTTP/ RewriteRule ^index\.html$ http://www.millsheating.co.uk/ [R=301,L] AddType x-mapp-php5 .php
Intermediate & Advanced SEO | | JasonHegarty0 -
Should I robots block site directories with primarily duplicate content?
Our site, CareerBliss.com, primarily offers unique content in the form of company reviews and exclusive salary information. As a means of driving revenue, we also have a lot of job listings in ouir /jobs/ directory, as well as educational resources (/career-tools/education/) in our. The bulk of this information are feeds, which exist on other websites (duplicate). Does it make sense to go ahead and robots block these portions of our site? My thinking is in doing so, it will help reallocate our site authority helping the /salary/ and /company-reviews/ pages rank higher, and this is where most of the people are finding our site via search anyways. ie. http://www.careerbliss.com/jobs/cisco-systems-jobs-812156/ http://www.careerbliss.com/jobs/jobs-near-you/?l=irvine%2c+ca&landing=true http://www.careerbliss.com/career-tools/education/education-teaching-category-5/
Intermediate & Advanced SEO | | CareerBliss0