Best tools for identifying internal duplicate content
-
Hello again Mozzers! Other than the Moz tool, are there any other tools out there for identifying internal duplicate content? Thanks, Luke
-
-
Great article link! Thank you!
-
Thanks Jorge - Not sure how I'd survive without Screaming Frog - haven't gotten around to Xenu Linksleuth yet, but must give it a go sometime soon! Although I use copyscape to check for external duplication, hadn't realised I could use it to check for duplicate text within a website, so I'm v grateful for that pointer Luke
-
Thanks James - good advice!
-
Huge thanks for the advice and that brilliant article Anthony :-)!
-
Luke
Apart from the tools mentioned above, I use copyscape premium to identify duplicate text (in the body of the page), I also find these tools very useful:
Xenu Linksleuth: very good for finding duplicate tags in your page's headers (title, description), and for many other tasks that require crawling your site. And the tool is free!
Screaming Frog: Another web crawler and very good tool for finding duplicate tags. It is a paid tool (about 77 GBP per year) but has a couple of features that Xenu does not have.
Cheers
Jorge
-
I use the Moz crawler to crawl my entire site and export it to an excel spreadsheet to navigate, its one of the first columns on your report
http://pro.moz.com/tools/crawl-test
Although i agree with Anthony and think its a very good idea to track any duplicate mentions from Google's perspective in webmaster tools
-
Duplicate content is going to be on your website. The key is to keep it out of Google's index. That is why using Google tools is very important. I find that using Google Webmaster Tools (duplicate page titles) and most importantly Google search as the best way to identify these problems.
This article is absolutely fantastic and there is a section titled "Tools for Finding & Diagnosing Duplicate Content" that explains exactly how to use Google and Google Webmaster Tools to find your duplicate content.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
International SEO and duplicate content: what should I do when hreflangs are not enough?
Hi, A follow up question from another one I had a couple of months ago: It has been almost 2 months now that my hreflangs are in place. Google recognises them well and GSC is cleaned (no hreflang errors). Though I've seen some positive changes, I'm quite far from sorting that duplicate content issue completely and some entire sub-folders remain hidden from the SERP.
Intermediate & Advanced SEO | | GhillC
I believe it happens for two reasons: 1. Fully mirrored content - as per the link to my previous question above, some parts of the site I'm working on are 100% similar. Quite a "gravity issue" here as there is nothing I can do to fix the site architecture nor to get bespoke content in place. 2. Sub-folders "authority". I'm guessing that Google prefers sub-folders over others due to their legacy traffic/history. Meaning that even with hreflangs in place, the older sub-folder would rank over the right one because Google believes it provides better results to its users. Two questions from these reasons:
1. Is the latter correct? Am I guessing correctly re "sub-folders" authority (if such thing exists) or am I simply wrong? 2. Can I solve this using canonical tags?
Instead of trying to fix and "promote" hidden sub-folders, I'm thinking to actually reinforce the results I'm getting from stronger sub-folders.
I.e: if a user based in belgium is Googling something relating to my site, the site.com/fr/ subfolder shows up instead of the site.com/be/fr/ sub-sub-folder.
Or if someone is based in Belgium using Dutch, he would get site.com/nl/ results instead of the site.com/be/nl/ sub-sub-folder. Therefore, I could canonicalise /be/fr/ to /fr/ and do something similar for that second one. I'd prefer traffic coming to the right part of the site for tracking and analytic reasons. However, instead of trying to move mountain by changing Google's behaviour (if ever I could do this?), I'm thinking to encourage the current flow (also because it's not completely wrong as it brings traffic to pages featuring the correct language no matter what). That second question is the main reason why I'm looking out for MoZ's community advice: am I going to damage the site badly by using canonical tags that way? Thank you so much!
G0 -
Same content, different languages. Duplicate content issue? | international SEO
Hi, If the "content" is the same, but is written in different languages, will Google see the articles as duplicate content?
Intermediate & Advanced SEO | | chalet
If google won't see it as duplicate content. What is the profit of implementing the alternate lang tag?Kind regards,Jeroen0 -
Duplicate Page Content Issues Reported in Moz Crawl Report
Hi all, We have a lot of 'Duplicate Page Content' issues being reported on the Moz Crawl Report and I am trying to 'get to the bottom' of why they are deemed as errors... This page; http://www.bolsovercruiseclub.com/about-us/job-opportunities/ has (admittedly) very little content and is duplicated with; http://www.bolsovercruiseclub.com/cruise-deals/cruise-line-deals/explorer-of-the-seas-2015/ This page is basically an image and has just a couple of lines of static content. Also duplicated with; http://www.bolsovercruiseclub.com/cruise-lines/costa-cruises/costa-voyager/ This page relates to a single cruise ship and again has minimal content... Also duplicated with; http://www.bolsovercruiseclub.com/faq/packing/ This is an FAQ page again with only a few lines of content... Also duplicated with; http://www.bolsovercruiseclub.com/cruise-deals/cruise-line-deals/exclusive-canada-&-alaska-cruisetour/ Another page that just features an image and NO content... Also duplicated with; http://www.bolsovercruiseclub.com/cruise-deals/cruise-line-deals/free-upgrades-on-cunard-2014-&-2015/?page_number=6 A cruise deals page that has a little bit of static content and a lot of dynamic content (which I suspect isn't crawled) So my question is, is the duplicate content issued caused by the fact that each page has 'thin' or no content? If that is the case then I assume the simple fix is to increase add \ increase the content? I realise that I may have answered my own question but my brain is 'pickled' at the moment and so I guess I am just seeking assurances! 🙂 Thanks Andy
Intermediate & Advanced SEO | | TomKing0 -
Duplicate content across internation urls
We have a large site with 1,000+ pages of content to launch in the UK. Much of this content is already being used on a .nz url which is going to stay. Do you see this as an issue or do you thin Google will take localised factoring into consideration. We could add a link from the NZ pages to the UK. We cant noindex the pages as this is not an option. Thanks
Intermediate & Advanced SEO | | jazavide0 -
Copying my Facebook content to website considered duplicate content?
I write career advice on Facebook on a daily basis. On my homepage users can see the most recent 4-5 feeds (using FB social media plugin). I am thinking to create a page on my website where visitors can see all my previous FB feeds. Would this be considered duplicate content if I copy paste the info, but if I use a Facebook social media plugin then it is not considered duplicate content? I am working on increasing content on my website and feel incorporating FB feeds would make sense. thank you
Intermediate & Advanced SEO | | knielsen0 -
404 for duplicate content?
Sorry, I think this is my third question today... But I have a lot of duplicated content on my site. I use joomla so theres a lot of unintentional duplication. For example, www.mysite.com/index.php exists, etc. Up till now, I thought I had to 301 redirect or rel=canonical these "duplicated pages." However, can I just 404 it? Is there anything wrong with this rpactice in regards to SEO?
Intermediate & Advanced SEO | | waltergah0 -
How should i best structure my internal links?
I am new to SEO and looking to employ a logical but effective internal link strategy. Any easy ways to keep track of what page links to what page? I am a little confused regarding anchor text in as much as how I should use this. e.g. for a category page "Towels", I was going to link this to another page we want to build PA for such as "Bath Sheets". What should I put in for anchor text? keep it simple and just put "Bath Sheets" or make it more direct like "Buy Bath Sheets". Should I also vary anchor text if i have another 10 pages internally linking to this or keep it the same. Any advise would be really helpful. Thanks Craig
Intermediate & Advanced SEO | | Towelsrus0 -
Duplicate Content | eBay
My client is generating templates for his eBay template based on content he has on his eCommerce platform. I'm 100% sure this will cause duplicate content issues. My question is this.. and I'm not sure where eBay policy stands with this but adding the canonical tag to the template.. will this work if it's coming from a different page i.e. eBay? Update: I'm not finding any information regarding this on the eBay policy's: http://ocs.ebay.com/ws/eBayISAPI.dll?CustomerSupport&action=0&searchstring=canonical So it does look like I can have rel="canonical" tag in custom eBay templates but I'm concern this can be considered: "cheating" since rel="canonical is actually a 301 but as this says: http://googlewebmastercentral.blogspot.com/2009/12/handling-legitimate-cross-domain.html it's legitimately duplicate content. The question is now: should I add it or not? UPDATE seems eBay templates are embedded in a iframe but the snap shot on google actually shows the template. This makes me wonder how they are handling iframes now. looking at http://www.webmaster-toolkit.com/search-engine-simulator.shtml does shows the content inside the iframe. Interesting. Anyone else have feedback?
Intermediate & Advanced SEO | | joseph.chambers1