Best tools for identifying internal duplicate content
-
Hello again Mozzers! Other than the Moz tool, are there any other tools out there for identifying internal duplicate content? Thanks, Luke
-
-
Great article link! Thank you!
-
Thanks Jorge - Not sure how I'd survive without Screaming Frog - haven't gotten around to Xenu Linksleuth yet, but must give it a go sometime soon! Although I use copyscape to check for external duplication, hadn't realised I could use it to check for duplicate text within a website, so I'm v grateful for that pointer Luke
-
Thanks James - good advice!
-
Huge thanks for the advice and that brilliant article Anthony :-)!
-
Luke
Apart from the tools mentioned above, I use copyscape premium to identify duplicate text (in the body of the page), I also find these tools very useful:
Xenu Linksleuth: very good for finding duplicate tags in your page's headers (title, description), and for many other tasks that require crawling your site. And the tool is free!
Screaming Frog: Another web crawler and very good tool for finding duplicate tags. It is a paid tool (about 77 GBP per year) but has a couple of features that Xenu does not have.
Cheers
Jorge
-
I use the Moz crawler to crawl my entire site and export it to an excel spreadsheet to navigate, its one of the first columns on your report
http://pro.moz.com/tools/crawl-test
Although i agree with Anthony and think its a very good idea to track any duplicate mentions from Google's perspective in webmaster tools
-
Duplicate content is going to be on your website. The key is to keep it out of Google's index. That is why using Google tools is very important. I find that using Google Webmaster Tools (duplicate page titles) and most importantly Google search as the best way to identify these problems.
This article is absolutely fantastic and there is a section titled "Tools for Finding & Diagnosing Duplicate Content" that explains exactly how to use Google and Google Webmaster Tools to find your duplicate content.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Mixing up languages on the same page + possible duplicate content
I have a site in English hosted under .com with English info, and then different versions of the site under subdirectories (/de/, /es/, etc.) Due to budget constraints we have only managed to translate the most important info of our product pages for the local domains. We feel however that displaying (on a clearly identified tab) the detailed product info in English may be of use for many users that can actually understand English, and may help us get more conversions to have that info. The problem is that this detailed product info is already used on the equivalent English page as well. This basically means 2 things: We are mixing languages on pages We have around 50% of duplicate content of these pages What do you think that the SEO implications of this are? By the way, proper Meta Titles and Meta Descriptions as well as implementation of href lang tag are in place.
Intermediate & Advanced SEO | | lauraseo0 -
Could this be seen as duplicate content in Google's eyes?
Hi I'm an in-house SEO and we've recently seen Panda related traffic loss along with some of our main keywords slipping down the SERPs. Looking for possible Panda related issues I was wondering if the following could be seen as duplicate content. We've got some very similar holidays (travel company) on our website. While they are different I'm concerned it may be seen as creating content that is too similar: http://www.naturalworldsafaris.com/destinations/africa-and-the-indian-ocean/kenya/suggested-holidays/the-wildlife-and-beaches-of-kenya.aspx http://www.naturalworldsafaris.com/destinations/africa-and-the-indian-ocean/kenya/suggested-holidays/ultimate-kenya-wildlife-and-beaches.aspx http://www.naturalworldsafaris.com/destinations/africa-and-the-indian-ocean/kenya/suggested-holidays/wildlife-and-beach-family-safari.aspx They do all have unique text but as you can see from the titles, they are very similar (note from an SEO point of view the tabbed content is all within the same page at source level). At the top level of the holiday pages we have a filtered search:
Intermediate & Advanced SEO | | KateWaite
http://www.naturalworldsafaris.com/destinations/africa-and-the-indian-ocean/kenya/suggested-holidays.aspx These pages have a unique introduction but the content snippets being pulled into the boxes is drawn from each of the individual holiday pages. I'm just concerned that these could be introducing some duplicating issues. Any thoughts?0 -
Duplicate content for hotel websites - the usual nightmare? is there any solution other than producing unique content?
Hiya Mozzers I often work for hotels. A common scenario is the hotel / resort has worked with their Property Management System to distribute their booking availability around the web... to third party booking sites - with the inventory goes duplicate page descriptions sent to these "partner" websites. I was just checking duplication on a room description - 20 loads of duplicate descriptions for that page alone - there are 200 rooms - so I'm probably looking at 4,000 loads of duplicate content that need rewriting to prevent duplicate content penalties, which will cost a huge amount of money. Is there any other solution? Perhaps ask booking sites to block relevant pages from search engines?
Intermediate & Advanced SEO | | McTaggart0 -
3rd Party hosted whitepapers — bad idea? Duplicate content?
It is common the B2B world to have 3rd parties host your whitepapers for added exposure. Is this a bad practice from an SEO point of view? Is the expectation that the 3rd parties use rel=canonical tags? I doubt most of them do . . .
Intermediate & Advanced SEO | | BlueLinkERP0 -
Best way to remove duplicate content with categories?
I have duplicate content for all of the products I sell on my website due to categories and subcategories. Ex: http://www.shopgearinc.com/products/product/stockfeeder-af38.php http://www.shopgearinc.com/products/co-matic-power-feeders/stockfeeder-af38.php http://www.shopgearinc.com/products/co-matic-power-feeders/heavy-duty-feeders/stockfeeder-af38.php Above are 3 urls to the same title and content. I use a third party developer backend system so doing canonicalization seems difficult as I don't have full access. What is the best to get rid of this duplicate content. Can I do it through webmaster tools or should I pay the developer to do the canonicalization or a 301 redirect? Any suggestions? Thanks
Intermediate & Advanced SEO | | kysizzle60 -
News sites & Duplicate content
Hi SEOMoz I would like to know, in your opinion and according to 'industry' best practice, how do you get around duplicate content on a news site if all news sites buy their "news" from a central place in the world? Let me give you some more insight to what I am talking about. My client has a website that is purely focuses on news. Local news in one of the African Countries to be specific. Now, what we noticed the past few months is that the site is not ranking to it's full potential. We investigated, checked our keyword research, our site structure, interlinking, site speed, code to html ratio you name it we checked it. What we did pic up when looking at duplicate content is that the site is flagged by Google as duplicated, BUT so is most of the news sites because they all get their content from the same place. News get sold by big companies in the US (no I'm not from the US so cant say specifically where it is from) and they usually have disclaimers with these content pieces that you can't change the headline and story significantly, so we do have quite a few journalists that rewrites the news stories, they try and keep it as close to the original as possible but they still change it to fit our targeted audience - where my second point comes in. Even though the content has been duplicated, our site is more relevant to what our users are searching for than the bigger news related websites in the world because we do hyper local everything. news, jobs, property etc. All we need to do is get off this duplicate content issue, in general we rewrite the content completely to be unique if a site has duplication problems, but on a media site, im a little bit lost. Because I haven't had something like this before. Would like to hear some thoughts on this. Thanks,
Intermediate & Advanced SEO | | 360eight-SEO
Chris Captivate0 -
Is this will post Duplicated Content
I have domain let say abcshoesonlinestore.com and inside pages of this abcshoesonlinestore.com is ranking very well such as affiliate page, knowledgebase page and other pages, HOWEVER i would like to change my home page and product page to shorter url which abcshoes.com and keep those inside page like www.abashoesonlinestore.com/affiliate or www.abcshoesonlinestore.com/knowledgebase as it is - will this pose duplicate content? This is my plan to do it: the home page and product page will be www.abcshoes.com and when people click www.abcshoes.com/affiliate it will redirect 301 to abcshoesonlinestore.com/affiliate HOWEVER if someone type abcshoesonlinestore.com or abcshoesonlinestore.com/product it will redirect to abcshoes.com or its product page itself (i want to use 302 instead 301 (ASSUMING if the homapage or product page have manual penalization or anything bad we want to leave it behind and start fresh JUST assume because i read some post that 301 will carry any bad thing to new site too) The reason i do not want to 301 from abcshoesonlinestore.com to abcshoes.com is because those many pages is ranking top 3 in GOOGLE ( i worry will lose this ranking since this bringing traffic for us) Is this good idea or bad idea or any better idea or should i try to see the outcome 🙂 - the only concern is from abcshoesonlinestore.com to abcshoes.com will pose as duplicate content if i do not use 301 - or can i use google webmaster tools to remove the home page and product page for abcshoesonlinestore.com can we tell google that? PS: (home page and product page will have new revise content and minor design change) but inside page will stay the same design Please give me some advise
Intermediate & Advanced SEO | | owen20110 -
Google consolidating link juice on duplicate content pages
I've observed some strange findings on a website I am diagnosing and it has led me to a possible theory that seems to fly in the face of a lot of thinking: My theory is:
Intermediate & Advanced SEO | | James77
When google see's several duplicate content pages on a website, and decides to just show one version of the page, it at the same time agrigates the link juice pointing to all the duplicate pages, and ranks the 1 duplicate content page it decides to show as if all the link juice pointing to the duplicate versions were pointing to the 1 version. EG
Link X -> Duplicate Page A
Link Y -> Duplicate Page B Google decides Duplicate Page A is the one that is most important and applies the following formula to decide its rank. Link X + Link Y (Minus some dampening factor) -> Page A I came up with the idea after I seem to have reverse engineered this - IE the website I was trying to sort out for a client had this duplicate content, issue, so we decided to put unique content on Page A and Page B (not just one page like this but many). Bizarrely after about a week, all the Page A's dropped in rankings - indicating a possibility that the old link consolidation, may have been re-correctly associated with the two pages, so now Page A would only be getting Link Value X. Has anyone got any test/analysis to support or refute this??0