Duplicate Content - Bulk analysis tool?
-
Hi
I wondered if there's a tool to analyse duplicate content, either within your own site or on external sites, where you can upload the URLs you want to check in bulk?
I used Copyscape a while ago, but I don't remember it having a bulk feature.
Thank you!
-
Great thank you!
I'll give both a go!
-
Great thanks
Yes, I use Screaming Frog for that, but this was to look at the actual page content. So yes, to see if other sites copy our content, but also to see whether we need to update our product content, as some products are very similar.
I'll check the batch process on Copyscape, thanks!
-
I have not used this tool in this way, but I have used it for other crawler projects related to content cleanup and it is rock solid. They have been very responsive to my questions about using the software. http://urlprofiler.com/
Duplicate content search is the next project on my list. Here is how they do it:
http://urlprofiler.com/blog/duplicate-content-checker/
You let URL Profiler crawl the section of your site that is most likely to be copied (say, your blog) and you tell it which section of your HTML to compare against (i.e. the content section rather than the header or footer). URL Profiler then uses proxies (you have to buy the proxies) to perform Google searches on sentences from your content. It crawls those results to see whether any site in the Google SERPs has sentences from your content word for word (or pretty close).
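To make the comparison step concrete, here is a rough sketch in Python. It is not URL Profiler's code and it does not run any Google searches or use proxies. It only assumes you already have the text of your own page and the text of a page you suspect, and it lists which of your sentences appear in the other page word for word. The function names and the `min_words` cutoff are made up for illustration.

```python
import re


def split_sentences(text, min_words=6):
    """Split text into sentences and keep only those long enough to be distinctive."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if len(p.split()) >= min_words]


def normalize(text):
    """Lowercase and reduce to plain words so punctuation and spacing do not block a match."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def copied_sentences(original, candidate, min_words=6):
    """Return the sentences from `original` that appear word for word in `candidate`."""
    # Pad with spaces so a match has to start and end on a word boundary.
    haystack = f" {normalize(candidate)} "
    return [
        sentence
        for sentence in split_sentences(original, min_words)
        if f" {normalize(sentence)} " in haystack
    ]
```

Short sentences are skipped because something like "Buy now." will match almost any shop page and tells you nothing about copying.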
I have played with Copyscape, but my markets are too niche for it to work for me. The logic behind URL Profiler's approach is that you are searching the database that matters most: Google.
Good luck!
-
I believe you might be able to use List Mode in Screaming Frog to accomplish this; however, it ultimately depends on what your goal is in checking for duplicate content. Do you simply want to find duplicate titles or duplicate descriptions? Or do you want to find pages with text similar enough to warrant concern?
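If it is the second case, one common way to score "sufficiently similar" text is to compare overlapping word sequences (shingles) between every pair of pages. The sketch below is only an illustration of that idea, not a feature of Screaming Frog or Copyscape. It assumes you have already exported the body text for each URL into a dict, and the names and the 0.5 threshold are arbitrary choices you would want to tune.

```python
import re
from itertools import combinations


def shingles(text, size=3):
    """Return the set of overlapping word sequences of length `size` in the text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity: shared shingles divided by all distinct shingles."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_pairs(pages, threshold=0.5, size=3):
    """Compare every pair of pages and return those scoring at or above the threshold.

    `pages` maps a URL to its extracted body text (a hypothetical input format).
    """
    sets = {url: shingles(text, size) for url, text in pages.items()}
    pairs = []
    for (url_a, set_a), (url_b, set_b) in combinations(sets.items(), 2):
        score = jaccard(set_a, set_b)
        if score >= threshold:
            pairs.append((url_a, url_b, round(score, 2)))
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)
```

Comparing every pair is fine for a few hundred product pages, but the number of comparisons grows quickly, so a much larger list would need a smarter approach.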
== Ooops! ==
It didn't occur to me that you were more interested in duplicate content caused by other sites copying your content rather than duplicate content among your list of URLs.
Copyscape does have a "Batch Process" tool but it is only available to paid subscribers. It does work quite nicely though.