Sorting Dupe Content Pages
-
Hi,
I'm no Excel pro, and I'm having a bit of a challenge interpreting the Crawl Diagnostics export .csv file.
I'd like to see at a glance which of my pages (and I have many) are the worst offenders for dupe content – i.e. which have the most "Other URLs" associated with them.
Thanks, would appreciate any advice on how other people are using this data, and/or how Moz recommends doing it.
-
CMC is correct - that's how I do it for larger sites.
- delete all columns except the URL column (col A) and the duplicate pages column (now col B)
- in cell C2, enter this formula: =LEN(B2). It calculates the number of characters in the dupe pages cell
- drag that cell down to the last row
- select all three columns and sort by col C, largest to smallest
Obviously this isn't going to give you an exact number of dupe pages, since URL text strings can vary in length, but it does give you a pretty good idea of the worst offenders.
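If you'd rather not do this by hand in Excel, the same steps can be sketched in Python. This is only a rough sketch: the column names "URL" and "Duplicate Page Content" are assumptions, so check the header row of your own export and pass the real names in.

```python
import csv
import io


def rank_by_cell_length(csv_text, url_col="URL", dupes_col="Duplicate Page Content"):
    """Return (url, cell_length) pairs, longest duplicate-pages cell first.

    url_col and dupes_col are hypothetical header names; replace them
    with whatever your export actually uses.
    """
    rows = csv.DictReader(io.StringIO(csv_text))
    # Same idea as =LEN(B2): measure the characters in the dupe pages cell.
    ranked = [(row[url_col], len(row[dupes_col] or "")) for row in rows]
    # Same idea as sorting col C from largest to smallest.
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


if __name__ == "__main__":
    sample = (
        "URL,Duplicate Page Content\n"
        'http://a.example/,"http://a.example/x"\n'
        'http://b.example/,"http://b.example/x, http://b.example/y"\n'
    )
    for url, length in rank_by_cell_length(sample):
        print(length, url)
```

To run it on a real export, read the file into a string first (for example with open(path, newline="", encoding="utf-8").read()) and pass that in. Like the Excel version, this ranks by text length, so it is still only a rough ordering.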
-
I've found this a little frustrating, too. The display on the web will show the number of duplicate URLs, but the exported spreadsheet does not. It does, however, list all of the duplicate URLs in one cell -- so you could calculate the character length of that cell and then sort by that column, and that would give you a rough ranking.
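Since all of the duplicate URLs sit in one cell, you can also get closer to the real count by counting how many URLs start inside that cell instead of measuring its length. A minimal sketch in Python, assuming every listed URL begins with http:// or https:// (I haven't confirmed how the export separates them, so this deliberately doesn't rely on a separator):

```python
import re

# Assumption: each duplicate URL in the cell starts with http:// or https://.
URL_START = re.compile(r"https?://")


def count_urls(cell):
    """Count how many URLs appear in one duplicate-pages cell."""
    return len(URL_START.findall(cell or ""))


if __name__ == "__main__":
    cell = "http://a.example/x, https://a.example/y"
    print(count_urls(cell))
```

It will overcount if a URL carries another full URL inside its query string, so treat the result as a close estimate and spot-check a few rows against the numbers shown on the web.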