I want to create a report of only the duplicate content pages as a CSV file so I can create a script to canonicalize them.
-
I want to create a report of only the duplicate content pages as a CSV file so I can create a script to canonicalize them. I would like to get something like:
http://example.com/page1, http://example.com/page2, http://example.com/page3, http://example.com/page4,
Right now I have to open each page under "Issue: Duplicate Page Content", and this takes a lot of time.
The same goes for duplicate page titles.
-
Hi nvs.nim,
Could you tell me what you did differently? I also get an empty AF column.
-
Thanks! Because Excel didn't separate the fields correctly, I didn't have column AF. But I've got it now! Thanks a lot!
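If a spreadsheet program splits the fields incorrectly, one way to sidestep it is to find the column by its header name instead of by its letter. The following is only a sketch: it assumes the export is a standard comma-separated file with a header row containing `duplicate_page_content`, and the function names `excel_column_letter` and `find_column` are made up for illustration.

```python
import csv
import io


def excel_column_letter(index):
    """Convert a 0-based column index to a spreadsheet letter (0 -> A, 26 -> AA)."""
    letters = ""
    n = index + 1
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def find_column(csv_text, name="duplicate_page_content"):
    """Return (0-based index, spreadsheet letter) of the named header column.

    Assumes the first row of the CSV text is the header row.
    Raises ValueError if the header is not present.
    """
    header = next(csv.reader(io.StringIO(csv_text)))
    index = header.index(name)
    return index, excel_column_letter(index)
```

Because the `csv` module honours quoted fields, a cell that itself contains commas stays in one column, which is usually where a naive split goes wrong.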
-
Josh is right: when you export as CSV, there should be a column in the spreadsheet named duplicate_page_content.
This column contains all the URLs that are considered duplicates.
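To get from that export to one line per group of duplicates, a short script can read the column and write the groups back out. This is a rough sketch, not the official export format: the page URL header (`URL` here) and the comma separator inside the `duplicate_page_content` cell are assumptions, so check them against your own file. `duplicate_groups` and `write_report` are hypothetical helper names.

```python
import csv
import io


def duplicate_groups(csv_text, url_column="URL",
                     dup_column="duplicate_page_content", separator=","):
    """Return a sorted list of URL tuples, one tuple per set of duplicate pages.

    url_column and separator are assumptions about the export; adjust as needed.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    groups = set()
    for row in reader:
        cell = (row.get(dup_column) or "").strip()
        if not cell:
            continue  # this page has no reported duplicates
        urls = {row[url_column].strip()}
        urls.update(u.strip() for u in cell.split(separator) if u.strip())
        # Sorting makes the same group identical no matter which row listed it
        groups.add(tuple(sorted(urls)))
    return sorted(groups)


def write_report(groups, out_file):
    """Write one CSV line per duplicate group to an open text file object."""
    writer = csv.writer(out_file)
    for group in groups:
        writer.writerow(group)
```

To use it, read the exported file into a string, pass it to `duplicate_groups`, and hand the result to `write_report` together with a file opened with `newline=""`. Each output line then lists the pages that need to share a canonical URL.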
-
Yes, it does. In column AF there is a list of Duplicate Page Content URLs.
-
It doesn't tell me what other pages are identical. Only that there are identical pages.
-
Well, SEOMoz Pro does it! Just go to Crawl Diagnostics -> Duplicate Page Content, then go to the top right and click Export as CSV!