Duplicate page report
-
We ran a CSV spreadsheet of our crawl diagnostics related to duplicate URLS' after waiting 5 days with no response to how Rogerbot can be made to filter.
My IT lead tells me he thinks the label on the spreadsheet is showing “duplicate URLs”, and that is – literally – what the spreadsheet is showing.
It thinks that a database ID number is the only valid part of a URL. To replicate: Just filter the spreadsheet for any number that you see on the page. For example, filtering for 1793 gives us the following result:
|
URL
http://truthbook.com/faq/dsp_viewFAQ.cfm?faqID=1793
http://truthbook.com/index.cfm?linkID=1793
http://truthbook.com/index.cfm?linkID=1793&pf=true
http://www.truthbook.com/blogs/dsp_viewBlogEntry.cfm?blogentryID=1793
http://www.truthbook.com/index.cfm?linkID=1793
|
There are a couple of problems with the above:
1. It gives the www result, as well as the non-www result.
2. It is seeing the print version as a duplicate (&pf=true) but these are blocked from Google via the noindex header tag.
3. It thinks that different sections of the website with the same ID number the same thing (faq / blogs / pages)
In short: this particular report tell us nothing at all.
I am trying to get a perspective from someone at SEOMoz to determine if he is reading the result correctly or there is something he is missing?
Please help. Jim
-
Hi Jim!
Thanks for the question. One thing we should clarify before we move forward is that the Pro app doesn't actually report on duplicate URLs, but we do report when we find duplicate title tags or content.
Duplicate titles just refer to when we find the same title tag on more than one page. In one example from your diagnostics, we're reporting the title tag 'Truthbook Religious News' is being used in multiple pages (http://screencast.com/t/GYCKNfAoj).
Duplicate content is content we see on the source code of your pages that is identical or nearly identical and would cause the pages to compete against each other for rankings. To fix either of these you have a several options:
- Set up a 301 redirect to have the pages you would consider duplicate redirect to the main page.
- Change the content/title tags enough that they won't be considered duplicates - Canonicalize the content you would consider duplicates.
Most developers will go for the latter two options so that the pages will still be reachable by visitors. You can find out more about how to implement these in our Help Hub.
To answer your other questions:
1 - At the time of the crawl, we were able to get to sub domain pages from other pages on your site. The sub domains were also resolving separately, but they seem to be redirecting to your root domain now, so your next crawl should reflect this.
2 - Running a curl for the print versions of your pages, I see "no follow" tags related to Wikipedia links embedded (http://screencast.com/t/reYjeLLPvWG3) in the doc, but I'm not finding any "no index tags" (http://screencast.com/t/DsXMZInngSzH). This would be why you're seeing us crawling those pages.
3 - As I mentioned above, our crawler looks for similarities in the source code of pages when reporting on duplicate content. Since no one knows exactly how similar content would need to be for the search engines to consider it a duplicate, we err on the side of caution and recommended best practices when reporting them. Using one of the methods mentioned above and detailed in our Help Hub should resolve this for you
Let me know if you have any other questions!
Best,
Sam
Moz Helpster - Set up a 301 redirect to have the pages you would consider duplicate redirect to the main page.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
'Duplicate Page Content' for dissimilar pages
I'm using Moz's Crawl Diagnostics to try and clean up some SEO priorities for our website (http://www.craftcompany.co.uk) HOWEVER, virtually all of the pages that are being categorised as duplicate content are not the same, or indeed similar. For instance, these three pages have been deemed duplicated pages; http://www.craftcompany.co.uk/pme-rose-leaf-veined-plunger.html http://www.craftcompany.co.uk/double-faced-satin-ribbon-black-25mm-wide.html http://www.craftcompany.co.uk/double-faced-satin-maroon-10mm-wide-25mt.html Can anyone give me an insight into why this is? Many Thanks! http://www.craftcompany.co.uk/
Moz Pro | | The_Craft_Company0 -
Moz crawl duplicate pages issues
Hi According to the moz crawl on my website I have in the region of 800 pages which are considered internal duplicates. I'm a little puzzled by this, even more so as some of the pages it lists as being duplicate of another are not. For example, the moz crawler considers page B to be a duplicate of page A in the urls below: Not sure on the live link policy so ive put a space in the urls to 'unlive' them. Page A http:// nuchic.co.uk/index.php/jeans/straight-jeans.html?manufacturer=3751 Page B http:// nuchic.co.uk/index.php/catalog/category/view/s/accessories/id/92/?cat=97&manufacturer=3603 One is a filter page for Curvety Jeans and the other a filter page for Charles Clinkard Accessories. The page titles are different, the page content is different so Ive no idea why these would be considered duplicate. Thin maybe, but not duplicate. Like wise, pages B and C are considered a duplicate of page A in the following Page A http:// nuchic.co.uk/index.php/bags.html?dir=desc&manufacturer=4050&order=price Page B http:// nuchic.co.uk/index.php/catalog/category/view/s/purses/id/98/?manufacturer=4001 Page C http:// nuchic.co.uk/index.php/coats/waistcoats.html?manufacturer=4053 Again, these are product filter pages which the crawler would have found using the site filtering system, but, again, I cannot find what makes pages B and C a duplicate of A. Page A is a filtered result for Great Plains Bags (filtered from the general bags collection). Page B is the filtered results for Chic Look Purses from the Purses section and Page C is the filtered results for Apricot Waistcoats from the Waistcoat section. I'm keen to fix the duplicate content errors on the site before it goes properly live at the end of this month - that's why anyone kind enough to check the links will see a few design issues with the site - however in order to fix the problem I first need to work out what it is and I can't in this case. Can anyone else see how these pages could be considered a duplicate of each other please? Checking ive not gone mad!! Thanks, Carl
Moz Pro | | daedriccarl0 -
Pages with Temporary Redirect (CTA)
I had MOZ crawl my site and I had 5 CTA pages with a temporary redirect. How do I correct the issue? Thank You! -Nick
Moz Pro | | X2Metrology10 -
Duplicate content analysis
Good morning everyone, I have just run a test from SEOmoz PRO tool and I got more than 2,000 double content errors. How might I see which are the 2 pages whose content is double? My website is just newly revamped and can't find these similarities on my own: for me there are not). Thanks for your help! Francesca
Moz Pro | | astojanov0 -
Plurals and the SEOmoz On Page Report Card
I have some pages on my website that receive an "A" grade with the SEOmoz On Page Report Card if I use a plural term in a phrase (say, "landscape designers in Dallas") but an "F" grade if I use the singular term (say, "landscape designer in Dallas"). Is this a true and accurate reflection of how search engines see the page? Am I going to rank well for plural terms but not for singular, given the specific on-page content I'm using in this example? Or is this an opportunity for SEOmoz to tweak its' algorithm to better reflect the search engines?
Moz Pro | | ahirai1 -
How to read Crawler downloaded report
I am trying to seperate the duplicate title and description URLs, by looking at the report i am not getting how to find all urls which contain same title and description. Is there any video link on the site which walk me through each part of the report. Thanks, Punam
Moz Pro | | nonlinearcreations0 -
SEOMoz Crawling Only 1 Page
I entered a new site into my dashboard 2 days ago - everything looked kosher, there were a few hundred pages crawled and a whole bunch of errors. I came back this morning to start work on the site and SEOMoz has crawled the site again, this time returning only 1 page and 0 errors. I haven't even logged in to the site since the first crawl, so I couldn't have broken anything. Has anyone seen this before?
Moz Pro | | Junction0 -
Page and Domain Authority and other bits
Hi, I am in the process of finding blogs to have a few articles published with a couple of links in each. Articles will all be unique and relevant to the link I drop in and relevant in someway to the reader However I have a few questions. My site is a designer menswear site, so I have picked fashion and sports sites first and foremost to have the articles published. Now, I have a guy who owns about 30 different websites. 2 of them are sports based and about 10 are fashion based. Around $10-$15 an article. I have ran them all through the Open Site Explorer Tool and picked out the best ranked ones. Now my problem is, how do I know if its a good site to not only list an article on, but to pay for it as well. The sites page ranks are around the 30-45 range, the domains are around the 35-45 range. What is a good range to have? I know the higher the better but is 30-45 good enough to pay for? (I don't mind paying the $10 each (£7 my money) for each one) Also as he is quoted me in dollars, I assume there all USA based, so majority of users are USA based. Well I am UK based and only ship to the UK. Will this matter as much if I am trying to gain backlinks? Obviously a UK based site, would be ideal, but is it a case of getting more external links on the web for Google to find, as long as they are relevant to the user? Any help would be great. Thanks Will
Moz Pro | | WillBlackburn0