Moz Crawler not Identifying all Duplicate Pages
-
On two recent site crawls (9/27/14 and 11/4/14) for duplicate content the Moz tool did not ID the following 2 pages, which are 100% duplicate to each other:
http://www.hooksandlattice.com/planter-hampton-241212.html ; Screenshot: http://screencast.com/t/DdwWroUU
http://www.hooksandlattice.com/planter-hampton-721212.html ; Screenshot: http://screencast.com/t/8Lb1cJZmGrhX
As I'm working feverishly to re-write and update the site (goal is ZERO duplicates) I'm finding it challenging to use the Moz tool to get the project done. Does anyone have any feedback or help they can provide for how I can identify all duplicate pages associated with my domain?
Thank you!
Lindsey Pfeiffer
-
Hi Lindsey
Our engineers have confirmed that rogerbot will flag pages that are 100% identical but can sometimes miss pages that are 99% similar. The crawler is deliberately written to err on the side of not reporting false positives which means it sometimes can report false negatives which has occurred in your case. Using a combination of tools such as Webmaster tools can help isolate any pages we have missed.
Hope this helps!
-
Hey Lindsey!
I am not sure why our crawler did not flag those pages as they are 99% identical and are not sharing the same canonical URL. This is very strange and I'll send this up to our crawler engineer to obtain more insight.
Will let you know what I find out once I hear back!
-
Do you check Google Webmaster Tools? Under Search Appearance > HTML Improvements Google will list duplicate titles and descriptions among other things, which might be a help to you.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How To Stop Moz Crawl From Prepending /blog/ on all our site urls that it crawls
Hello, At some time in the past our WP site had urls like this: www.oursite.com/blog/post-title-pretty-link The site has not used that url structure for quite some time, but Moz crawl is still hitting every post with /blog/prepended and as a result is generating thousands of 404s. When the /blog/ is removed from the url, then the urls work fine. Where are those old urls being stored and how can we update them? How do we address this issue? Any assistance will be appreciated. Thanks!
Moz Bar | | dbcooper1 -
804 : HTTPS (SSL) error encountered when requesting page.
Hi there, I am attempting a crawl test and am getting "804 : HTTPS (SSL) error encountered when requesting page." with no further information to debug the issue. Site seems to be fine as far as SSL configuration goes. How can we get more debugging information? Thanks,
Moz Bar | | RickEH
Rick0 -
Duplicate Page and Title Issues
On the last crawl, we received errors for duplicate page titles and some duplicate content pages. Here is the issue: We went through our page titles that were marked as duplicate and changed them to make sure their titles were different. However, we just received a new crawl this week and it is saying there are even more duplicate page title errors detected than before. We're wondering if this is a problem with just us or if it has been happening to other Moz users. As for the duplicate content pages, what is the best way to approach this and see what content is being looked at as a "duplicate" set?
Moz Bar | | Essential-Pest0 -
Why does the moz crawl test lists page twice?
Hi, I'm running into an issue where some crawlers list my pages twice, once with a trailing slash, once without. I first saw it on a few pages with screaming frog, then saw it happen on all my pages with the moz crawler. The site is www.kidsandart.org and its on Squarespace. I grepped the sitemap.xml I submitted to google webmaster and got 167 distinct pages, all of them without a trailing slash. Any insights on why this is happening, and how to regard moz crawler results would be appreciated. thanks Tom
Moz Bar | | tpushpathadam0 -
MOZ crawl test is not reporting on all the pages on my site.
I've run the crawl test one of the sites I've taken over SEO for, however its only picking all the pages. For instance it indexes all the pages under xxxxx/us but none under xxxxx/au or xxxxx/uk The pages are being indexed as they're ranking in Google. Thanks.
Moz Bar | | ahyde0 -
Chrome moz toolbar page analysis not loading
Often the chrome moz toolbar page analysis doesn't load just says Please wait until the page finishes loading.
Moz Bar | | genkee0 -
On Page Grader can't access my URLs
HI- I am trying to grade some specific pages for keywords with the on page grader but it keeps telling me "Sorry, but that URL is inaccessible. " I can reach them via the browser and they are not https. Any thoughts? Here is a sample: www.bulkcandystore.com/kosher-candy Any help is appreciated. Ken
Moz Bar | | CandymanKen0 -
Can Moz use canconical links to prevent notices about duplicate content issues?
if so how do we enable this - we've an average size site with a few hundred products but they appear in multiple categories, canonical url points to it's primary category (but a new page exists for each section... so for /cat-a/abc there will be another page cat-b/abc and again but the canonical points to cat-a always for that product) basically I see this kind of duplication error / notice as a false positive... help me
Moz Bar | | SEOAndy0