Moz Crawler not Identifying all Duplicate Pages
-
On two recent site crawls for duplicate content (9/27/14 and 11/4/14), the Moz tool did not identify the following two pages, which are 100% duplicates of each other:
http://www.hooksandlattice.com/planter-hampton-241212.html ; Screenshot: http://screencast.com/t/DdwWroUU
http://www.hooksandlattice.com/planter-hampton-721212.html ; Screenshot: http://screencast.com/t/8Lb1cJZmGrhX
As I'm working feverishly to re-write and update the site (goal is ZERO duplicates) I'm finding it challenging to use the Moz tool to get the project done. Does anyone have any feedback or help they can provide for how I can identify all duplicate pages associated with my domain?
Thank you!
Lindsey Pfeiffer
-
Hi Lindsey
Our engineers have confirmed that rogerbot will flag pages that are 100% identical but can sometimes miss pages that are 99% similar. The crawler is deliberately written to err on the side of not reporting false positives, which means it can sometimes report false negatives, as has occurred in your case. Using a combination of tools, such as Google Webmaster Tools, can help isolate any pages we have missed.
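For a rough self-check of pages that are nearly but not exactly identical, one common technique is to compare overlapping word sequences ("shingles") with a Jaccard score. The sketch below is only an illustration and is not how rogerbot works; the function names and the shingle size `k` are assumptions, and you would need to supply the visible text of each page yourself.

```python
import re


def shingles(text, k=5):
    """Return the set of k-word shingles (as tuples) found in text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        # Text shorter than k words: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity(text_a, text_b, k=5):
    """Similarity between 0.0 and 1.0 based on shared k-word shingles."""
    return jaccard(shingles(text_a, k), shingles(text_b, k))
```

A score of 1.0 means the two texts share every shingle. Pages that differ only in a size or SKU will score a little below that, so a threshold under 1.0 (chosen by you) can surface pairs that an exact-match check would miss.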
Hope this helps!
-
Hey Lindsey!
I am not sure why our crawler did not flag those pages as they are 99% identical and are not sharing the same canonical URL. This is very strange and I'll send this up to our crawler engineer to obtain more insight.
Will let you know what I find out once I hear back!
-
Do you check Google Webmaster Tools? Under Search Appearance > HTML Improvements Google will list duplicate titles and descriptions among other things, which might be a help to you.
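If you already have the HTML of your pages saved locally, a similar duplicate-title check can be approximated offline. This is a minimal sketch, not what Google or Moz run; `pages` is a hypothetical dict mapping each URL to its HTML, and titles are compared case-insensitively after collapsing whitespace.

```python
from collections import defaultdict
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def extract_title(html):
    """Return the page title with runs of whitespace collapsed."""
    parser = TitleParser()
    parser.feed(html)
    return " ".join(parser.title.split())


def duplicate_titles(pages):
    """Group URLs by normalized title and keep only titles used more than once."""
    groups = defaultdict(list)
    for url, html in pages.items():
        groups[extract_title(html).lower()].append(url)
    return {title: sorted(urls) for title, urls in groups.items() if len(urls) > 1}
```

Matching titles are only a hint that pages may be duplicates, so each group still needs a manual look.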
Related Questions
-
How Can I Batch Upload URLs to get PA for many pages?
Howdy folks, I'm using advanced search operators to generate lists of long tail queries related to my niche and I'd like to take the batch of URLs I've gathered and upload them in a batch so I can see what the PA is for each URL. This would help me determine which long tail query is receiving the most love and links and help inform my content strategy moving forward. But I can't seem to find a way to do this. I went to check out the Moz API but it's a little confusing. It says there's a free version, but then it looks like it's actually not free, then I try to use it and it says I've gone over my limit even though I haven't used it yet. Anyone that can help me with this, I'd really appreciate it. If you're familiar with SEMRush, they have a batch analysis tool that works well, but I ideally want to upload these URLs to Moz because it's better for this kind of research. Thanks!
Moz Bar | | brettmandoes2 -
Moz Crawler Causing Server Timeouts... Crawling thousands of non-existent pages with query parameters
Moz crawler is crawling all pages like this:
http://www.xxxx.com/?product_count=100&product_order=desc&product_orderby=date
http://www.xxxx.com/?product_count=100&product_order=desc&paged=1
http://www.xxx.com/?product_count=100&product_order=desc&product_view=grid
Last month it crawled 80,000 pages on a site with fewer than 100 pages. Is there a way to select only certain pages to be crawled? It has been crawling this site since Monday morning, and it's now Tuesday midday. Every Monday it causes timeouts from high bandwidth use on our server. We're getting ready to delete this client from the account unless someone can give us a solution. Thanks.
Moz Bar | | adirondack0 -
The Page Optimization tool keeps asking for several changes that are already in place! How can I get it to recognize them?
Hi there...the Page Optimization tool shows a 71 score for one of my pages, but the most critical needs it noted have already been in there for some time. What's the deal with this? Thanks...
Moz Bar | | adirondack0 -
4 days waiting for a Moz Crawl - How quick are yours?
Hi there Please could anyone say how long they have been waiting for crawl results. I requested a crawl on a 20 page website and I have been waiting 4 days since last weekend. I checked Moz Health and there have been no related issues there: http://health.moz.com/ Your response would be welcome. Thanks
Moz Bar | | SEOguy10 -
Moz Page Analysis Country different to Who.is?
If I analyse a domain with Moz Page Analysis tool, it says that the domain is hosted in the United States but if look up the same domain on who.is, the hosting location is Italy?
Moz Bar | | Marketing_Today0 -
Duplicate Page and Title Issues
On the last crawl, we received errors for duplicate page titles and some duplicate content pages. Here is the issue: We went through our page titles that were marked as duplicate and changed them to make sure their titles were different. However, we just received a new crawl this week and it is saying there are even more duplicate page title errors detected than before. We're wondering if this is a problem with just us or if it has been happening to other Moz users. As for the duplicate content pages, what is the best way to approach this and see what content is being looked at as a "duplicate" set?
Moz Bar | | Essential-Pest0 -
On Page Grader not working on the fly
On Page Grader has this lovely little re-grade page button next to the URL and term field but it's never worked on the fly for me. What's going on here? You discover something critical, fix it and you want the satisfaction of re-grading to test it and get that A but it doesn't work. I can see the changes on the site so why not??
Moz Bar | | wearehappymedia0 -
Moz crawler
I have a site that is in non-production status. Crawlers are blocked via robots.txt:
User-agent: *
Disallow: /
I want to run a crawl test with the Moz crawler (rogerbot). How can I allow your crawler in while preventing other crawlers from indexing the site? Thanks, memok
Moz Bar | | Emanuele_Ricci0
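For the robots.txt question above, one possible approach is to add a separate group for rogerbot ahead of the catch-all block. The sketch below checks such a file with Python's standard `urllib.robotparser`. It assumes Moz's crawler matches the user-agent token `rogerbot`; `www.example.com` and the `can_crawl` helper are placeholders for illustration. Note that robots.txt governs crawling, so this is not a guarantee about indexing.

```python
from urllib.robotparser import RobotFileParser

# Assumed rules: an empty Disallow lets rogerbot in, everyone else is blocked.
ROBOTS_TXT = """\
User-agent: rogerbot
Disallow:

User-agent: *
Disallow: /
"""


def can_crawl(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Calling `can_crawl("rogerbot", "http://www.example.com/page.html")` and then the same URL with another crawler's name is a quick way to confirm the rules behave as intended before a crawl is requested.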