Changing the way SEOmoz Detects Duplicate Content
-
Hey everyone,
I wanted to highlight today's blog post in case you missed it. In short, we're using a different algorithm to detect duplicate pages. http://moz.com/blog/visualizing-duplicate-web-pages
If you see a change in your crawl results and you haven't done anything, this is probably why. Here's more information taken directly from the post:
1. Fewer duplicate page errors: a general decrease in the number of reported duplicate page errors. However, it bears pointing out that:
- **We may still miss some near-duplicates.** Like the current heuristic, only a subset of the near-duplicate pages is reported.
- **Completely identical pages will still be reported.** Two pages that are completely identical will have the same simhash value, and thus a difference of zero as measured by the simhash heuristic. So, all completely identical pages will still be reported.
2. Speed, speed, speed: The simhash heuristic detects duplicates and near-duplicates approximately 30 times faster than the legacy fingerprints code. This means that soon, no crawl will spend more than a day working its way through post-crawl processing, which will facilitate significantly faster delivery of results for large crawls.
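For anyone curious how a simhash comparison works in general, here is a minimal sketch of the textbook technique. It is not the code our crawler runs: the word tokenizer, the use of MD5 as the per-token hash, the 64-bit width, and the function names `simhash` and `hamming_distance` are all assumptions made for illustration.

```python
import hashlib
import re

BITS = 64  # fingerprint width; an assumption for this sketch


def simhash(text: str) -> int:
    """Return a 64-bit simhash fingerprint built from the words in text."""
    counts = [0] * BITS
    for token in re.findall(r"\w+", text.lower()):
        # Hash each token to a stable 64-bit integer (first 8 bytes of MD5).
        digest = hashlib.md5(token.encode("utf-8")).digest()
        token_hash = int.from_bytes(digest[:8], "big")
        # Each token votes +1 or -1 on every bit position.
        for i in range(BITS):
            counts[i] += 1 if (token_hash >> i) & 1 else -1
    # A bit is set in the fingerprint when the positive votes win.
    fingerprint = 0
    for i in range(BITS):
        if counts[i] > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")


page_a = "Blue widgets for sale. Free shipping on all blue widgets."
page_b = "Blue widgets for sale. Free shipping on all blue widgets."
print(hamming_distance(simhash(page_a), simhash(page_b)))
```

Because the two pages above are identical, they produce the same fingerprint and the distance is zero, which is why completely identical pages are always caught. Near-duplicates tend to land a small number of bits apart, so a checker would flag pairs whose distance falls under some chosen threshold.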
-
That is good news. It will ease some minds that are going nuts over the duplicate content reporting. Thanks!