Changing the way SEOmoz Detects Duplicate Content
-
Hey everyone,
I wanted to highlight today's blog post in case you missed it. In short, we're using a different algorithm to detect duplicate pages. http://moz.com/blog/visualizing-duplicate-web-pages
If you see a change in your crawl results and you haven't done anything, this is probably why. Here's more information taken directly from the post:
1. Fewer duplicate page errors: a general decrease in the number of reported duplicate page errors. However, it bears pointing out that:
- **We may still miss some near-duplicates.** Like the current heuristic, only a subset of the near-duplicate pages is reported.
- **Completely identical pages will still be reported.** Two pages that are completely identical will have the same simhash value, and thus a difference of zero as measured by the simhash heuristic. So, all completely identical pages will still be reported.
2. Speed, speed, speed: The simhash heuristic detects duplicates and near-duplicates approximately 30 times faster than the legacy fingerprints code. This means that soon, no crawl will spend more than a day working its way through post-crawl processing, which will facilitate significantly faster delivery of results for large crawls.
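For anyone curious why identical pages always come out with a difference of zero, here is a rough sketch of the general simhash idea. This is not our crawler's actual code: the function names (`simhash`, `hamming_distance`), the 3-word shingles, the MD5-based token hash, and the 64-bit fingerprint size are all assumptions chosen for illustration.

```python
import hashlib
import re

BITS = 64  # fingerprint width; an illustrative choice, not a documented Moz setting


def _hash64(token: str) -> int:
    """Map a token to a stable 64-bit integer (MD5 is used only as a convenient hash)."""
    digest = hashlib.md5(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def simhash(text: str, shingle_size: int = 3) -> int:
    """Compute a 64-bit simhash fingerprint over overlapping word shingles."""
    words = re.findall(r"\w+", text.lower())
    # Overlapping runs of words; very short texts yield a single shingle.
    shingles = [
        " ".join(words[i:i + shingle_size])
        for i in range(max(1, len(words) - shingle_size + 1))
    ]
    # Each shingle votes +1 or -1 on every bit position.
    votes = [0] * BITS
    for shingle in shingles:
        h = _hash64(shingle)
        for bit in range(BITS):
            votes[bit] += 1 if (h >> bit) & 1 else -1
    # A bit is set in the fingerprint when the positive votes win.
    fingerprint = 0
    for bit in range(BITS):
        if votes[bit] > 0:
            fingerprint |= 1 << bit
    return fingerprint


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")


page = "Welcome to our store. We sell widgets in every size and colour."
# Identical text gives identical fingerprints, so the distance is zero.
print(hamming_distance(simhash(page), simhash(page)))
```

Two pages would then be flagged as near-duplicates when the distance between their fingerprints falls below some threshold. The post does not say which threshold the crawler uses, so none is assumed here.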
-
That is good news. It will ease some minds that are going nuts over the duplicate content reporting. Thanks!