Have Questions about the Jan. 27th Mozscape Index Update? Get Answers Here!
-
Howdy y'all. I wanted to give a brief update (not quite worthy of a blog post, but more than would fit in a tweet) about the latest Mozscape index update.
On January 27th, we released our largest web index ever, with 285 Billion unique URLs, and 1.25 Trillion links. Our previous index was also a record at 217 Billion pages, but this one is another 30% bigger. That's all good news - it means more links that you're seeking are likely to be in this index, and link counts, on average, will go up.
There are two oddities about this index, however, that I should share:
The first is that we broke one particular view of data - 301'ing links sorted by Page Authority doesn't work in this index, so we've defaulted to sorting 301s by Domain Authority. That should be fixed in the next index, and from our analytics, doesn't appear to be a hugely popular view, so it shouldn't affect many folks (you can always export to CSV and re-sort by PA in Excel if you need, too - note that if you have more than 10K links, OSE will only export the first 10K, so if you need more data, check out the API).
The second is that we crawled a massively more diverse set of root domains than ever before. Whereas our previous index topped out at 192 million root domains, this latest one has 362 million (almost 1.9X as many unique, new domains we haven't crawled before). This means that DA and PA scores may fluctuate more than usual, as link diversity are big parts of those calculations and we've crawled a much larger swath of the deep, dark corners of the web (and non-US/non-.com domains, too). It also means that, for many of the big, more important sites on the web, we are crawling a little less deeply than we have in the past (the index grew by ~31% while the root domains grew by ~88%). Often, those deep pages on large sites do more internal than external linking, so this might not have a big impact, but it could depend on your field/niche and where your links come from.
As always, my best suggestion is to make sure to compare your link data against your competition - that's a great way to see how relative changes are occurring and whether, generally speaking, you're losing or gaining ground in your field.
If you have specific questions, feel free to leave them and I'll do my best to answer in a timely fashion. Thanks much!
p.s. You can always find information about our index updates here.
-
Thanks Matt I'm proud of the team's work on growing the index thus far. I think we've reached the top of where we can go with the current index's infrastructure, so I'd expect sizes will stay in this range for the next 5-6 updates at least.
For the last 4 years, we have been working on a new infrastructure for our indices - something closer to what Google does with real-time processing via caffeine (though not quite as robust), and we're planning to launch that in Q4 of this year, at which time, our index can grow much bigger and much faster (it'll also be fresher, included lots more kinds of data, etc). That system also won't be limited by software (which holds us back today), but rather by hardware (which we can and will buy more of). I really can't wait for that
-
I second that opinion, super exciting to get deeper information! Can't wait to dive in!
-
I don't have a specific question, just a WOW! I remember when the index was getting smaller & smaller as you guys went through some "figuring out" of how exactly you would index the whole internets. It has come back in SUCH a big way!
I was thinking of your "false narratives" /rand blog post and how things didn't always go the way you wanted. OSE's limits have always been one of those "not the way I wanted it to be" with Moz and this size of update is an AMAZING comeback.
So, no question - just a bit of a "great job, team!" to get the index to this size. Can't wait for EVEN MORE.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Is everybody seeing DA/PA-drops after last MOZ-api update?
Hi all, I was wondering what happend at the last MOZ-update. The MOZ-health page has no current errors and i've checked a lot of websites probably about 50 a 60 from our customers and from our competitiors and everybody seems to have a decrease in DA's / PA's. Not just a couple went up and down like normally would happen, but all seems to have dropped.... Is anyone seeing the same in their campaigns? Greetings,
API | | NielsVos
Niels3 -
Mozscape API - Keyword Rankings?
Hi, I'm using the free access to the Mozscape API and while I'm still a novice about what data the Mozscape API can pull I can't seem to figure how to pull the keyword rankings for my campaigns. I reviewed https://moz.com/help/guides/moz-api/mozscape/api-reference/url-metrics but can't seem to find what value I need to call. Can someone guide me in the right direction or is this something only the paid version has access to?
API | | FPK0 -
Why did the April Index Raise DA?
All of our websites DA raised dramatically, including the competitors we track Any idea why this may have happened across the board?
API | | Blue_Compass0 -
March 2nd Mozscape Index Update is Live!
We are excited to announce that our March 2<sup>nd</sup> Index Update is complete and it is looking great! We grew the number of subdomains and root domains indexed, and our correlations are looking solid across the board. Run, don’t walk, to your nearest computer and check out the sweet new data! Here is a look at the finer details: 141,626,596,068 (141 billion) URLs 1,685,594,701 (1 billion) subdomains 193,444,117 (193 million) root domains 1,124,641,982,250 (1.1 Trillion) links Followed vs nofollowed links 3.09% of all links found were nofollowed 62.41% of nofollowed links are internal 37.59% are external Rel canonical: 27.46% of all pages employ the rel=canonical tag The average page has 92 links on it 74 internal links on average 18 external links on average Thanks again! PS - For any questions about DA/PA fluctuations (or non-fluctuations) check out this Q&A thread from Rand:https://moz.com/community/q/da-pa-fluctuations-how-to-interpret-apply-understand-these-ml-based-scores
API | | IanWatson7 -
January’s Mozscape Index Release Date has Been Pushed Back to Jan. 29th
With a new year brings new challenges. Unfortunately for all of us, one of those challenges manifested itself as a hardware issue within one of the Mozscape disc drives. Our team’s attempts to recover the data from the faulty drive only lead to finding corrupted files within the Index. Due to this issue we had to push the January Mozscape Index release date back to the 29<sup>th</sup>. This is not at all how we anticipated starting 2016, however hardware failures like this are an occasional reality and are also not something we see being a repeated hurdle moving forward. Our Big Data team has the new index processing and everything is looking great for the January 29<sup>th</sup> update. We never enjoy delivering bad news to our faithful community and are doing everything in our power to lessen these occurrences. Reach out with any questions or concerns.
API | | IanWatson2 -
API - Row Limit Question
Hi, I'm new to using Moz, and have just got a "Low volume" API account set up. My question is, because ive not yet reached my maximum "Rows per month" limit, what behaviour happens when i reach it? Do i get an error code, if so what, and whats the status code. If not, does my account keep downloading the rows and i get charged extra (in accordance with the cost of the additional rows)? Or is the whole additional rows think like a bolt on? Basically i want to make sure i dont get charged extra each month, and i need the status code returned to handled this in my app. I couldnt see anything explicit in the documents. Cheers
API | | MattHopkins0 -
September's Mozscape Update Broke; We're Building a New Index
Hey gang, I hate to write to you all again with more bad news, but such is life. Our big data team produced an index this week but, upon analysis, found that our crawlers had encountered a massive number of non-200 URLs, which meant this index was not only smaller, but also weirdly biased. PA and DA scores were way off, coverage of the right URLs went haywire, and our metrics that we use to gauge quality told us this index simply was not good enough to launch. Thus, we're in the process of rebuilding an index as fast as possible, but this takes, at minimum 19-20 days, and may take as long as 30 days. This sucks. There's no excuse. We need to do better and we owe all of you and all of the folks who use Mozscape better, more reliable updates. I'm embarassed and so is the team. We all want to deliver the best product, but continue to find problems we didn't account for, and have to go back and build systems in our software to look for them. In the spirit of transparency (not as an excuse), the problem appears to be a large number of new subdomains that found their way into our crawlers and exposed us to issues fetching robots.txt files that timed out and stalled our crawlers. In addition, some new portions of the link graph we crawled exposed us to websites/pages that we need to find ways to exclude, as these abuse our metrics for prioritizing crawls (aka PageRank, much like Google, but they're obviously much more sophisticated and experienced with this) and bias us to junky stuff which keeps us from getting to the good stuff we need. We have dozens of ideas to fix this, and we've managed to fix problems like this in the past (prior issues like .cn domains overwhelming our index, link wheels and webspam holes, etc plagued us and have been addressed, but every couple indices it seems we face a new challenge like this). Our biggest issue is one of monitoring and processing times. We don't see what's in a web index until it's finished processing, which means we don't know if we're building a good index until it's done. It's a lot of work to re-build the processing system so there can be visibility at checkpoints, but that appears to be necessary right now. Unfortunately, it takes time away from building the new, realtime version of our index (which is what we really want to finish and launch!). Such is the frustration of trying to tweak an old system while simultaneously working on a new, better one. Tradeoffs have to be made. For now, we're prioritizing fixing the old Mozscape system, getting a new index out as soon as possible, and then working to improve visibility and our crawl rules. I'm happy to answer any and all questions, and you have my deep, regretful apologies for once again letting you down. We will continue to do everything in our power to improve and fix these ongoing problems.
API | | randfish11 -
In lue of the canceled Moz Index update
Hey Moz, Overall we love your product and are using it daily to help us grow, part of that has been to rely on the Moz Index for DA and PA as well as places where we are doing positive linking through genuine partnerships and reviews of clients. We were really excited to see any the results for this month as we have been partner linked from lots of high reputation sites and google seems to agree as our rankings are moving up weekly. The question from our marketing team is, since a significant part of Moz will not be available to us this month, will there be any compensation handed out to the paying community. PS: I am an engineer and I know how you have probably lost a very large set of data which cant simply be re-crawled over night but Moz Pro is not a cheap product and we do expect it to work. Source: https://moz.com/products/api/updates Kind Regards.
API | | SundownerRV0