Have Questions about the Jan. 27th Mozscape Index Update? Get Answers Here!
-
Howdy y'all. I wanted to give a brief update (not quite worthy of a blog post, but more than would fit in a tweet) about the latest Mozscape index update.
On January 27th, we released our largest web index ever, with 285 billion unique URLs and 1.25 trillion links. Our previous index was also a record at 217 billion pages, but this one is roughly another 31% bigger. That's all good news: it means more of the links you're seeking are likely to be in this index, and link counts, on average, will go up.
There are two oddities about this index, however, that I should share:
The first is that we broke one particular view of the data: sorting 301'ing links by Page Authority doesn't work in this index, so we've defaulted to sorting 301s by Domain Authority. That should be fixed in the next index and, judging from our analytics, it isn't a hugely popular view, so it shouldn't affect many folks. If you need it, you can always export to CSV and re-sort by PA in Excel. Note that OSE will only export the first 10K links, so if you have more than that and need the rest of the data, check out the API.
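If you'd rather script the re-sort than open Excel, here is a minimal Python sketch. It assumes the exported CSV has a header row and a numeric column named "Page Authority"; that column name and the sample rows are placeholders for illustration, so adjust them to match whatever your actual export contains.

```python
import csv
import io


def sort_by_page_authority(csv_text, column="Page Authority"):
    """Return the CSV rows as dicts, sorted by `column`, highest value first.

    The column name is an assumption about the export format, not a
    documented OSE header.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))

    def sort_key(row):
        # Blank or non-numeric cells are pushed to the end of the list.
        try:
            return float(row[column])
        except (KeyError, TypeError, ValueError):
            return float("-inf")

    return sorted(rows, key=sort_key, reverse=True)


# Made-up sample data standing in for an exported file.
sample = """URL,Page Authority,Domain Authority
http://example.com/a,35,60
http://example.org/b,72,41
http://example.net/c,,88
"""

sorted_rows = sort_by_page_authority(sample)
for row in sorted_rows:
    print(row["URL"], row["Page Authority"])
```

For a real export, read the file's contents with `open(path, newline="", encoding="utf-8")` and pass the text to the same function.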
The second is that we crawled a massively more diverse set of root domains than ever before. Whereas our previous index topped out at 192 million root domains, this latest one has 362 million (almost 1.9X as many, including lots of domains we haven't crawled before). This means that DA and PA scores may fluctuate more than usual, as link diversity is a big part of those calculations and we've crawled a much larger swath of the deep, dark corners of the web (and non-US/non-.com domains, too). It also means that, for many of the big, more important sites on the web, we are crawling a little less deeply than we have in the past (the index grew by ~31% while the root domains grew by ~88%). Often, those deep pages on large sites do more internal than external linking, so this might not have a big impact, but it could depend on your field/niche and where your links come from.
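For anyone who wants to check the arithmetic behind those growth figures, here is a quick Python sketch using only the numbers quoted in this post. The "URLs per root domain" figure is just a crude average I'm computing for illustration; it is not a metric Moz reports, and real crawl depth varies enormously from site to site.

```python
# Index sizes quoted in the post (previous index vs. the Jan. 27th index).
prev_urls, new_urls = 217_000_000_000, 285_000_000_000
prev_roots, new_roots = 192_000_000, 362_000_000


def growth_pct(old, new):
    """Percentage increase from old to new."""
    return (new - old) / old * 100


url_growth = growth_pct(prev_urls, new_urls)
root_growth = growth_pct(prev_roots, new_roots)

# Crude average crawl depth: URLs divided by root domains (illustrative only).
prev_depth = prev_urls / prev_roots
new_depth = new_urls / new_roots

print(f"URL growth: {url_growth:.1f}%")
print(f"Root domain growth: {root_growth:.1f}%")
print(f"Avg URLs per root domain: {prev_depth:.0f} -> {new_depth:.0f}")
```

On these numbers, URLs grew by about 31% and root domains by a bit under 89%, so the average number of URLs per root domain drops from roughly 1,130 to roughly 787. That is the sense in which the crawl is, on average, a little shallower per site.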
As always, my best suggestion is to make sure to compare your link data against your competition - that's a great way to see how relative changes are occurring and whether, generally speaking, you're losing or gaining ground in your field.
If you have specific questions, feel free to leave them and I'll do my best to answer in a timely fashion. Thanks much!
p.s. You can always find information about our index updates here.
-
Thanks, Matt! I'm proud of the team's work on growing the index thus far. I think we've reached the top of where we can go with the current index's infrastructure, so I'd expect sizes will stay in this range for the next 5-6 updates at least.
For the last 4 years, we have been working on a new infrastructure for our indices, something closer to what Google does with real-time processing via Caffeine (though not quite as robust), and we're planning to launch that in Q4 of this year. At that point, our index can grow much bigger and much faster (it'll also be fresher, include lots more kinds of data, etc). That system also won't be limited by software (which holds us back today), but rather by hardware (which we can and will buy more of). I really can't wait for that.
-
I second that opinion, super exciting to get deeper information! Can't wait to dive in!
-
I don't have a specific question, just a WOW! I remember when the index was getting smaller & smaller as you guys went through some "figuring out" of how exactly you would index the whole internets. It has come back in SUCH a big way!
I was thinking of your "false narratives" /rand blog post and how things didn't always go the way you wanted. OSE's limits have always been one of those "not the way I wanted it to be" with Moz and this size of update is an AMAZING comeback.
So, no question - just a bit of a "great job, team!" to get the index to this size. Can't wait for EVEN MORE.