August 3rd Mozscape Index Update (our largest index, but nearly a month late)
-
Update 5:27pm 8/4 - the data in Open Site Explorer is up-to-date, as is the API and Mozbar. Moz Analytics campaigns are currently loading in the new data, and all campaigns should be fully up-to-date by 4-10pm tomorrow (8/5). However, your campaign may have the new data much earlier as it depends on where that campaign falls in the update ordering.
Hey gang,
I wanted to provide some transparency into the latest index update, as well as give some information about our plans going forward with future indices.
The Good News: This index, now that it's delivered, is pretty impressive.
- Mozscape's August index is 407 Billion URLs in size, nearly 100 Billion (~25%) bigger than our last record index size. We indexed 2.18 trillion links for the first time ever (prior record was 1.54 trillion).
- Correlations for Page Authority have gone up from 0.319 to 0.333 in the latest index, suggesting that we're getting a slightly more accurate representation of Google's use of links in rankings from this data (DA correlations remain constant at 0.185)
- Our hit ratio for URLs in Google's SERPs has gone up considerably, from 69.97% in our previous index to 78.66% in the August update. This indicates we are crawling and indexing more of what Google shows in the search results (a good benchmark for us). Note that a large portion of what's missing will be things published in the last 30-60 days while we were processing the index (after crawling had stopped).
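To put the figures above in relative terms, here is a minimal Python sketch of the arithmetic. Only the input numbers come from this post; the helper name `pct_growth` is made up for illustration.

```python
def pct_growth(old: float, new: float) -> float:
    """Relative growth from old to new, as a percentage of old."""
    return (new - old) / old * 100.0


# Figures quoted in the post: links in trillions, hit ratio in percent.
links_growth = pct_growth(1.54, 2.18)
hit_ratio_gain_points = 78.66 - 69.97

print(f"Links indexed grew by about {links_growth:.1f}%")
print(f"SERP hit ratio rose by about {hit_ratio_gain_points:.2f} percentage points")
```

Note that the hit ratio change is expressed in percentage points (a difference of two percentages), not as relative growth.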
The Bad News: August's index was late by ~25 days.
We know that reliable, consistent, on-time Mozscape updates are critically important to everyone who uses Moz's products. We've been working hard for years to get these to a better place, but have struggled mightily. Our latest string of failures was completely new to the team - a bunch of problems we've never seen before. Some were due to the index size, but many were due to odd things, like a massive group of what appear to be spam domains using the Palau TLD extension clogging up crawl/processing, and large chunks of pages we crawled with tens of thousands of links each, which slow down the MozRank calculations. While there's no excuse for delays, and we don't want to pass these off as such, we do want to be transparent about why we were so late.
Our future plans include scaling back the index sizes a bit, dealing with the issues around spam domains, large link-list pages, and some of the odd patterns we see in .pl and .cn domains, and taking one extra person from the Big Data team off of work on the new index system (which will be much larger and real-time rather than updated every 30 days) to help with Mozscape indices. We believe these efforts, along with the new monitoring systems we've got, will help us get better at producing high-quality, consistent indices.
Question everyone always asks: Why did my PA/DA change?!
There are tons of reasons why these can change, and they don't necessarily mean anything bad about your site, your SEO efforts, or whether your links are helping you rank. PA and DA are predictive, correlated metrics that say nothing about how you're actually performing. They merely map better than most metrics to Google's global rankings across large SERP sets (but not necessarily your SERPs, which is what you should care about).
That said, here are some of the reasons PA/DA do shift:
- The domains/pages with the highest PA/DA scores gain even faster than most of the domains below them, making it harder each index to get higher scores (since PA/DA are on a logarithmic scale, this is smoothed out somewhat - it would be much worse on a conventional scale, e.g. Facebook.com 100, everyone else 0.0003).
- Google's ranking algorithm changes: it introduces new elements and modifies what it cares about.
- Moz crawls a set of the web that does or doesn't include the pages that are more likely to point to a given domain than another. Although our crawl tends to be representative, if you've got lots of links from deep pages on less popular domains in a part of the web far from the mainstream, we may not consistently crawl those well (or, we could overcrawl your sector because it recently received powerful links from the center of the web).
My advice, as always, is to use PA/DA as relative scores. If your scores are falling, but your competitors' are falling more, that's not a bad thing. If your scores are rising, but your competitors' are rising faster, they're probably gaining ground on you. And, if you're talking about score changes in the 1-4 points range, that's not necessarily anything but noise. PA/DA scores often shift 1-4 points up or down in a new index so don't sweat it!
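The "relative scores" advice can be sketched in a few lines of Python. This is only an illustration, not anything Moz ships: the function name `relative_change` is hypothetical, and applying the 1-4 point noise band to the gap between you and a competitor (rather than to each score separately) is an assumption made here.

```python
NOISE_POINTS = 4  # the post treats shifts of roughly 1-4 points as possible noise


def relative_change(your_old, your_new, rival_old, rival_new):
    """Compare your score change with a competitor's between two indices.

    Returns (your_delta, rival_delta, verdict).
    """
    your_delta = your_new - your_old
    rival_delta = rival_new - rival_old
    # Positive means you moved up relative to the competitor.
    gap_change = your_delta - rival_delta
    if abs(gap_change) <= NOISE_POINTS:
        verdict = "within noise"
    elif gap_change > 0:
        verdict = "gaining ground"
    else:
        verdict = "losing ground"
    return your_delta, rival_delta, verdict


# Your DA fell 3 points, but the competitor's fell 9.
print(relative_change(40, 37, 45, 36))
# Both rose by a couple of points; the difference is too small to read into.
print(relative_change(30, 32, 31, 34))
```

The first call illustrates the "falling, but competitors falling more" case from the paragraph above; the second shows a change small enough that it may just be index-to-index noise.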
Let me know if you've got more questions and I'll do my best to answer. You can also refer to the API update page here: https://moz.com/products/api/updates
-
Rand, I've emailed you. Thx
-
Where are you seeing that? In OSE? Or in Moz Analytics? In Moz Analytics, it's possible that it's still cached, and will be updating (a few thousand campaigns each hour, so not too long until all of them are done), but in OSE, that data should absolutely be new. If not, can you send an email to me - rand at moz dot com - with your sites, and I'll ask the Big Data team to look into it.
-
Hi Rand, I'm still seeing 9 June in my campaigns and no updated data....or missing data. Not fixed here yet.
-
Yup - I'm seeing the same thing. Have let our engineers know - hopefully they can sort it out and fix it soon.
-
Rand, I'm seeing some seriously weird data on many of our sites. Crazy Euro links that go nowhere...that definitely aren't meant to be there, and link totals that don't add up.
-
I'm seeing some odd ones too that appear not to have updated. Pinging the team as it shouldn't usually take this long for data to update.
-
Update:
some of the sites we are tracking have data in them but it's still from 9 June. The rest are showing incorrect / corrupted links or no links at all.
Conclusion: there is something seriously wrong with the MozScape update for us.
-
Hey Sticky! It takes about 24-48 hours for new index information to reach Moz Campaigns whenever a new Mozscape Index is released. By checking your domain directly on OSE (moz.com/researchtools/ose) you will be able to see your data, and more, before campaigns are updated. This time it may be slightly delayed because we are building monthly data for all campaigns, which we run on the 1st of each month. Our index updates are rarely released near the beginning of the month, so they don't usually interfere with normal campaign updates.
Hope this helps and let me know if you have any questions!
-
It is working on most sites, but a few I have just checked have changed, i.e. one started at 27, was 32 five hours ago, and is now 30! So I might give it another 24 hours to settle down.
-
Glad it is working for you. I'm still seeing last Index, and in some cases no data.
-
I had to refresh a few pages, a few times, but all the data has come through now. Every website is up, though a few by only 4, but I am still hopeful that is not noise but the result of hard work.
-
I'm wondering why I can't see the updated MozScape data in my account? It still says next index 9 June and the data still appears to be old (and / or incomplete). Any advice?
Related Questions
-
Unsolved Regarding Moz API token password update
Hi, In March we have updated password for MOZ API and used in our application it worked, but currently the updated password is not working and in the MOZ site the old password is shown and its active. We are using Legacy username and password. We see that 5 tokens can be added for API, if we add 2 tokens both will be active. We are currently using free services. Please help us resolve this issue.
API | | NickAndrews0 -
Why the Feb 2018 update was so early?
Hi There! We are using Moz to compare our metrics to increase our SEO / SERP penetration. According to MOZ API Updates, it was mentioned that the next update will be on 26th Feb. But the update was early, could you please let us know the reason for the same. Why is there a discrepancy between the date mentioned for the Moz Update and the date of release? Thanks Malik Zakaria
API | | mzakaria0 -
Website domain authority dropped from 55 to 1 in two months
Hi all! Our website DA dropped from 55 to 1 in a matter of two months. Last time I checked, it was two months ago and the website DA was 55. When I checked today, it is 1. Needless to say, I am shocked. The website has been around since 1995 and has always had pretty decent DA and PA. We made a couple of important changes during the past 3-4 months. One is to support HTTPS and another is to implement responsive web design to support browsing on mobile devices. What could be the reason for such a dramatic drop of DA in a short period of time? Where would you suggest I look for it? How can we get the problem fixed? Here is the website URL: www.infohub.com Thank you!
API | | jimz80701 -
Sitemaps and Indexed Pages
Hi guys, I created an XML sitemap and submitted it for my client last month. Now the developer of the site has also been messing around with a few things. I've noticed on my Moz site crawl that indexed pages have dropped significantly. Before I put my foot in it, I need to figure out if submitting the sitemap has caused this.. can a sitemap reduce the pages indexed? Thanks David.
API | | Slumberjac0 -
Mozscape API Updates (Non-updates!) - becoming a joke!
This is the 3rd month in succession where the Mozscape index has been delayed. Myself and clients are losing patience with this, as I am sure many others must be. Just what do you suppose we tell clients waiting for that data? We have incomplete and sometimes skewed metrics to report on, delays which then get delayed further, with nothing but the usual 'we are working on it' and 'bear with us'. It's becoming obvious you fudged the index update back in January (see discussion here with some kind of explanation finally from Rand: https://moz.com/community/q/is-everybody-seeing-da-pa-drops-after-last-moz-api-update), and seems you have been fumbling around ever since trying to fix it, with data all over the place, shifting DA scores and missing links from campaign data. Your developers should be working around the clock to fix this, because this is a big part of what you're selling in your service, and as SEO's and marketers we are relying on that data for client retention and satisfaction. Will you refund us all if we should lose clients over this?! .. I don't think so! With reports already sent out the beginning of the month with incomplete data, I told clients the index would refresh April 10th as informed from the API updates page, only to see it fudged again on day of release with the index being rolled back to previous. So again, I have to tell clients there will be more delays, ...with the uncertainty of IF it WILL EVEN get refreshed when you say it will. It's becoming a joke.. really!
API | | GregDixson2 -
Spring is here and so is our May Index Update!
Happy Index Release Day! For the second month in a row, our hard-working, supremely dedicated Big Data team has delivered our Index Update EARLY! Beyond being punctual, the May Index is one of our largest and most comprehensive updates of the year for Moz. Let's dig into the details:
- 162,225,495,455 (162 billion) URLs
- 1,135,327,420 (1.1 billion) subdomains
- 194,346,505 (194 million) root domains
- 1,168,465,575,815 (1.1 trillion) links
- Followed vs nofollowed links: 2.84% of all links found were nofollowed; 65.80% of nofollowed links are internal and 34.20% are external
- Rel canonical: 28.89% of all pages employ the rel=canonical tag
- The average page has 92 links on it: 76 internal links and 16 external links on average
Go have fun with your new data! PS - For any questions about DA/PA fluctuations (or non-fluctuations) check out this Q&A thread from Rand: https://moz.com/community/q/da-pa-fluctuations-how-to-interpret-apply-understand-these-ml-based-scores
API | | IanWatson5 -
Still not got any index update data.
Is anyone finding that they haven't got the results of the update yet? I have tried some competitors and they are not updated either.
API | | AHC_SEO0 -
Have Questions about the Jan. 27th Mozscape Index Update? Get Answers Here!
Howdy y'all. I wanted to give a brief update (not quite worthy of a blog post, but more than would fit in a tweet) about the latest Mozscape index update. On January 27th, we released our largest web index ever, with 285 Billion unique URLs, and 1.25 Trillion links. Our previous index was also a record at 217 Billion pages, but this one is another 30% bigger. That's all good news - it means more links that you're seeking are likely to be in this index, and link counts, on average, will go up. There are two oddities about this index, however, that I should share: The first is that we broke one particular view of data - 301'ing links sorted by Page Authority doesn't work in this index, so we've defaulted to sorting 301s by Domain Authority. That should be fixed in the next index, and from our analytics, doesn't appear to be a hugely popular view, so it shouldn't affect many folks (you can always export to CSV and re-sort by PA in Excel if you need, too - note that if you have more than 10K links, OSE will only export the first 10K, so if you need more data, check out the API). The second is that we crawled a massively more diverse set of root domains than ever before. Whereas our previous index topped out at 192 million root domains, this latest one has 362 million (almost 1.9X as many unique, new domains we haven't crawled before). This means that DA and PA scores may fluctuate more than usual, as link diversity is a big part of those calculations and we've crawled a much larger swath of the deep, dark corners of the web (and non-US/non-.com domains, too). It also means that, for many of the big, more important sites on the web, we are crawling a little less deeply than we have in the past (the index grew by ~31% while the root domains grew by ~88%). Often, those deep pages on large sites do more internal than external linking, so this might not have a big impact, but it could depend on your field/niche and where your links come from.
As always, my best suggestion is to make sure to compare your link data against your competition - that's a great way to see how relative changes are occurring and whether, generally speaking, you're losing or gaining ground in your field. If you have specific questions, feel free to leave them and I'll do my best to answer in a timely fashion. Thanks much! p.s. You can always find information about our index updates here.
API | | randfish8