10/14 Mozscape Index Update Details
-
Howdy gang,
As you might have seen, we've finally been able to update the Mozscape index after many challenging technical problems in the last 40 days. However, this index has some unique qualities (most of them not ideal) that I should describe.
First, this index still contains data crawled up to 100 days ago. We try to make sure that what we've crawled recently is stuff that we believe has been updated/changed, but there may be sites and pages that have changed significantly in that period that we didn't update (due to issues I've described here previously with our crawlers & schedulers).
Second, many PA/DA and other metric scores will look very similar to the last index because we lost and had problems with some metrics in processing (and believe that much of what we calculated may have been erroneous). We're using metrics from the prior index (which had good correlations with Google, etc) until we can feel confident that the new ones we're calculating are correct. That should be finished by the next index, which, also, should be out much faster than this one (more on that below). Long story short on this one - if your link counts went up and you're seeing much better/new links pointing to you, but DA/PA remain unchanged, don't panic - that's due to problems on our end with calculations and will be remedied in the next index.
Third - the good news is that we've found and fixed a vast array of issues (many of them hiding behind false problems we thought we had), and we now believe we'll be able to ship the next index with greater quality, greater speed, and better coverage. One thing we're now doing is taking every URL we've ever seen in Google's SERPs (via all our rank tracking, SERPscape, the corpus for the upcoming KW Explorer product, etc) and prioritizing them in Mozscape's crawl, so we expect to be matching what Google sees a bit more closely in future indices.
My apologies for the delay in getting this post up - I was on a plane to London for Searchlove - should have got it up before I left.
-
Thank you Jennita and Rand for your quick responses.
Great, let's keep our fingers crossed all goes well, and I'm confident the Moz team can deliver it.
We all take a ride on the ebb and flow roller coaster from time to time; it's what makes us learn more and overcome challenges.
Have a great day
Cheers,
Joseph
-
Hi Joseph - yes, I can answer that. We took ~14 days to process this latest index, which is very good news. However, we are having some trouble with the uploading process again - our technical operations team is working with the big data team to try and uncover the source of these problems. If we can get it fixed and working (in the past, the upload step took ~12 hours, now it's taking us 3-4 days), we should have much more regular index releases.
Right now, we are feeling confident about Nov. 17th, and once we complete the upload we'll have a good picture about data quality and whether we might be able to release early (which we think is quite possible IF quality looks good and these upload issues get sorted).
-
Hi Joseph! I'm sure Rand will chime in as well, but I know our engineers are currently working on a write-up that explains the future of the index, plus some of the issues of the past. They're trying to get all the details in there, and hopefully we can get that published by early next week. What I know so far, is that they've fixed some issues and this index is looking much better. I'll let the engineers explain what that means though.
-
Hi Rand,
I hope you're well and life is good.
I was wondering if you can shed some light on the upcoming OSE update scheduled for the 17th of November.
In an earlier post you said "The good news is that we pad every estimate by nearly 2X. In a normal, problem-free index cycle, we can get it done in 12-14 days."
This would indicate you have potentially already run an update and are reviewing the data to ensure it is correct and relevant to the masses before making a general release.
Can you advise whether the trial update has been run and whether it was a success? If it was not, do you think you'll have the issue solved for the 17th of November 2015?
I'm very eager to report back to my clients with credible insight using the data you provide.
Cheers,
Joseph Gourvenec
-
That doesn't surprise me - Majestic has a larger index than Moz (theirs is actually the largest among active 3rd party indices, then Ahrefs, then us).
https://moz.com/blog/big-data-big-problems-link-indexes-compared this is a pretty good resource comparing the strengths and weaknesses of the various indices, and https://builtvisible.com/comparing-link-data-tools/ is also a good, third-party review of the three. There are strengths and weaknesses to each, but if raw link coverage is your goal, I recommend Majestic.
-
Rand, I've sent info and s/s to Kevin (at Moz) in an attempt to find some commonality between GSC, Moz and HubSpot. 3rd party tools are showing more links than OSE, and are more in line with GSC. For example, Majestic shows 5x the linking domains that OSE does (on the root, not the www). Kevin points at this thread and cites the present OSE data. I'm trying to figure out why redirected/GWT 'moved' domains don't add up links, or even if they are supposed to. I suspect he's powerless. Who/what can I trust?
-
There are a variety of reasons that include:
- This index is somewhat smaller in total links crawled and URLs included
- We may have biased the crawlers towards sites/pages that are less likely to feature links from your site (this is particularly possible if the linking sites were on Chinese, Palau, or several other TLD extensions that we had previously over-indexed)
- The links we previously crawled may have been on relatively low MozRank pages that this index didn't crawl b/c we found fewer links to them (and thus lower MozRank - we tend to crawl in roughly descending MozRank order across the web).
As noted, the next index should see better coverage, fresher data, and better metrics, too. Please let us know if the problem persists and maybe we can compare your WM Tools links vs. our index to see what could be happening. Thanks and apologies.
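To make the third reason concrete, here is a minimal sketch of a budget-limited crawl that works through URLs in descending score order. It is only an illustration, not Mozscape's actual scheduler: the `crawl_order` function, the example URLs, and the scores are all hypothetical, and the scores merely stand in for MozRank.

```python
import heapq

def crawl_order(scores, budget):
    """Return the URLs a budget-limited crawler would fetch, highest score first."""
    # heapq is a min-heap, so negate the scores to pop the highest score first.
    heap = [(-score, url) for url, score in scores.items()]
    heapq.heapify(heap)
    fetched = []
    while heap and len(fetched) < budget:
        _, url = heapq.heappop(heap)
        fetched.append(url)
    return fetched

# Hypothetical scores standing in for MozRank (not real values).
previous = {"a.example/": 5.0, "b.example/": 3.0, "c.example/post": 2.0}
# Assume the new crawl found fewer links to c.example/post, so its score fell,
# and a newly discovered page now outranks it.
current = {"a.example/": 5.0, "b.example/": 3.0,
           "c.example/post": 0.5, "d.example/": 1.0}

print(crawl_order(previous, budget=3))
print(crawl_order(current, budget=3))
```

With the same budget of three pages, `c.example/post` is crawled in the first run but not in the second, so any links on that page would disappear from the index even though they still exist on the web.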
-
Hey There,
Sorry this wasn't called out, but the number of links in this index is quite a bit lower than our previous releases, due to some of the technical issues Rand calls out here and in the September post (https://moz.com/community/q/september-s-mozscape-update-broke-we-re-building-a-new-index). The next release will remedy the issues you are currently seeing, as we have identified and fixed the bugs that caused the issues. Sorry again for any confusion here.
-
Hi Rand,
We have noticed among many of our clients that there has been a drastic drop in inbound links since the update to the Mozscape index. Our DA/PA have stayed the same, but links have dropped approximately 60-70% across the board. We have looked into the issue in Google Webmaster Tools and have found that we have quite a few more links than are showing up in Moz. Could you explain why this might be happening?
Thanks!
-
We are both in the same boat there. I, too, desperately need this next index to get us on track and provide excellent value. If not, I think we're going to lose a lot of customers, and I'm not sure people will trust us for a long time on link data.
You have my deep and sincere apologies for the frustration and professional challenge Moz has caused. We have an obligation to do better, and I damn sure hope the team is up to delivering on that obligation.
-
I have read the comments and the frustrations associated with the recent issues and would like to suggest that with any software such as this there will be glitches from time to time. We currently use roughly 30 to 40 SaaS providers of many different types and I cannot think of one that hasn't had an issue at one time or another. Having been with Moz for over 5 years I will say that the issues are few and the response is always transparent with frequent updates. (I cannot say that for most other providers).
I would suggest to anyone who is doing client work that as soon as you can afford to have more than one service provider you do so. There are two basic reasons: if there is an issue you always have back up and second, you get the benefit of being able to compare data. Personally, I find this invaluable for client work. I do not feel disloyal to Moz, I just know that every piece of software has its own limitations.
Good luck to all with the current travails.
-
Hi Rand
Firstly I have considered MOZ to be the best SEO information gathering tool of its kind for some years now; and whenever I have taken on a new client or role have recommended to those companies not already using it, to set up an account with MOZ.
I started a new digital marketing role in July which comes with six month KPIs. I've been given a budget to improve SERPs & external inbound links; and right now I'm feeling pretty frustrated and embarrassed as, similar to Joseph I feel the lack of quality data from MOZ over the past few months has left me with egg on my face and having to prove my worth.
I need the next index, which is suggested to be 14 November, to fill me with confidence and my reports with valid data, or I will be forced to look elsewhere. I hope it doesn't come to that.
-
Hi Joseph - you'll get no arguments from me on any of these fronts. I think if you've been using Moz exclusively or primarily for the link data component, you should request a refund by emailing help@moz.com (they'll be happy to provide one). Totally concur that our service the past 60 days on the link data front has not been acceptable.
-
Hi Team Moz,
I have a lot of respect for Team Moz and all the effort that must go on behind the scenes to deliver what has been a great service.
The only issue I have now is that I've paid for two months' subscription and still have nothing new to report to my clients, and it is:
1. Making me look foolish, "like I have the wrong provider"
2. Making you look not so credible, "because I tell clients where we get our data from and explained last month, the last time an index update didn't happen in full was years ago"
3. Leaving me out of pocket
a. From subscription
b. Staffing for data analysts to ensure we maximise the new data gained from Moz each "4 to 6 weeks" update.
At the end of the day we pay a fee for the data in OSE, expecting a reasonable level of information to be delivered for the fee we pay.
Rand, sometimes honesty is good, but "Then things took a turn for the worse and we've been struggling ever since." and your other comments aren't filling me with confidence that the next OSE release is going to be an improvement on the last.
As much as I respect you, the Moz team, the service, historic data, past efforts and overall community, I pay my subscription for credible, relevant and up-to-date data from OSE to support my digital activities and strategies, which I've not received for the last two months.
I'm sure things will improve, because up is the only way from here, I feel, and it needs to be too. You can't expect customers to continue to pay for the core service Moz is known for and offers, which the SEO community relies on to perform and deliver on client expectations.
I have purposely not used other suppliers for this type of data, from starting out in SEO to this day running my own company, because I feel a certain amount of loyalty to Moz, just as Donna does and the rest of the community most likely does. But there does come a point when options need to be revised.
I hope this message hasn't come across in a disrespectful manner to you or the team at Moz, I just want the best data and to deliver on the expectations I have set my clients; based on that I tell them we only use the best data provider and market leader in the world which is your company.
I look forward to seeing an improvement on the next update.
Cheers,
Joseph
-
Hi Donna - you are most certainly not alone in your frustration. I would call my own feelings bordering on desperation. I'm frustrated, angry, nervous, guilty, and overwhelmed with a sense of powerlessness. It seems that every time we think we've identified a problem at the root of our Mozscape issues, things just get worse and new problems we never imagined arise.
On the padding issue, I have good news and depressing news. The good news is that we pad every estimate by nearly 2X. In a normal, problem-free index cycle, we can get it done in 12-14 days.... And yet, we never estimate less than 30-31 days for an index release. In the early part of this year, you might recall that we had a number of indices released back to back in that 2-3 week window. Then things took a turn for the worse and we've been struggling ever since.
I want to be honest - my belief is that we are going to get better, but the evidence of the last 6 months is against me. I want to believe my team and I know they are trying hard and doing everything they can to get this fixed. However, I think it's wise to have skepticism given the trajectory of the recent past.
Hope that's helpful and thank you for the comment.
-
Hi Donna,
I think it's fair to feel that way, it's definitely frustrating on all ends. While we do try hard to be as open and upfront as we can with information, we can most definitely work harder on getting it out sooner. I also very much appreciate the kind words about the community, and that you took the time to leave your thoughts. I'm sure many folks feel the same way. I'll let Rand (or others) jump in and respond as well, but I wanted to say thanks!
Jen
-
Before I comment, I want to say I am a loyal member of and contributor to the Moz community. In my opinion, it's unprecedented in its openness, honesty and willingness to help others.
But I also want to express frustration, because I think I'm probably not alone in feeling it. While I understand and appreciate there are issues and Moz is doing everything it possibly can to address them, updates keep slipping and notifications are only given after the fact. We only know if there's going to be an update delay when the deadline passes and there's no change to API data.
Remember Scotty on the original Star Trek series? Chief engineer Montgomery "Scotty" Scott had a reputation for being a miracle worker because he routinely padded his estimates. I see the next update scheduled for November 14 and am worried.
Related Questions
-
Unsolved Regarding Moz API token password update
Hi, in March we updated the password for the Moz API and used it in our application, and it worked. Currently the updated password is not working, and the Moz site shows the old password as the active one. We are using the legacy username and password.
We see that 5 tokens can be added for the API; if we add 2 tokens, both will be active.
We are currently using free services. Please help us resolve this issue.
API | | NickAndrews0 -
Sitemaps and Indexed Pages
Hi guys, I created an XML sitemap and submitted it for my client last month. Now the developer of the site has also been messing around with a few things. I've noticed on my Moz site crawl that indexed pages have dropped significantly. Before I put my foot in it, I need to figure out if submitting the sitemap has caused this.. can a sitemap reduce the pages indexed? Thanks David. TInSM
API | | Slumberjac0 -
First Mozscape index of the year is live
I'm happy to announce, the first index of the year is out. We did have a smaller count of subdomains, but correlations are generally up and coverage of what's in Google looks better, too. We're giving that one a high five! We've (hopefully) removed a lot of foreign and spam subdomains, which you might see reflected in your spam links section. (another woot!)
Here are some details about this index release:
- 145,549,223,632 (145 billion) URLs
- 1,356,731,650 (1 billion) subdomains
- 200,255,095 (200 million) root domains
- 1,165,625,349,576 (1.1 Trillion) links
Followed vs nofollowed links:
- 3.17% of all links found were nofollowed
- 63.49% of nofollowed links are internal
- 36.51% are external
Rel canonical:
- 26.50% of all pages employ the rel=canonical tag
The average page has 89 links on it:
- 72 internal links on average
- 17 external links on average
Thanks!
PS - For any questions about DA/PA fluctuations (or non-fluctuations) check out this Q&A thread from Rand: https://moz.com/community/q/da-pa-fluctuations-how-to-interpret-apply-understand-these-ml-based-scores.
API | | jennita5 -
September's Mozscape Update Broke; We're Building a New Index
Hey gang, I hate to write to you all again with more bad news, but such is life. Our big data team produced an index this week but, upon analysis, found that our crawlers had encountered a massive number of non-200 URLs, which meant this index was not only smaller, but also weirdly biased. PA and DA scores were way off, coverage of the right URLs went haywire, and our metrics that we use to gauge quality told us this index simply was not good enough to launch. Thus, we're in the process of rebuilding an index as fast as possible, but this takes, at minimum, 19-20 days, and may take as long as 30 days.
This sucks. There's no excuse. We need to do better and we owe all of you and all of the folks who use Mozscape better, more reliable updates. I'm embarrassed and so is the team. We all want to deliver the best product, but continue to find problems we didn't account for, and have to go back and build systems in our software to look for them.
In the spirit of transparency (not as an excuse), the problem appears to be a large number of new subdomains that found their way into our crawlers and exposed us to issues fetching robots.txt files that timed out and stalled our crawlers. In addition, some new portions of the link graph we crawled exposed us to websites/pages that we need to find ways to exclude, as these abuse our metrics for prioritizing crawls (aka PageRank, much like Google, but they're obviously much more sophisticated and experienced with this) and bias us to junky stuff which keeps us from getting to the good stuff we need.
We have dozens of ideas to fix this, and we've managed to fix problems like this in the past (prior issues like .cn domains overwhelming our index, link wheels and webspam holes, etc plagued us and have been addressed, but every couple indices it seems we face a new challenge like this). Our biggest issue is one of monitoring and processing times.
We don't see what's in a web index until it's finished processing, which means we don't know if we're building a good index until it's done. It's a lot of work to re-build the processing system so there can be visibility at checkpoints, but that appears to be necessary right now. Unfortunately, it takes time away from building the new, realtime version of our index (which is what we really want to finish and launch!). Such is the frustration of trying to tweak an old system while simultaneously working on a new, better one. Tradeoffs have to be made. For now, we're prioritizing fixing the old Mozscape system, getting a new index out as soon as possible, and then working to improve visibility and our crawl rules. I'm happy to answer any and all questions, and you have my deep, regretful apologies for once again letting you down. We will continue to do everything in our power to improve and fix these ongoing problems.
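For readers wondering how a slow robots.txt can stall a crawler, here is a minimal sketch of fetching it with a hard timeout so that one unresponsive host cannot block the rest of the crawl. It is an assumption-laden illustration, not Moz's crawler code: `fetch_text`, `load_robots`, `stalled_fetch`, and the 5-second default are hypothetical, while `urllib.request.urlopen` and `urllib.robotparser.RobotFileParser` are from the Python standard library.

```python
import urllib.request
import urllib.robotparser

def fetch_text(url, timeout):
    """Download a URL as text, giving up after `timeout` seconds."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")

def load_robots(host, timeout=5.0, fetch=fetch_text):
    """Return parsed robots.txt rules for `host`, or None if the fetch failed."""
    try:
        body = fetch(f"https://{host}/robots.txt", timeout)
    except OSError:
        # URLError, HTTPError and timeouts are all OSError subclasses.
        # This sketch treats every failure the same way; a real crawler
        # would probably distinguish a 404 from a stalled connection.
        return None
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(body.splitlines())
    return parser

# Simulate a host that never answers, without touching the network.
def stalled_fetch(url, timeout):
    raise TimeoutError(f"no response from {url} within {timeout}s")

print(load_robots("slow.example", fetch=stalled_fetch))  # None
```

Returning `None` leaves the decision to the caller, which can skip the host or requeue it for later instead of waiting on it.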
API | | randfish11 -
Lost many links and keyword ranks since moz index update
Hi All, I came back to work today after a week off to find my site has gone from 681 external inbound links to 202. With this, my Domain Authority, MozTrust and MozRank have all also taken a slip. Compounding this, I am seeing a slip in most of my keyword rankings. If I try to use Open Site Explorer to explore my links and see what's going on, I get the message "It looks like we haven't discovered link data for this site or URL." If I check the Just-Discovered Links as it suggests, I get "It looks like there's no Just-Discovered Links data for this URL yet." I know these features worked before the index update, as I used them. Is this all attributable to the Moz index issues that have been noted, or could something have happened to my site? Since I started 2 months ago I have made many changes, including:
- Updating the site map, which was 4 years out of date and included 400 broken URLs
- Removing blank pages and other useless webpages on the site that contained no content (from the previous administrator)
- Editing a few pages' content from keyword-spammy stuff to nicely written and relevant content
- Fixing URL rewrites that made loops and inaccessible product pages
All these changes should be for the better, but the latest readings have me a little worried. Thanks.
API | | ATP0 -
Have Questions about the Jan. 27th Mozscape Index Update? Get Answers Here!
Howdy y'all. I wanted to give a brief update (not quite worthy of a blog post, but more than would fit in a tweet) about the latest Mozscape index update. On January 27th, we released our largest web index ever, with 285 Billion unique URLs, and 1.25 Trillion links. Our previous index was also a record at 217 Billion pages, but this one is another 30% bigger. That's all good news - it means more links that you're seeking are likely to be in this index, and link counts, on average, will go up. There are two oddities about this index, however, that I should share:
The first is that we broke one particular view of data - 301'ing links sorted by Page Authority doesn't work in this index, so we've defaulted to sorting 301s by Domain Authority. That should be fixed in the next index, and from our analytics, doesn't appear to be a hugely popular view, so it shouldn't affect many folks (you can always export to CSV and re-sort by PA in Excel if you need, too - note that if you have more than 10K links, OSE will only export the first 10K, so if you need more data, check out the API).
The second is that we crawled a massively more diverse set of root domains than ever before. Whereas our previous index topped out at 192 million root domains, this latest one has 362 million (almost 1.9X as many unique, new domains we haven't crawled before). This means that DA and PA scores may fluctuate more than usual, as link diversity is a big part of those calculations and we've crawled a much larger swath of the deep, dark corners of the web (and non-US/non-.com domains, too). It also means that, for many of the big, more important sites on the web, we are crawling a little less deeply than we have in the past (the index grew by ~31% while the root domains grew by ~88%). Often, those deep pages on large sites do more internal than external linking, so this might not have a big impact, but it could depend on your field/niche and where your links come from.
As always, my best suggestion is to make sure to compare your link data against your competition - that's a great way to see how relative changes are occurring and whether, generally speaking, you're losing or gaining ground in your field. If you have specific questions, feel free to leave them and I'll do my best to answer in a timely fashion. Thanks much! p.s. You can always find information about our index updates here.
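The growth figures quoted above can be checked with a few lines of arithmetic. This sketch uses only the rounded totals given in the post (217 and 285 billion URLs, 192 and 362 million root domains); the `growth_pct` helper is just an illustrative name.

```python
def growth_pct(old, new):
    """Percentage growth from `old` to `new`."""
    return (new - old) / old * 100

url_growth = growth_pct(217e9, 285e9)    # URLs in the index
root_growth = growth_pct(192e6, 362e6)   # root domains in the index

print(f"URLs grew {url_growth:.1f}%, root domains grew {root_growth:.1f}%")

# Average URLs crawled per root domain, before and after.
per_root_before = 217e9 / 192e6
per_root_after = 285e9 / 362e6
print(f"URLs per root domain: {per_root_before:.0f} -> {per_root_after:.0f}")
```

Because root domains grew much faster than total URLs, the average number of URLs per root domain falls, which is the "crawling a little less deeply" effect described in the post.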
API | | randfish8 -
Suggestion - Should OSE include "citation links" within its index?
This is really a suggestion (and a debate to see if people agree with me) with regard to including "citation links" within Moz tools, by default, as just another type of link. NOTE: when I am talking about "citation links" I am talking about a link that is not wrapped in a link tag and is therefore non-clickable, e.g. moz.com. Obviously Moz have released the mentions tool, which is great, and also FWE, which is also great. However, it would seem to me that they are missing a trick in that "citation links" don't feature in the main link index at all. We know that Google as a minimum uses them as an indicator to crawl a page ( http://ignitevisibility.com/google-confirms-url-citations-can-help-pages-get-indexed/ ), and also that they don't pass PageRank - HOWEVER, you would assume that Google does use them as part of their algorithm in some manner, as they do nofollow links. It would seem to me that a "citation link" could (possibly) be deemed more important than a nofollow link in Google's algorithm, as a nofollow link is a clear indication by the site owner that they don't fully trust the link, but a citation link would indicate neither trust nor non-trust. So - my request is to get "citation links" into the main link index (and the Just Discovered index for that matter). Would others agree??
API | | James770 -
Can any of the MOZ APIs give me the top 10 google results for a keyword?
I'd like to feed a list of keywords into the Moz API and get back a list of the websites that appear on the first page for that keyword. Can anyone tell me if this is possible?
API | | amchapel0