10/14 Mozscape Index Update Details
-
Howdy gang,
As you might have seen, we've finally been able to update the Mozscape index after many challenging technical problems in the last 40 days. However, this index has some unique qualities (most of them not ideal) that I should describe.
First, this index still contains data crawled up to 100 days ago. We try to make sure that what we've crawled recently is stuff that we believe has been updated/changed, but there may be sites and pages that have changed significantly in that period that we didn't update (due to issues I've described here previously with our crawlers & schedulers).
Second, many PA/DA and other metric scores will look very similar to the last index because we lost and had problems with some metrics in processing (and believe that much of what we calculated may have been erroneous). We're using metrics from the prior index (which had good correlations with Google, etc) until we can feel confident that the new ones we're calculating are correct. That should be finished by the next index, which, also, should be out much faster than this one (more on that below). Long story short on this one - if your link counts went up and you're seeing much better/new links pointing to you, but DA/PA remain unchanged, don't panic - that's due to problems on our end with calculations and will be remedied in the next index.
Third - the good news is that we've found and fixed a vast array of issues (many of them hiding behind false problems we thought we had), and we now believe we'll be able to ship the next index with greater quality, greater speed, and better coverage. One thing we're now doing is taking every URL we've ever seen in Google's SERPs (via all our rank tracking, SERPscape, the corpus for the upcoming KW Explorer product, etc) and prioritizing them in Mozscape's crawl, so we expect to be matching what Google sees a bit more closely in future indices.
My apologies for the delay in getting this post up - I was on a plane to London for Searchlove - should have got it up before I left.
-
Thank you Jennita and Rand for your quick responses.
Great, let's keep our fingers crossed that all goes well. I'm confident the Moz team can deliver it.
We all take a ride on the ebb and flow roller coaster from time to time; it's what makes us learn more and overcome challenges.
Have a great day
Cheers,
Joseph -
Hi Joseph - yes, I can answer that. We took ~14 days to process this latest index, which is very good news. However, we are having some trouble with the uploading process again - our technical operations team is working with the big data team to try and uncover the source of these problems. If we can get it fixed and working (in the past, the upload step took ~12 hours, now it's taking us 3-4 days), we should have much more regular index releases.
Right now, we are feeling confident about Nov. 17th, and once we complete the upload we'll have a good picture about data quality and whether we might be able to release early (which we think is quite possible IF quality looks good and these upload issues get sorted).
-
Hi Joseph! I'm sure Rand will chime in as well, but I know our engineers are currently working on a write-up that explains the future of the index, plus some of the issues of the past. They're trying to get all the details in there, and hopefully we can get that published by early next week. What I know so far, is that they've fixed some issues and this index is looking much better. I'll let the engineers explain what that means though.
-
Hi Rand,
I hope you're well and life is good.
I was wondering if you can shed some light on the upcoming OSE update scheduled for the 17th of November.
In an earlier post you said "The good news is that we pad every estimate by nearly 2X. In a normal, problem-free index cycle, we can get it done in 12-14 days."
This would indicate you have potentially already run an update and are reviewing the data to ensure it is correct and relevant to the masses before making a general release.
Can you advise whether the trial update has been run and whether it was a success? If not, do you think you'll have the issue solved by the 17th of November 2015?
I'm very eager to report back to my clients with credible insight using the data you provide.
Cheers,
Joseph Gourvenec -
That doesn't surprise me - Majestic has a larger index than Moz (theirs is actually the largest among active 3rd party indices, then Ahrefs, then us).
https://moz.com/blog/big-data-big-problems-link-indexes-compared this is a pretty good resource comparing the strengths and weaknesses of the various indices, and https://builtvisible.com/comparing-link-data-tools/ is also a good, third-party review of the three. There are strengths and weaknesses to each, but if raw link coverage is your goal, I recommend Majestic.
-
Rand, I've sent info and s/s to Kevin (at Moz) in an attempt to find some commonality between GSC, MOZ and HubSpot. 3rd party tools are showing more links than OSE, and are more in line with GSC. For example, Majestic shows 5x the linking domains that OSE does (on the root, not the www). Kevin points at this thread and cites the present OSE data. I'm trying to figure out why redirected/GWT 'moved' domains don't add up links, or even if they are supposed to. I suspect he's powerless. Who/what can I trust?
-
There are a variety of reasons that include:
- This index is somewhat smaller in total links crawled and URLs included
- We may have biased the crawlers towards sites/pages that are less likely to feature links from your site (this is particularly possible if the linking sites were on Chinese, Palau, or several other TLD extensions that we had previously over-indexed)
- The links we previously crawled may have been on relatively low MozRank pages that this index didn't crawl b/c we found fewer links to them (and thus lower MozRank - we tend to crawl in roughly descending MozRank order across the web).
As noted, the next index should see better coverage, fresher data, and better metrics, too. Please let us know if the problem persists and maybe we can compare your WM Tools links vs. our index to see what could be happening. Thanks and apologies.
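To illustrate the third reason above, here is a minimal Python sketch of a score-ordered crawl with a fixed budget. It is not Moz's actual crawler: the function name `crawl_order`, the URLs, and the scores are all hypothetical stand-ins for MozRank-style values. It only shows why pages with low scores are the first to drop out when an index crawls fewer URLs.

```python
import heapq
from typing import Dict, List, Tuple


def crawl_order(scores: Dict[str, float], budget: int) -> List[str]:
    """Return the URLs a budget-limited crawler would fetch, highest score first."""
    # heapq is a min-heap, so negate each score to pop the highest-scored URL first.
    heap: List[Tuple[float, str]] = [(-score, url) for url, score in scores.items()]
    heapq.heapify(heap)

    fetched: List[str] = []
    while heap and len(fetched) < budget:
        _, url = heapq.heappop(heap)
        fetched.append(url)
    return fetched


# Hypothetical scores for illustration only; these are not real MozRank values.
scores = {
    "https://example.com/": 6.2,
    "https://example.com/blog": 4.1,
    "https://example.org/links": 2.3,
    "https://example.net/old-page": 0.9,
}

large_index = crawl_order(scores, budget=4)  # a larger index reaches every page
small_index = crawl_order(scores, budget=2)  # a smaller index stops early
dropped = [url for url in large_index if url not in small_index]
print(dropped)
```

In this toy setup, the two lowest-scored pages are the ones missing from the smaller crawl, so any links they carried would not be counted.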
-
Hey There,
Sorry this wasn't called out, but the number of links in this index is quite a bit lower than our previous releases, due to some of the technical issues Rand calls out here and in the September post (https://moz.com/community/q/september-s-mozscape-update-broke-we-re-building-a-new-index). The next release will remedy the issues you are currently seeing, as we have identified and fixed the bugs that caused the issues. Sorry again for any confusion here.
-
Hi Rand,
We have noticed among many of our clients that there has been a drastic drop in inbound links since the update to the Mozscape index. Our DA/PA have stayed the same, but links have dropped approximately 60-70% across the board. We have looked into the issue in Google Webmaster Tools and have found that we have quite a few more links than are showing up in Moz. Could you explain why this might be happening?
Thanks!
-
We are both in the same boat there. I, too, desperately need this next index to get us on track and provide excellent value. If not, I think we're going to lose a lot of customers, and I'm not sure people will trust us for a long time on link data.
You have my deep and sincere apologies for the frustration and professional challenge Moz has caused. We have an obligation to do better, and I damn sure hope the team is up to delivering on that obligation.
-
I have read the comments and the frustrations associated with the recent issues and would like to suggest that with any software such as this there will be glitches from time to time. We currently use roughly 30 to 40 SaaS providers of many different types and I cannot think of one that hasn't had an issue at one time or another. Having been with Moz for over 5 years I will say that the issues are few and the response is always transparent with frequent updates. (I cannot say that for most other providers).
I would suggest to anyone who is doing client work that as soon as you can afford to have more than one service provider you do so. There are two basic reasons: if there is an issue you always have back up and second, you get the benefit of being able to compare data. Personally, I find this invaluable for client work. I do not feel disloyal to Moz, I just know that every piece of software has its own limitations.
Good luck to all with the current travails.
-
Hi Rand
Firstly I have considered MOZ to be the best SEO information gathering tool of its kind for some years now; and whenever I have taken on a new client or role have recommended to those companies not already using it, to set up an account with MOZ.
I started a new digital marketing role in July which comes with six month KPIs. I've been given a budget to improve SERPs & external inbound links; and right now I'm feeling pretty frustrated and embarrassed as, similar to Joseph I feel the lack of quality data from MOZ over the past few months has left me with egg on my face and having to prove my worth.
I need the next index, which is suggested to be 14 November, to fill me with confidence and my reports with valid data, or I will be forced to look elsewhere. I hope it doesn't come to that.
-
Hi Joseph - you'll get no arguments from me on any of these fronts. I think if you've been using Moz exclusively or primarily for the link data component, you should request a refund by emailing help@moz.com (they'll be happy to provide one). Totally concur that our service the past 60 days on the link data front has not been acceptable.
-
Hi Team Moz,
I have a lot of respect for Team Moz and all the effort behind the scenes that goes into delivering what has been a great service.
The only issue I have now is that I've paid for two months' subscription and still have nothing new to report to my clients, and it is:
1. Making me look foolish, "like I have the wrong provider"
2. Making you look not so credible, "because I tell clients where we get our data from and explained last month that the last time an index update didn't happen in full was years ago"
3. Leaving me out of pocket
a. From subscription
b. From staffing for data analysts to ensure we maximise the new data gained from Moz each "4 to 6 weeks" update.
At the end of the day we pay a fee for the data in OSE, expecting a reasonable level of information to be delivered for the fee we pay.
Rand, sometimes honesty is good, but "Then things took a turn for the worse and we've been struggling ever since." and your other comments aren't filling me with confidence that the next OSE release is going to be an improvement on the last.
As much as I respect you, the Moz team, the service, the historic data, past efforts and the overall community, I pay my subscription for credible, relevant and up-to-date data from OSE to support my digital activities and strategies, which I've not received for the last two months.
I'm sure things will improve, because up is the only way from here, I feel, and it needs to be. You can't expect customers to continue to pay for the core service Moz is known for and offers, which the SEO community relies on to perform and deliver on their client expectations.
I have purposely not used other suppliers for this type of data, from starting out in SEO to this day running my own company, because I feel a certain amount of loyalty to Moz, just as Donna does and the rest of the community most likely do. But there does come a point when options need to be revised.
I hope this message hasn't come across in a disrespectful manner to you or the team at Moz, I just want the best data and to deliver on the expectations I have set my clients; based on that I tell them we only use the best data provider and market leader in the world which is your company.
I look forward to seeing an improvement on the next update.
Cheers,
Joseph -
Hi Donna - you are most certainly not alone in your frustration. I would call my own feelings bordering on desperation. I'm frustrated, angry, nervous, guilty, and overwhelmed with a sense of powerlessness. It seems that every time we think we've identified a problem at the root of our Mozscape issues, things just get worse and new problems we never imagined arise.
On the padding issue, I have good news and depressing news. The good news is that we pad every estimate by nearly 2X. In a normal, problem-free index cycle, we can get it done in 12-14 days.... And yet, we never estimate less than 30-31 days for an index release. In the early part of this year, you might recall that we had a number of indices released back to back in that 2-3 week window. Then things took a turn for the worse and we've been struggling ever since.
I want to be honest - my belief is that we are going to get better, but the evidence of the last 6 months is against me. I want to believe my team and I know they are trying hard and doing everything they can to get this fixed. However, I think it's wise to have skepticism given the trajectory of the recent past.
Hope that's helpful and thank you for the comment.
-
Hi Donna,
I think it's fair to feel that way, it's definitely frustrating on all ends. While we do try hard to be as open and upfront as we can with information, we can most definitely work harder on getting it out sooner. I also very much appreciate the kind words about the community, and that you took the time to leave your thoughts. I'm sure many folks feel the same way. I'll let Rand (or others) jump in and respond as well, but I wanted to say thanks!
Jen
-
Before I comment, I want to say I am a loyal member of and contributor to the Moz community. In my opinion, it's unprecedented in its openness, honesty and willingness to help others.
But I also want to express frustration because I think I'm probably not alone in feeling it. While I understand and appreciate there are issues and Moz is doing everything it possibly can to address them, updates keep slipping and notifications are only given after the fact. We only know if there's going to be an update delay when the deadline passes and there's no change to API data.
Remember Scotty on the original Star Trek series? Chief engineer Montgomery "Scotty" Scott had a reputation for being a miracle worker because he routinely padded his estimates. I see the next update scheduled for November 14 and am worried.
Related Questions
-
Do you fetch website titles from paid api https://moz.com/help/guides/moz-api/mozscape/api-reference/url-metrics?
We are using one of your APIs: https://moz.com/help/guides/moz-api/mozscape/api-reference/url-metrics on our website and it does not show the title for each website. But when I view the same website through your MozBar extension, it does show the title. Can you tell me what is missing here?
API | | SOSCreatives0 -
Sitemaps and Indexed Pages
Hi guys, I created an XML sitemap and submitted it for my client last month. Now the developer of the site has also been messing around with a few things. I've noticed on my Moz site crawl that indexed pages have dropped significantly. Before I put my foot in it, I need to figure out if submitting the sitemap has caused this.. can a sitemap reduce the pages indexed? Thanks David. TInSM
API | | Slumberjac0 -
Mozscape API Updates (Non-updates!) - becoming a joke!
This is the 3rd month in succession where the Mozscape index has been delayed. Myself and clients are losing patience with this, as I am sure many others must be. Just what do you suppose we tell clients waiting for that data? We have incomplete and sometimes skewed metrics to report on, delays which then get delayed further, with nothing but the usual 'we are working on it' and 'bear with us'. It's becoming obvious you fudged the index update back in January (see discussion here with some kind of explanation finally from Rand: https://moz.com/community/q/is-everybody-seeing-da-pa-drops-after-last-moz-api-update), and seems you have been fumbling around ever since trying to fix it, with data all over the place, shifting DA scores and missing links from campaign data. Your developers should be working around the clock to fix this, because this is a big part of what you're selling in your service, and as SEO's and marketers we are relying on that data for client retention and satisfaction. Will you refund us all if we should lose clients over this?! .. I don't think so! With reports already sent out the beginning of the month with incomplete data, I told clients the index would refresh April 10th as informed from the API updates page, only to see it fudged again on day of release with the index being rolled back to previous. So again, I have to tell clients there will be more delays, ...with the uncertainty of IF it WILL EVEN get refreshed when you say it will. It's becoming a joke.. really!
API | | GregDixson2 -
How to retrieve keyword difficulty information using Mozscape API?
Hi, Is it possible to use the Mozscape API to retrieve keyword difficulty information for a list of keywords? I can't find its documentation. Thanks
API | | uceo0 -
/index.php causing a few issues
Hey Mozzers, Our site uses Magento. Pages within the site (not categories or products) are set to display as www.domain.co.uk/page-url/ The .htaccess is set to redirect all versions such as www.domain.co.uk/page-url to a URL ending in a / However, in Google Analytics and in the Moz landing page tracker these URLs are being represented by www.domain.co.uk/page-url/index.php When visiting www.domain.co.uk/page-url/index.php a 404 is displayed. I know that by default, when directed to a directory, the server automatically finds and displays the index file. So I understand why this is happening to some degree. However, when visited manually, this link does not exist. This poses a problem when trying to view the landing page information in Moz Pro. I have 20 keywords being tracked in relation to www.domain.co.uk/page-url/ but because Moz is recording it as www.domain.co.uk/page-url/index.php the keywords are unrelated, so no information is shown in relation to the page. Any ideas?
API | | ATP0 -
The April Index Update is Here!
Don’t adjust your monitors, or think this is an elaborate April Fool’s joke, we are actually releasing our April Index Update EARLY! We had planned to release our April Index Update on the 6th, but processing went incredibly smoothly and left us the ability to get it up today. Let’s dig into the details of the April Index Release:
- 138,919,156,028 (139 billion) URLs
- 746,834,537 (747 million) subdomains
- 190,170,132 (190 million) root domains
- 1,116,945,451,603 (1.1 trillion) links
- Followed vs nofollowed links: 3.02% of all links found were nofollowed; 61.79% of nofollowed links are internal and 38.21% are external
- Rel canonical: 28.14% of all pages employ the rel=canonical tag
- The average page has 90 links on it: 73 internal links and 17 external links on average
Don’t let me hold you up, go dive into the data! PS - For any questions about DA/PA fluctuations (or non-fluctuations) check out this Q&A thread from Rand: https://moz.com/community/q/da-pa-fluctuations-how-to-interpret-apply-understand-these-ml-based-scores
API | | IanWatson9 -
September's Mozscape Update Broke; We're Building a New Index
Hey gang, I hate to write to you all again with more bad news, but such is life. Our big data team produced an index this week but, upon analysis, found that our crawlers had encountered a massive number of non-200 URLs, which meant this index was not only smaller, but also weirdly biased. PA and DA scores were way off, coverage of the right URLs went haywire, and our metrics that we use to gauge quality told us this index simply was not good enough to launch. Thus, we're in the process of rebuilding an index as fast as possible, but this takes, at minimum, 19-20 days, and may take as long as 30 days.
This sucks. There's no excuse. We need to do better and we owe all of you and all of the folks who use Mozscape better, more reliable updates. I'm embarrassed and so is the team. We all want to deliver the best product, but continue to find problems we didn't account for, and have to go back and build systems in our software to look for them.
In the spirit of transparency (not as an excuse), the problem appears to be a large number of new subdomains that found their way into our crawlers and exposed us to issues fetching robots.txt files that timed out and stalled our crawlers. In addition, some new portions of the link graph we crawled exposed us to websites/pages that we need to find ways to exclude, as these abuse our metrics for prioritizing crawls (aka PageRank, much like Google, but they're obviously much more sophisticated and experienced with this) and bias us to junky stuff which keeps us from getting to the good stuff we need. We have dozens of ideas to fix this, and we've managed to fix problems like this in the past (prior issues like .cn domains overwhelming our index, link wheels and webspam holes, etc plagued us and have been addressed, but every couple indices it seems we face a new challenge like this).
Our biggest issue is one of monitoring and processing times. We don't see what's in a web index until it's finished processing, which means we don't know if we're building a good index until it's done. It's a lot of work to re-build the processing system so there can be visibility at checkpoints, but that appears to be necessary right now. Unfortunately, it takes time away from building the new, realtime version of our index (which is what we really want to finish and launch!). Such is the frustration of trying to tweak an old system while simultaneously working on a new, better one. Tradeoffs have to be made.
For now, we're prioritizing fixing the old Mozscape system, getting a new index out as soon as possible, and then working to improve visibility and our crawl rules. I'm happy to answer any and all questions, and you have my deep, regretful apologies for once again letting you down. We will continue to do everything in our power to improve and fix these ongoing problems.
API | | randfish11
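As a rough illustration of the "visibility at checkpoints" idea in the September post above, the Python sketch below checks the share of non-200 responses at regular intervals instead of only once processing has finished. It is an assumption-laden toy, not Moz's pipeline: the function name `first_failed_checkpoint`, the checkpoint interval, and the 20% threshold are all hypothetical.

```python
from typing import Iterable, Optional


def first_failed_checkpoint(
    statuses: Iterable[int],
    checkpoint_every: int,
    max_non_200_ratio: float,
) -> Optional[int]:
    """Return how many URLs had been processed at the first checkpoint where
    the share of non-200 responses exceeded the threshold, or None if every
    checkpoint passed."""
    non_200 = 0
    for processed, status in enumerate(statuses, start=1):
        if status != 200:
            non_200 += 1
        # Only evaluate quality at checkpoint boundaries, not on every URL.
        if processed % checkpoint_every == 0:
            if non_200 / processed > max_non_200_ratio:
                return processed
    return None


# Hypothetical crawl results for illustration only.
healthy_crawl = [200] * 9 + [404]
broken_crawl = [200, 404, 500, 200, 503, 404, 200, 404, 500, 404]

print(first_failed_checkpoint(healthy_crawl, checkpoint_every=5, max_non_200_ratio=0.2))
print(first_failed_checkpoint(broken_crawl, checkpoint_every=5, max_non_200_ratio=0.2))
```

With a check like this, a crawl dominated by non-200 URLs could in principle be flagged at an early checkpoint rather than after a full processing run.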