Crawl Diagnostics only crawling 250 pages, not 10,000
-
My crawl diagnosis has suddenly dropped from 10,000 pages to just 250. I've been tracking and working on an ecommerce website with 102,000 pages (www.heatingreplacementparts.co.uk), and the history for this was showing some great improvements. Suddenly the Crawl Diagnostics report today is showing only 250 pages! What has happened? Not only is this frustrating to work with, as I was chipping away at the errors and warnings, but my graphs for reporting to my client are now all screwed up. I have a Pro plan and nothing has (or should have!) changed.
-
Hey Scott,
I just checked your campaigns and everything looks good right now. We're really sorry for any inconvenience this may have caused. Let me explain what happened and what we have done to make sure it doesn't happen again.
Over the weekend our server hosting provider experienced temporary power outages that lasted a few hours. Some of our databases that contain user membership status went offline, and our crawlers assumed the affected campaigns had been archived; when the database servers came back online, the crawlers then treated those campaigns as freshly unarchived.
Our practice has been to kick off a 250-page starter crawl when a campaign is unarchived and to schedule the full crawl for 7 days out, much like what happens when you first create a campaign. (Your campaign would still have received a full crawl on its next scheduled date.) This isn't ideal for a couple of reasons: a scenario like the one over the weekend, and the fact that a 250-page crawl stuck in the middle of your history skews your historical data, even when the archiving was intentional.
Moving forward, we will be implementing a change so that when you unarchive a campaign, your full crawl is scheduled and you don't receive a starter crawl. If you need more immediate crawl data, I recommend using our crawl test tool, which can crawl up to 3,000 pages. The only difference is that the results come as a CSV file rather than through the web interface.
Let me know if you have any additional questions. Also, if you experience any issues with your service in the future, go ahead and let our support team know: at seomoz.org/help you can generate a help ticket quite easily, and our Help Team will keep you up to date on anything affecting your account and work with you to resolve it as quickly as possible.
Again, my sincere apologies for this issue with your crawl.
Have a great day!
Kenny
-
Many thanks Keri
-
Hi Scott,
We have rolled out a fix for this! I'm waiting to hear how long it will take to get through the backlog of crawls, but did want to let you know that your campaign is being worked on.
Keri
-
Thanks Keri. If you could please keep me informed, that will help me explain this to clients.
regards,
Scott.
-
I think we've had a bug, Scott. A couple of SEOmoz staff also got emails that the starter crawl had finished. We're looking into what happened, and we really apologize. I'm assigning this to the help desk, and they'll comment when we have more information.
-
If you ran the crawler today, then yes: SEOmoz runs a 250-page starter crawl by default, and the crawler then takes up to 7 days to scan all of your website's pages.
Related Questions
-
Website cannot be crawled
I have received the following message from Moz on a few of our websites now: "Our crawler was not able to access the robots.txt file on your site. This often occurs because of a server error from the robots.txt. Although this may have been caused by a temporary outage, we recommend making sure your robots.txt file is accessible and that your network and server are working correctly. Typically errors like this should be investigated and fixed by the site webmaster."

I have spoken with our webmaster and they have advised the below: The robots.txt file is definitely there on all the sites, and Google is able to crawl it. Moz, however, is having difficulty finding the file when a particular redirect is in place. For example, the page currently redirects from threecounties.co.uk/ to https://www.threecounties.co.uk/, and when this happens the Moz crawler cannot find the robots.txt on the first URL, which generates the reports you have been receiving. From what I understand, this is a flaw with the Moz software and not something that we could fix from our end. Going forward, something we could do is remove these rewrite rules to www., but these are useful redirects and removing them would likely have SEO implications.

Has anyone else had this issue, and is there anything we can do to rectify it, or should we leave it as is?
Moz Pro | threecounties
-
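On the robots.txt question above: the redirect itself is easy to inspect in your browser's network tab or server logs, and once you have the file's contents, Python's standard library can sanity-check that the rules don't block a given crawler. A minimal offline sketch (the rules and URLs below are made up for illustration; "rogerbot" is the user-agent Moz documents for its crawler):

```python
# Parse a robots.txt body offline and check whether a crawler's
# user-agent may fetch given URLs. Paste in your real file's contents;
# the rules below are illustrative only.
from urllib.robotparser import RobotFileParser

robots_body = """\
User-agent: *
Disallow: /admin/
"""

rp = RobotFileParser()
rp.parse(robots_body.splitlines())

print(rp.can_fetch("rogerbot", "https://www.example.com/"))         # -> True
print(rp.can_fetch("rogerbot", "https://www.example.com/admin/x"))  # -> False
```

This only verifies the rules themselves; if the file is unreachable behind the redirect, as the webmaster described, no parser will ever see it, so checking the HTTP status of the bare-domain /robots.txt request is still the first step.
-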
Is my 404 page set up correctly?
Hi there! I'm using a few different tools and have received varied results concerning my 404 page. WebSite Auditor from Link Assistant says my page is set up incorrectly, but my webmaster has told me it is set up correctly, and at this point I'm not entirely sure what's what. Is there another way to test it? I guess I'm not quite sure about the jargon and what "return 404 response code" actually means. Thanks for your help!
Moz Pro | NutcrackerBalletGifts
-
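To unpack the jargon in the question above: "returns a 404 response code" means the HTTP status the server sends for a missing page must be the number 404, not 200, even if the visible page says "not found" (a 200 with error text is a so-called soft 404). A self-contained way to see the difference, using a throwaway local server rather than any real site:

```python
# A throwaway local server: "/" answers 200, anything else answers 404.
# What matters to crawlers is this numeric status code, not the page body.
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        code = 200 if self.path == "/" else 404
        self.send_response(code)
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        self.wfile.write(b"<html>some page body</html>")

    def log_message(self, *args):  # keep request logging quiet
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0 = pick a free port
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

def status_of(path):
    try:
        with urllib.request.urlopen(f"http://127.0.0.1:{port}{path}") as r:
            return r.status
    except urllib.error.HTTPError as e:
        return e.code  # urllib raises on 4xx/5xx; the status code is here

ok = status_of("/")
missing = status_of("/no-such-page")
server.shutdown()
print(ok, missing)  # -> 200 404
```

To test a live site the same way, request a deliberately nonsense URL on your own domain (e.g. /this-page-does-not-exist) and check the code that comes back.
-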
Crawl Diagnostics 403 on home page...
In Crawl Diagnostics it says oursite.com/ has a 403. It doesn't say what's causing it, but it mentions no robots.txt. There is a robots.txt file, and I see no problems with it. How can I find out more information about this error?
Moz Pro | martJ
-
How to add a single page to a campaign
Hello, my domain is www.artes-plasticas-pollock.com. The domain itself targets one keyword, but inside it there are more pages to be positioned for other keywords. For example, the page http://www.artes-plasticas-pollock.com/index-rafael-navarro.php should be positioned for the keyword "Rafael Navarro". How can I configure this? Should I create a new campaign, or is it possible to track this page inside the existing campaign for the main URL www.artes-plasticas-pollock? Any information will be appreciated. Thanks, Pilar.
Moz Pro | OkTuWeb
-
Campaign Not Crawling
I set up my first 5 campaigns and one is not crawling beyond one page. It's been over 48 hours. This site has nearly 3,500 pages, the others far fewer, though that shouldn't make any difference. I searched for the problem and couldn't find it, so I hope this question isn't redundant. Comments and advice would be appreciated.
Moz Pro | JavaManOne
-
Redirecting duplicate .asp pages??
Hi all, I have a bit of a problem with duplicate content on our website. The CMS has been creating identical duplicate pages depending on which menu route a user takes to get to a product (i.e. via the side menu button or the top menu bar). Anyway, the web design company we use is sorting it out going forward and creating 301 redirects on the duplicate pages.

My question is, some of the duplicates take different forms. E.g. for the home page:

www.<mydomain>.co.uk
www.<mydomain>.co.uk/index.html
www.<mydomain>.co.uk/index.asp

Now I understand the 'index.html' page should be redirected, but does 'index.asp' need to be redirected also? What makes this more confusing is that when I run the SEOMoz diagnostics report (which brought my attention to the duplicate content issue in the first place, thanks SEOMoz), not all the .asp pages are identified as duplicates. For example, the above 'index.asp' page is identified as a duplicate, but 'contact-us.asp' is not highlighted as a duplicate of 'contact-us.html'? I'm a bit new to all this (I'm not an IT specialist), so any clarification anyone can give would be appreciated. Thanks, Gareth
Moz Pro | gdavies09031977
-
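On the general question of index.html/index.asp duplicates above (setting aside the report quirk): any alternate URL that serves the same content as the canonical home page, .asp included, is normally 301-redirected to it. A sketch of what the fixed behaviour looks like, using a throwaway local server as a stand-in for the real site (the paths here mirror the question; nothing below is the actual site's configuration):

```python
# A local stand-in server that 301-redirects the duplicate home-page URLs
# to the canonical "/", then verifies both duplicates land on it.
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path in ("/index.html", "/index.asp"):
            self.send_response(301)               # permanent redirect
            self.send_header("Location", "/")     # to the canonical URL
            self.end_headers()
        else:
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"home")

    def log_message(self, *args):  # keep request logging quiet
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

results = {}
for path in ("/index.html", "/index.asp"):
    # urlopen follows the 301 automatically; geturl() shows the final URL
    with urllib.request.urlopen(f"http://127.0.0.1:{port}{path}") as r:
        results[path] = (r.status, r.geturl())

server.shutdown()
print(results)
```

Against the live site, you would make the same two requests and confirm that each first response is a 301 with a Location header pointing at the canonical URL.
-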
SEOMoz only crawling 5 pages of my website
Hello, I've added a new website to my SEOmoz campaign tool. It only crawls 5 pages of the site. I know the site has way more pages then this and also has a blog. Google shows at least 1000 results indexed. Am I doing something wrong? Could it be that the site is preventing a proper crawl? Thanks Bill
Moz Pro | | wparlaman0 -
Initial Crawl Questions
Hello. I just joined and used the Crawl tool. I have many questions and am hoping the community can offer some guidance.

1. I received an Excel file with 3k+ records. Is there a friendly online viewer for the Crawl report, or is the Excel file the only output?

2. Assuming the Excel file is the only output: the Time Crawled field is a number (i.e. 1305798581). I have tried changing the field to a date/time format, but that did not work. How can I view the field as a normal date/time, such as May 15, 2011 14:02?

3. I use the ™ symbol in my title. This symbol appears in the output as a few ASCII characters. Is that a concern? Should I remove the trademark symbol from my title?

4. I am using XenForo forum software. All forum threads automatically receive a title tag and meta description as part of a template, yet the Crawl Test report shows my title tag and meta description as blank for many threads. I have looked at the source code of several pages and they all have clean title tags, and I don't understand why the Crawl Report doesn't show them. Any ideas?

5. In some cases the HTTP Status Code field shows a result of "3". What does that mean?

6. For every URL in the Crawl Report there is an entry in the Referrer field. What exactly is the relationship between these fields? I thought the Crawl tool would inspect every page on the site. If a page doesn't have a referring page, is it missed? What if a page has multiple referring pages? How is that information displayed?

7. Under Google Webmaster Tools > Site Configuration > Settings > Parameter Handling I have the options set to either "Ignore" or "Let Google Decide" for various URL parameters. These are "pages" of my site which should mostly be ignored. For example, a forum may have 7 headers, each of which can be sorted in ascending or descending order; the only page that matters is the initial page, and all the rest should be ignored by Google and by the crawl. Presently there are 11 records for many pages which really should only have one record due to these various sort parameters. Can I configure the crawl so it ignores parameter pages?

I am anxious to get started on my site. I dove into the crawl results and it's just too messy in its present state for me to pull out any actionable data. Any guidance would be appreciated.
Moz Pro | RyanKent
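A note on question 2 above: the Time Crawled value (1305798581 in the example) is a Unix timestamp, i.e. seconds elapsed since 1970-01-01 00:00 UTC, which is why Excel's date formatting alone doesn't decode it. Python converts it directly, and in Excel the usual workaround is =cell/86400+25569 with a date format applied to the result:

```python
# Convert a Unix timestamp (seconds since the 1970 epoch, UTC) into a
# readable date. 1305798581 is the example value from the question.
from datetime import datetime, timezone

ts = 1305798581
dt = datetime.fromtimestamp(ts, tz=timezone.utc)
print(dt.strftime("%B %d, %Y %H:%M:%S"))  # -> May 19, 2011 09:49:41
```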