Crawl Diagnostic | Starter crawl has taken 14 hrs so far
-
We started a starter crawl 14 hours ago and it's still going. Can anyone explain why this is taking so long when the interface says '2 hrs'?
Thanks,
Rory
-
Hi Rory. Most of our help desk is on holiday today, since it's the Fourth of July in the States. We do have a record of your ticket, and of one other person who is having a slow starter crawl, and a help desk specialist is looking into this now. Sorry for the delays.
Keri
-
I've asked but haven't heard back yet; I think I'll wait to hear.
Thanks for your help, appreciate it.
-
Send an email to help (at) seomoz.org for someone to have a look.
-
It's a fairly big site, but it does say:
'To get you started quickly Roger is crawling up to 250 pages on your site. You should see these results within two hours. The full crawl will complete within 7 days.'
There's no option to do anything else, like cancel or reset; it just says 'Starter crawl in progress'. It's been 16 hrs now, and it's a bit frustrating, as I needed to send this through to a client this morning. Is anyone from SEOmoz around to look into this?
-
And here is how you reset the crawl:
1. On your webserver, edit the robots.txt file.
2. Block the seomoz bot from crawling the site by blocking its access to the root.
You can do so by adding the following lines:
User-agent: rogerbot
Disallow: /
This would end the crawl session.
But before you do this, it may be a good idea to check whether your site really does have a lot of content and outgoing links.
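A quick way to sanity-check that those robots.txt lines will actually stop rogerbot is Python's standard urllib.robotparser. This is a minimal sketch; example.com stands in for your own domain:

```python
from urllib.robotparser import RobotFileParser

# The rules suggested above, as they would appear in robots.txt.
rules = """\
User-agent: rogerbot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# rogerbot is blocked from the entire site starting at the root...
print(parser.can_fetch("rogerbot", "https://example.com/"))       # False
# ...while other well-behaved crawlers are unaffected.
print(parser.can_fetch("Googlebot", "https://example.com/page"))  # True
```

If the first call prints False, rogerbot is blocked site-wide. Remember to remove the rule again once the stuck crawl has ended, or future crawls will be blocked too.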
-
Rory,
What is the sub-domain that you are crawling? It may just be that there is a lot of content to crawl.
-
How would I reset the crawl? There doesn't appear to be an option to do so.
-
Rory,
I would guess that this crawl session has hung; it would be a good idea to start a new session. The session could have stalled partway through due to a server-side issue on your website or a temporary drop in the connection between the API server and your website's server.
Related Questions
-
Unsolved Site Crawl Stalled and Can't Restart
In my GreenSeed campaign, the site crawl continues to say "in progress." I can't figure out how to stop it or how to restart the site crawl. Can you please help?
Moz Pro | Winger1
-
Google Analytics skewed data because of ghost referral spam and crawl bots
Hi guys, we are having some major problems with our Google Analytics and Moz accounts. Due to the large number of ghost/referral spam and crawler bots, we have added some heavy filtering to GA. This seems to be protecting the data from all these problems, but it is also filtering out much-needed data. For example, we used to get at least a hundred visitors a day, and now we are down to under ten. Anybody, please help; we have read through many articles without finding a permanent, solid solution (we're even willing to go with a paid service instead of GA). Thank you so much, S.M.
Moz Pro | KristyKK0
-
Block Moz (or any other robot) from crawling pages with specific URLs
Hello! Moz reports around 380 duplicate page content issues on my site. Most of them come from dynamically generated URLs that have some specific parameters. I have sorted this out for Google in Webmaster Tools (the new Google Search Console) by blocking the pages with these parameters. However, Moz is still reporting the same number of duplicate content pages and, to stop it, I know I must use robots.txt. The trick is that I don't want to block every page, just the pages with specific parameters. Among these 380 pages there are some other pages with no parameters (or different parameters) that I need to take care of. Basically, I need to clean this list to be able to use the feature properly in the future. I have read through the Moz forums and found a few related topics, but there is no clear answer on how to block only pages with specific URLs. I have therefore done my research and come up with these lines for robots.txt:
User-agent: dotbot
Disallow: /*numberOfStars=0

User-agent: rogerbot
Disallow: /*numberOfStars=0
My questions: 1. Are the above lines correct, and would they block Moz (dotbot and rogerbot) from crawling only the pages that have the numberOfStars=0 parameter in their URLs, leaving other pages intact? 2. Do I need an empty line between the two groups, i.e. between "Disallow: /*numberOfStars=0" and "User-agent: rogerbot" (or does it even matter)? I think this would help many people, as there is no clear answer on how to block crawling only pages with specific URLs. Moreover, this should be valid for any robot out there. Thank you for your help!
Moz Pro | Blacktie
-
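For anyone puzzling over how such wildcard rules behave: Google-style robots.txt matching treats '*' as "any sequence of characters", '$' as an end-of-URL anchor, and otherwise matches rules as path prefixes. A simplified, illustrative model in Python (not any bot's actual implementation):

```python
import re

def robots_rule_matches(rule: str, url_path: str) -> bool:
    """Simplified model of Google-style robots.txt matching:
    '*' matches any character sequence, a trailing '$' anchors
    the end of the URL, and rules otherwise match as prefixes."""
    pattern = re.escape(rule).replace(r"\*", ".*")
    if pattern.endswith(r"\$"):
        pattern = pattern[:-2] + "$"
    return re.match(pattern, url_path) is not None

# A rule like the one in the question blocks any URL containing the parameter...
print(robots_rule_matches("/*numberOfStars=0",
                          "/products?color=red&numberOfStars=0"))  # True
# ...but leaves pages with other parameter values alone.
print(robots_rule_matches("/*numberOfStars=0",
                          "/products?numberOfStars=5"))            # False
```

On the blank-line question: the original 1994 robots.txt convention separates groups with a blank line, and modern parsers generally accept either form, so keeping the empty line between the dotbot and rogerbot groups is the safe choice.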
Can I see when SEO Moz has crawled my website?
I would like to know if it's possible to see (maybe in my Google Analytics) whether SEOmoz has crawled my website. I'm also curious if and where I can see when Google's robot visited my website. Thanks!
Moz Pro | Spotler0
-
DotNetNuke generating long URLs showing up as crawl errors!
Since early July, a DotNetNuke site has been generating long URLs that are showing in campaigns as crawl errors: long URL, duplicate content, duplicate page title. URL: http://www.wakefieldpetvet.com/Home/tabid/223/ctl/SendPassword/Default.aspx?returnurl=%2F Is this a problem with DNN or a nuance to be ignored? Can it be controlled? Google Webmaster Tools shows no crawl errors like this.
Moz Pro | EricSchmidt0
-
How to remove Duplicate content due to url parameters from SEOMoz Crawl Diagnostics
Hello all, I'm currently getting back over 8,000 crawl errors for duplicate content pages. It's a Joomla site with VirtueMart, and 95% of the errors are for parameters in the URL that the customer can use to filter products. Google is handling them fine under Webmaster Tools parameters, but it's pretty hard to find the other duplicate content issues in SEOmoz with all of these in the way. All of the problem parameters start with ?product_type_ Should I try to use robots.txt to stop them from being crawled, and if so, what would be the best way to include them in robots.txt? Any help greatly appreciated.
Moz Pro | dfeg0
-
Crawl slow again
Once again the weekly crawl on my site is very slow. I have around 441 pages in the crawl and this has been running for over 12 hours. This last happened two weeks ago (ran for over 48 hours). Last week's crawl was much quicker (not sure exactly how long but guessing an hour or so). Is this a known issue and is there anything that can be done to unblock it? Weekends are the best time for me to assess and respond to changes I have made to my site so having this (small) crawl take most of the weekend is really quite problematic. Thanks. Mark
Moz Pro | MarkWill0
-
Can you help me get started using the crawl diagnostics report?
After getting the crawl diagnostics report for the first time, my boss and I looked over it, and we have tried to fix the problems, but we are stumped. I have watched videos, read books, etc., but have found nothing to help. I need assistance getting started on improving my website. Can you help?
Moz Pro | WVInjuryLawyer0