Does it really take 3-7 days to crawl a site?
-
This is what you say on your website:
Please note: depending on the size and speed of your site, it may take between 3 and 7 days to complete your crawl.
Are you guys kidding? This is unacceptable for almost all of my deadlines, and I suspect 99% of the SEO world... you can fly to the moon faster.
-
I hear you - it's a frustration, but it's the reality of a system that needs to crawl 30K+ sites each week without hitting anyone's server too hard or being a bother to a SysAdmin, all while keeping up with a massively complex, ever-changing queue.
That said, we do have a custom crawl tool you can fire anytime and usually get data back within just a few hours! It's here - http://www.seomoz.org/labs/cc
The intent/value behind the crawl inside the campaign is much less about a specific, one-time crawl, and more about having data every week, with historical information showing progress, updates and warnings (in case something goes wrong). There are lots of good free tools as well for single-purpose crawls, e.g. http://www.seomoz.org/blog/xenu-link-sleuth-more-than-just-a-broken-links-finder
Also, just to be totally clear - the system running the crawls for the web app in your campaigns is different from the Linkscape web index (which only updates every 3-4 weeks). Eventually, the two might merge, but we didn't want to bias any crawling inside Linkscape when we launched the web app last September.
-
The time it takes is not based on your site alone; there are so many sites and links being crawled, all of which take processing power and bandwidth.
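To put rough numbers on the points above, here is a back-of-the-envelope sketch in Python of how a polite pause between requests plus time spent waiting in a shared queue can add up to days. Everything in it is a made-up assumption for illustration (the function names, the 5-second delay, the 38-hour queue wait); it is not a description of how the SEOmoz crawler actually schedules work.

```python
def fetch_hours(pages: int, delay_seconds: float) -> float:
    """Hours spent fetching if the crawler pauses delay_seconds between requests.

    delay_seconds is a hypothetical politeness delay, not a documented setting.
    """
    return pages * delay_seconds / 3600


def estimate_crawl_days(pages: int, delay_seconds: float, queue_wait_hours: float) -> float:
    """Total turnaround in days: time waiting in the shared queue plus fetch time.

    queue_wait_hours is a hypothetical stand-in for everyone else's crawls
    that are ahead of yours.
    """
    return (queue_wait_hours + fetch_hours(pages, delay_seconds)) / 24


if __name__ == "__main__":
    # Hypothetical site: 7,200 pages, one request every 5 seconds,
    # and 38 hours spent waiting for a slot in the queue.
    days = estimate_crawl_days(pages=7200, delay_seconds=5.0, queue_wait_hours=38.0)
    print(f"Estimated turnaround: {days:.1f} days")
```

Under these invented numbers, the fetching itself is only a fraction of the total; most of the wait comes from the queue, which is the point made above about the time not depending on your site alone.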
Related Questions
-
SEOmoz crawler not crawling my site
We set up a new campaign in SEOmoz on Friday. It is my understanding that the preliminary crawl can cover up to 250 pages, and this has been our experience in the past. However, the preliminary crawl only went through 2 pages. This is a larger eCommerce site with many pages. Any ideas why more pages weren't crawled? We set up the campaign to track at the root domain level.
Moz Pro | | IMM0 -
Why is my domain not being crawled anymore?
I just noticed that right around 12/1/2012, SEOMoz stopped crawling all but two pages out of the 400 or so on my website at www.TrustworthyCare.com. I speculate that this is probably due to some dumb mistake I made at that time, but I can't for the life of me figure out what that mistake was. Before that, the weekly crawls included all 400 or so pages. I wonder whether it's something that changed in our .htaccess file. Here's how that file looks now; can anyone see what is wrong there, or perhaps offer other suggestions if it doesn't look like anything is wrong in it? Thanks! Tim
PS - I'm a small business owner, not an SEO or software engineer.
PPS - I found and read this page, but I've pretty much tried the things described there (I think): https://seomoz.zendesk.com/entries/409821-why-isn-t-my-site-being-crawled-you-re-not-crawling-all-my-pages
=================================
RewriteCond %{HTTP_HOST} ^aservantsheartcare.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.aservantsheartcare.com$
RewriteRule ^/?$ "http://trustworthycare.com/" [R=301,L]
RewriteCond %{HTTP_HOST} ^aservantsheartcaremanagement.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.aservantsheartcaremanagement.com$
RewriteRule ^/?$ "http://trustworthycare.com/" [R=301,L]
RewriteCond %{HTTP_HOST} ^aservantsheartgeriatriccare.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.aservantsheartgeriatriccare.com$
RewriteRule ^/?$ "http://trustworthycare.com/" [R=301,L]
RewriteCond %{HTTP_HOST} ^aservantsheartgeriatriccaremanagement.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.aservantsheartgeriatriccaremanagement.com$
RewriteRule ^/?$ "http://trustworthycare.com/" [R=301,L]
RewriteCond %{HTTP_HOST} ^aservantshearthomecare.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.aservantshearthomecare.com$
RewriteRule ^/?$ "http://trustworthycare.com/" [R=301,L]
RewriteCond %{HTTP_HOST} ^aservantsheartseniorcare.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.aservantsheartseniorcare.com$
RewriteRule ^/?$ "http://trustworthycare.com/" [R=301,L]
RewriteCond %{HTTP_HOST} ^aservantsheartservices.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.aservantsheartservices.com$
RewriteRule ^/?$ "http://trustworthycare.com/" [R=301,L]
RewriteCond %{HTTP_HOST} ^careforparents.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.careforparents.com$
RewriteRule ^/?$ "http://trustworthycare.com/" [R=301,L]
RewriteCond %{HTTP_HOST} ^eldercareradio.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.eldercareradio.com$
RewriteRule ^/?$ "http://trustworthycare.com/" [R=301,L]
RewriteCond %{HTTP_HOST} ^helpforyourparents.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.helpforyourparents.com$
RewriteRule ^/?$ "http://trustworthycare.com/" [R=301,L]
RewriteCond %{HTTP_HOST} ^privatedutyseniorcare.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.privatedutyseniorcare.com$
RewriteRule ^/?$ "http://trustworthycare.com/" [R=301,L]
RewriteCond %{HTTP_HOST} ^sandiegocaremanagement.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.sandiegocaremanagement.com$
RewriteRule ^/?$ "http://trustworthycare.com/" [R=301,L]
RewriteCond %{HTTP_HOST} ^sandiegocaremanager.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.sandiegocaremanager.com$
RewriteRule ^/?$ "http://trustworthycare.com/" [R=301,L]
RewriteCond %{HTTP_HOST} ^sandiegogeriatriccaremanagement.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.sandiegogeriatriccaremanagement.com$
RewriteRule ^/?$ "http://trustworthycare.com/" [R=301,L]
RewriteCond %{HTTP_HOST} ^sandiegogeriatriccaremanager.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.sandiegogeriatriccaremanager.com$
RewriteRule ^/?$ "http://trustworthycare.com/" [R=301,L]
RewriteCond %{HTTP_HOST} ^servantsheartcare.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.servantsheartcare.com$
RewriteRule ^/?$ "http://trustworthycare.com/" [R=301,L]
RewriteCond %{HTTP_HOST} ^servantshearthomecare.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.servantshearthomecare.com$
RewriteRule ^/?$ "http://trustworthycare.com/" [R=301,L]
RewriteCond %{HTTP_HOST} ^servantsheartseniorcare.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.servantsheartseniorcare.com$
RewriteRule ^/?$ "http://trustworthycare.com/" [R=301,L]
RewriteCond %{HTTP_HOST} ^tlccare.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.tlccare.com$
RewriteRule ^/?$ "http://trustworthycare.com/" [R=301,L]
RewriteCond %{HTTP_HOST} ^tlcseniorcenter.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.tlcseniorcenter.com$
RewriteRule ^/?$ "http://trustworthycare.com/" [R=301,L]
RewriteCond %{HTTP_HOST} ^tlcseniorhomecare.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.tlcseniorhomecare.com$
RewriteRule ^/?$ "http://trustworthycare.com/" [R=301,L]
RewriteCond %{HTTP_HOST} ^tlcseniorservices.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.tlcseniorservices.com$
RewriteRule ^/?$ "http://trustworthycare.com/" [R=301,L]
#php_value upload_max_filesize 8M
RewriteCond %{HTTP_HOST} ^trustworthycare.com$
RewriteRule ^(.)$ "http://www.trustworthycare.com/$1" [R=301,L]
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://blog.trustworthycare.com/.$ [NC]
RewriteCond %{HTTP_REFERER} !^http://blog.trustworthycare.com$ [NC]
RewriteCond %{HTTP_REFERER} !^http://test.trustworthycare.com/.$ [NC]
RewriteCond %{HTTP_REFERER} !^http://test.trustworthycare.com$ [NC]
RewriteCond %{HTTP_REFERER} !^http://trustworthycare.com/.$ [NC]
RewriteCond %{HTTP_REFERER} !^http://trustworthycare.com$ [NC]
RewriteCond %{HTTP_REFERER} !^http://www.blog.trustworthycare.com/.$ [NC]
RewriteCond %{HTTP_REFERER} !^http://www.blog.trustworthycare.com$ [NC]
RewriteCond %{HTTP_REFERER} !^http://www.test.trustworthycare.com/.$ [NC]
RewriteCond %{HTTP_REFERER} !^http://www.test.trustworthycare.com$ [NC]
RewriteCond %{HTTP_REFERER} !^http://www.trustworthycare.com/.$ [NC]
RewriteCond %{HTTP_REFERER} !^http://www.trustworthycare.com$ [NC]
RewriteCond %{HTTP_REFERER} !^http://www.trustworthycare.com/images/files_for_service_inquiries/.$ [NC]
RewriteCond %{HTTP_REFERER} !^http://www.trustworthycare.com/images/files_for_service_inquiries$ [NC]
RewriteCond %{HTTP_REFERER} !^http://sandbox.trustworthycare.com/.$ [NC]
RewriteCond %{HTTP_REFERER} !^http://sandbox.trustworthycare.com$ [NC]
RewriteRule ..(jpg|jpeg|gif|png|bmp)$ - [F,NC]
RewriteCond %{HTTP_HOST} ^ashsc.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.ashsc.com$
RewriteRule ^/?$ "http://trustworthycare.com/" [R=301,L]
# BEGIN W3TC Browser Cache
BrowserMatch ^Mozilla/4 gzip-only-text/html
BrowserMatch ^Mozilla/4.0[678] no-gzip
BrowserMatch \bMSIE !no-gzip !gzip-only-text/html
BrowserMatch \bMSI[E] !no-gzip !gzip-only-text/html
Header append Vary User-Agent env=!dont-vary
AddOutputFilterByType DEFLATE text/css application/x-javascript text/x-component text/html text/richtext image/svg+xml text/plain text/xsd text/xsl text/xml image/x-icon
<filesmatch ".(css|js|htc|css|js|htc)$"=""></filesmatch>
FileETag None
Header set X-Powered-By "W3 Total Cache/0.9.2.5"
<filesmatch ".(html|htm|rtf|rtx|svg|svgz|txt|xsd|xsl|xml|html|htm|rtf|rtx|svg|svgz|txt|xsd|xsl|xml)$"=""></filesmatch>
FileETag None
Header set X-Powered-By "W3 Total Cache/0.9.2.5"
<filesmatch ".(asf|asx|wax|wmv|wmx|avi|bmp|class|divx|doc|docx|eot|exe|gif|gz|gzip|ico|jpg|jpeg|jpe|mdb|mid|midi|mov|qt|mp3|m4a|mp4|m4v|mpeg|mpg|mpe|mpp|otf|odb|odc|odf|odg|odp|ods|odt|ogg|pdf|png|pot|pps|ppt|pptx|ra|ram|svg|svgz|swf|tar|tif|tiff|ttf|ttc|wav|wma|wri|xla|xls|xlsx|xlt|xlw|zip|asf|asx|wax|wmv|wmx|avi|bmp|class|divx|doc|docx|eot|exe|gif|gz|gzip|ico|jpg|jpeg|jpe|mdb|mid|midi|mov|qt|mp3|m4a|mp4|m4v|mpeg|mpg|mpe|mpp|otf|odb|odc|odf|odg|odp|ods|odt|ogg|pdf|png|pot|pps|ppt|pptx|ra|ram|svg|svgz|swf|tar|tif|tiff|ttf|ttc|wav|wma|wri|xla|xls|xlsx|xlt|xlw|zip)$"=""></filesmatch>
FileETag None
Header set X-Powered-By "W3 Total Cache/0.9.2.5"
# END W3TC Browser Cache
# BEGIN W3TC Page Cache core
RewriteEngine On
RewriteBase /
RewriteRule ^(./)?w3tc_rewrite_test$ $1?w3tc_rewrite_test=1 [L]
RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteRule . - [E=W3TC_ENC:gzip]
RewriteCond %{REQUEST_METHOD} !=POST
RewriteCond %{QUERY_STRING} =""
RewriteCond %{HTTP_HOST} =www.trustworthycare.com
RewriteCond %{REQUEST_URI} /$ [OR]
RewriteCond %{REQUEST_URI} (sitemap(index)?.xml(.gz)?|[a-z0-9-]+-sitemap([0-9]+)?.xml(.gz)?) [NC]
RewriteCond %{REQUEST_URI} !(/wp-admin/|/xmlrpc.php|/wp-(app|cron|login|register|mail).php|/feed/|wp-.*.php|index.php) [NC,OR]
RewriteCond %{REQUEST_URI} (wp-comments-popup.php|wp-links-opml.php|wp-locations.php) [NC]
RewriteCond %{HTTP_COOKIE} !(comment_author|wp-postpass|wordpress[a-f0-9]+|wordpress_logged_in) [NC]
RewriteCond %{HTTP_USER_AGENT} !(W3\ Total\ Cache/0.9.2.5) [NC]
RewriteCond "%{DOCUMENT_ROOT}/sitectrl/wp-content/w3tc/pgcache/%{REQUEST_URI}/_index%{ENV:W3TC_UA}%{ENV:W3TC_REF}%{ENV:W3TC_SSL}.html%{ENV:W3TC_ENC}" -f
RewriteRule .* "/sitectrl/wp-content/w3tc/pgcache/%{REQUEST_URI}/_index%{ENV:W3TC_UA}%{ENV:W3TC_REF}%{ENV:W3TC_SSL}.html%{ENV:W3TC_ENC}" [L]
# END W3TC Page Cache core
# BEGIN WordPress
RewriteEngine On
RewriteBase /
RewriteRule ^index.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
# END WordPress
RewriteCond %{HTTP_HOST} ^privatedutycare.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.privatedutycare.com$
RewriteRule ^/?$ "http://www.ageassistance.com" [R=301,L]
=================================
Moz Pro | | tcolling0 -
Crawl Diagnostics Warnings - Duplicate Content
Hi All, I am getting a lot of warnings about duplicate page content. The pages are normally 'tag' pages. I have some news stories or blog posts tagged with multiple 'tags'. Should I ask google not to index the tag pages? Does it really affect my site? Thanks
Moz Pro | | skehoe0 -
Usable to set up campaign because site cannot be
I don't understand this message. I never had problems with other sites, but now I get this message when trying to set up a campaign for 2 different sites. I received the same message both times. What do I do? Help! "We have detected that the root domain xxxxxxxxxxxxxxxxxxxx does not respond to web requests. Using this domain, we will be unable to crawl your site or present accurate SERP information." Thanks.
Moz Pro | | mcuneo0 -
My Campaign has been crawling for about a week now
Can anyone tell me why one of my campaigns has been stuck in crawl mode for about a full week and it is still not done?!?!
Moz Pro | | nazmiyal0 -
Unsubscribe to weekly crawl notifications never works
Hello! All of my campaigns have the box 'Weekly crawl completed for campaign ...' unticked under Campaign Settings, yet for all of them I still receive an email regularly with the subject 'New crawl completed for ...'. How do I stop this? Is there a bug here? Adam Bishop
Moz Pro | | arbishop0 -
Is there such thing as a site free from errors?
Or is this a given? I am new to SEO and SEOmoz. One of my campaigns is completely free of errors... the others are a work in progress. Now I realize that SEO is never done, but can a site actually be free of errors? If so... I just gave myself a pat on the back.
Moz Pro | | AtoZion0