Should we use Google's crawl delay setting?
-
We’ve been noticing a huge uptick in Google’s spidering lately, and along with it a notable worsening of render times.
Yesterday, for example, Google spidered our site at a rate of 30:1 (google spider vs. organic traffic.) So in other words, for every organic page request, Google hits the site 30 times.
Our render times have lengthened to an avg. of 2 seconds (and up to 2.5 seconds). Before this renewed interest Google has taken in us we were seeing closer to one second average render times, and often half of that.
A year ago, the ratio of Spider to Organic was between 6:1 and 10:1.
Is requesting a crawl-delay from Googlebot a viable option?
Our goal would be only to reduce Googlebot traffic, and hopefully improve render times and organic traffic.
Thanks,
Trisha
-
Unfortunately you can't change crawl settings for Google in a robots.txt file, they just ignore it. The best way to rate limit them is using custom Crawl settings in Google Webmaster Tools. (look under Site configuration > Settings)
You also might want to consider using your loadbalancer to direct Google (and other search engines) to a "condomised" group of servers (app, db, cache, search) thereby ensuring your users arent inadvertantly hit by perfomance issues caused by over zealous bot crawling.
-
We're a publisher, which means that as an industry our normal render times are always at the top of the chart. Ads are notoriously slow to load, and that's how we earn our keep. These results are bad, though, even for publishing.
We're serving millions of uniques a month, on a bank of dedicated servers hosted off site, load balanced, etc.
-
more info on that here: http://www.robotstxt.org/
-
Wow! those are really high render times. Have you considered perhaps moving to another webserver? NginX is pretty damm fast, and could probably get those render times down. Also, are you on a shared host? or is this a dedicated server?
What you're looking for is the robots.txt file though, and you want to add some lines like this:
User-agent: * Disallow: Crawl-Delay: 10 User-agent: ia_archiver Disallow: / User-agent: Ask Jeeves Crawl-Delay: 120 User-agent: Teoma Disallow: /html/ Crawl-Delay: 120
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Why is my site crawled so much more in Bing then in Google?
I recently setup Cloudflare so I can see how much my site is being crawled. It looks like Bing is crawling me about 3 times as much as Google. Any ideas on why that would be or what I should check?
Technical SEO | | EcommerceSite0 -
Is there a way for me to automatically download a website's sitemap.xml every month?
From now on we want to store all our sitemap.xml over the next years. Its a nice archive to have that allows us to analyse how many pages we have on our website and which ones were removed/redirected. Any suggestions? Thanks
Technical SEO | | DeptAgency0 -
My blog page isn't ranking in Google
Hi, I noticed that my blog page on my site isn't in Google when i search for full URL link http://www.asggutter.com/blog/ instead i see page that isn't even working asggutter.com/sitemap.xml screen shot http://screencast.com/t/6OVFLwL8nTL How i can i fix that. Thanks
Technical SEO | | tonyklu0 -
How to get out of Google's sendbox
Hello, i posted this question before here in forum, that 2 of my pages were sendboxed but never had a clear answer on how to get them back up, i do know that i need to build high quality backlinks pointing to those pages, but where do i start? Thanks
Technical SEO | | tonyklu0 -
What's best practice for blog meta titles?
I have the option of placing meta titles on the actual blog, or on the blog category on my site. Should I have separate meta titles for each blog or bundle them under a category and try to drive traffic to the category? Can anyone help with best practice?
Technical SEO | | Lubeman0 -
Has Google stopped rendering author snippets on SERP pages if the author's G+ page is not actively updated?
Working with a site that has multiple authors and author microformat enabled. The image is rendering for some authors on SERP page and not for others. Difference seems to be having an updated G+ page and not having a constantly updating G+ page. any thoughts?
Technical SEO | | irvingw0 -
We have a decent keyword rich URL domain that's not being used - what to do with it?
We're an ecommerce site and we have a second, older domain with a better keyword match URL than our main domain (I know, you may be wondering why we didn't use it, but that's beside the point now). It currently ranks fairly poorly as there's very few links pointing to it. However, the exact match URL means it has some value, if we were to build a few links to it. What would you do with it: 301 product/category pages to current site's equivalent page Link product/category pages to current site's equivalent page Not bother using it at all Something else
Technical SEO | | seanmccauley0 -
Crawl Tool Producing Random URL's
For some reason SEOmoz's crawl tool is returning duplicate content URL's that don't exist on my website. It is returning pages like "mydomain.com/pages/pages/pages/pages/pages/pricing" Nothing like that exists as a URL on my website. Has anyone experienced something similar to this, know what's causing it, or know how I can fix it?
Technical SEO | | MyNet0