Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Should we use Google's crawl delay setting?
-
We’ve been noticing a huge uptick in Google’s spidering lately, and along with it a notable worsening of render times.
Yesterday, for example, Google spidered our site at a rate of 30:1 (google spider vs. organic traffic.) So in other words, for every organic page request, Google hits the site 30 times.
Our render times have lengthened to an avg. of 2 seconds (and up to 2.5 seconds). Before this renewed interest Google has taken in us we were seeing closer to one second average render times, and often half of that.
A year ago, the ratio of Spider to Organic was between 6:1 and 10:1.
Is requesting a crawl-delay from Googlebot a viable option?
Our goal would be only to reduce Googlebot traffic, and hopefully improve render times and organic traffic.
Thanks,
Trisha
-
Unfortunately you can't change crawl settings for Google in a robots.txt file, they just ignore it. The best way to rate limit them is using custom Crawl settings in Google Webmaster Tools. (look under Site configuration > Settings)
You also might want to consider using your loadbalancer to direct Google (and other search engines) to a "condomised" group of servers (app, db, cache, search) thereby ensuring your users arent inadvertantly hit by perfomance issues caused by over zealous bot crawling.
-
We're a publisher, which means that as an industry our normal render times are always at the top of the chart. Ads are notoriously slow to load, and that's how we earn our keep. These results are bad, though, even for publishing.
We're serving millions of uniques a month, on a bank of dedicated servers hosted off site, load balanced, etc.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Will Google crawl and rank our ReactJS website content?
We have 250+ products dynamically inserted and sorted on our site daily (more specifically our homepage... yes, it's a long page). Our dev team would like to explore rendering the page server-side using ReactJS. We currently use a CDN to cache all the content, which of course we would like to continue using. SO... will Google be able to crawl that content? We've read some articles with different ideas (including prerendering): http://andrewhfarmer.com/react-seo/
Technical SEO | | Jane.com
http://www.seoskeptic.com/json-ld-big-day-at-google/ If we were to only load the schema important to the page (like product title, image, price, description, etc.) from the server and then let the client render the remaining content (comments, suggested products, etc.), would that go against best practices? It seems like that might be seen as showing the googlebot 1 version and showing the site visitor a different (more complete) version.0 -
Strange URL's for client's site
We just picked up a new client and I've been doing some digging around on their site. They have quite the wide variety of URL's that make for a rather confusing experience. One of the milder examples is their "About" page. Normally I would expect something along the lines of: www.website.com/about I see: www.website.com/default.asp?Page=About I'm typically a graphic designer and know basically nothing about code, but I just assume this has something funky to do with how their website was constructed. I'm assuming this isn't particularly SEO friendly, but it doesn't seem too bad. Until I got to another section of their site. It's a section that logically should look like: www.website.com/training/public-seminars It's: www.website.com/default.asp?Page=MT&Area=Seminars&Sub=MRM Now that's nonsensical to me! Normally if a client has terrible URL's, I'd say let's do some redirects, but I guess I'm a little intimidated by these. Do the URL's have to be structured like this for some reason? Am I missing some important area of coding here? However, the most bizarre example is a link back to their website from yellowpages.com. Where normally I would expect it to lead to their homepage, I get this bizarre-looking thing: http://website1-px.rtrk.com/?utm_source=ReachLocal&utm_medium=PPC&utm_campaign=AssetManagement&reference_id=15&publisher=yellowpages&placement=ypwebsitemip&action_target=listing_website And as you browse through the site, that strange domain stays. For example the About page is now: http://website1-px.rtrk.com/default.asp?Page=About I would try to google this but I have no idea where to even start! What is going on with these links? Will we be able to fix them to something presentable without breaking their website?
Technical SEO | | everestagency0 -
Why is Google's cache preview showing different version of webpage (i.e. not displaying content)
My URL is: http://www.fslocal.comRecently, we discovered Google's cached snapshots of our business listings look different from what's displayed to users. The main issue? Our content isn't displayed in cached results (although while the content isn't visible on the front-end of cached pages, the text can be found when you view the page source of that cached result).These listings are structured so everything is coded and contained within 1 page (e.g. http://www.fslocal.com/toronto/auto-vault-canada/). But even though the URL stays the same, we've created separate "pages" of content (e.g. "About," "Additional Info," "Contact," etc.) for each listing, and only 1 "page" of content will ever be displayed to the user at a time. This is controlled by JavaScript and using display:none in CSS. Why do our cached results look different? Why would our content not show up in Google's cache preview, even though the text can be found in the page source? Does it have to do with the way we're using display:none? Are there negative SEO effects with regards to how we're using it (i.e. we're employing it strictly for aesthetics, but is it possible Google thinks we're trying to hide text)? Google's Technical Guidelines recommends against using "fancy features such as JavaScript, cookies, session IDs, frames, DHTML, or Flash." If we were to separate those business listing "pages" into actual separate URLs (e.g. http://www.fslocal.com/toronto/auto-vault-canada/contact/ would be the "Contact" page), and employ static HTML code instead of complicated JavaScript, would that solve the problem? Any insight would be greatly appreciated.Thanks!
Technical SEO | | fslocal0 -
Can I use a 410'd page again at a later time?
I have old pages on my site that I want to 410 so they are totally removed, but later down the road if I want to utilize that URL again, can I just remove the 410 error code and put new content on that page and have it indexed again?
Technical SEO | | WebServiceConsulting.com0 -
Google insists robots.txt is blocking... but it isn't.
I recently launched a new website. During development, I'd enabled the option in WordPress to prevent search engines from indexing the site. When the site went public (over 24 hours ago), I cleared that option. At that point, I added a specific robots.txt file that only disallowed a couple directories of files. You can view the robots.txt at http://photogeardeals.com/robots.txt Google (via Webmaster tools) is insisting that my robots.txt file contains a "Disallow: /" on line 2 and that it's preventing Google from indexing the site and preventing me from submitting a sitemap. These errors are showing both in the sitemap section of Webmaster tools as well as the Blocked URLs section. Bing's webmaster tools are able to read the site and sitemap just fine. Any idea why Google insists I'm disallowing everything even after telling it to re-fetch?
Technical SEO | | ahockley0 -
Blank pages in Google's webcache
Hello all, Is anybody experiencing blanck page's in Google's 'Cached' view? I'm seeing just the page background and none of the content for a couple of my pages but when I click 'View Text Only' all of teh content is there. Strange! I'd love to hear if anyone else is experiencing the same. Perhaps this is something to do with the roll out of Google's updates last week?! Thanks,
Technical SEO | | A_Q
Elias0 -
How long will Google take to stop crawling an old URL once it has been 301 redirected
I need to do a clean-up old urls that have been redirected in sitemap and was wondering about this.
Technical SEO | | Ant-8080 -
A question about RSS feeds and nofollow's
With the nofollow tag used very widely on the internet these days I was just wondering about how an RSS feed might help me find a way around it. Basically my question is this : I post a comment on a blog, it's approved and my comment together with my link(nofollow tag applied) is there. Now when the blogs RSS feed updates, does this nofollow tag get applied to the feed? As far as I can tell it does not - but I'm not too clue'd up on how the feed is generated. Anyone want to help me understand how it works and if what I'm suggesting would be 'a way around the nofollow tag' ? Thanks 🙂
Technical SEO | | DanHill0