Googlebot Crawl Rate causing site slowdown
-
I am hearing from my IT department that Googlebot is causing as massive slowdown/crash our site. We get 3.5 to 4 million pageviews a month and add 70-100 new articles on the website each day. We provide daily stock research and marke analysis, so its all high quality relevant content. Here are the crawl stats from WMT:
I have not worked with a lot of high volume high traffic sites before, but these crawl stats do not seem to be out of line. My team is getting pressure from the sysadmins to slow down the crawl rate, or block some or all of the site from GoogleBot.
Do these crawl stats seem in line with sites? Would slowing down crawl rates have a big effect on rankings?
Thanks
-
Similar to Michael, my IT team is saying Googlebot is causing performance issues - specifically during peak hours.
It was suggested that we consider using apache re-write rules to serve Googlebot a 503 during our peak hours to limit the impact. I found the stackoverflow thread (link below) in which John Muller seems to suggest this approach, but has anyone tried this?
-
Blocking googlebot is a quick and easy way to disappear from the Index. Not an option if you want Google to rank your site.
For smaller sites or ones with limited technologies, I sometimes recommend using a crawl-delay directive in robots.txt
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=48620
But I agree with both Shane and Zachary, this doesn't seem like the long term answer to your problems. Your crawl stats don't seem out of line for a site of your size, and perhaps a better hardware configuration could help things out.
With 70 new articles each day, I'd want Google crawling my site as much as they pleased.
-
whatever Google's default is in GWT - It sets it for you.
You can change it, but it is not reccomended unless for a specific reason (such as Michael Lewis's specific scenario) even though, I am not completely sold that Gbot is what is causing the "dealbreaking" overhead.
-
what is the ideal setting on the crawler. i have been wondering about this for some time.
-
Hi,
Your admins saying that, is like someone saying "we need to shut the site down, we are getting to much traffic!" Common sys-admin response (fix it somewhere else)
4GB a day downloaded, is alot of Bot traffic, but it appears you are a "real time" site, that is probably actually helped and maybe even reliant on your high crawl rate....
I would upgrade hardware - or even look into some kind of off site cloud redundancy for failover (Hybrid)
I highly doubt that 4GB a day, is a "dealbreaker",but of course that is just based off the one image, and your admins probably have resource monitors - Maybe Varnish is an answer for static content to help lighten load???? Or CDN for file hosting to lighten bandwidth load?
Shane
-
We are hosting the site on our own hardware at a big colo. I know that we are upgrading servers but they will not be online until the end of July.
Thanks!
-
I wouldn't slow the crawl rate. A high crawl rate is good so that Google can keep their index of your website current.
The better solution is to reconsider your hardware and networking setup. Do you know how you are being hosted? From my own experience with a website of that size, a load balancer on two decent dedicated servers should handle the load without problems. Google crawling your pages shouldn't create noticeable overhead on the right setup.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
My site Metrics are not as per they should
Hi, I am regularly making links on my site to improve its metrics but i am confused how other people fastly improve their DA/PA and my DA/PA is not improving with that site. The same happened with spam score. It has been a month i disavow my links having spam score but instead of decrease in it, my spam score increased. Please advice. Is there any special way to use that help moz crawler to check site and update accordingly? Please help
Technical SEO | | AzadSeo37310 -
Launch of improved site
Hi, Just want to ask you guys if i have missed something in my planning. We have done a migration from Ithemes Exchange to woocommerce. The complete migration are done on our dev server. It has an exakt setup as our live one. My plan is to change our live version with a backup from our migrated and finished site from our dev site. All of our product links will be intact with accept from some that we have combined in to new ones, the ones that are changed has been redirected with a 301. Will this way of launching our site effect our ranking/seo in some way? Thankful for any thoughts about this one! // Jonas
Technical SEO | | knubbz0 -
Why Can't Googlebot Fetch Its Own Map on Our Site?
I created a custom map using google maps creator and I embedded it on our site. However, when I ran the fetch and render through Search Console, it said it was blocked by our robots.txt file. I read in the Search Console Help section that: 'For resources blocked by robots.txt files that you don't own, reach out to the resource site owners and ask them to unblock those resources to Googlebot." I did not setup our robtos.txt file. However, I can't imagine it would be setup to block google from crawling a map. i will look into that, but before I go messing with it (since I'm not familiar with it) does google automatically block their maps from their own googlebot? Has anyone encountered this before? Here is what the robot.txt file says in Search Console: User-agent: * Allow: /maps/api/js? Allow: /maps/api/js/DirectionsService.Route Allow: /maps/api/js/DistanceMatrixService.GetDistanceMatrix Allow: /maps/api/js/ElevationService.GetElevationForLine Allow: /maps/api/js/GeocodeService.Search Allow: /maps/api/js/KmlOverlayService.GetFeature Allow: /maps/api/js/KmlOverlayService.GetOverlays Allow: /maps/api/js/LayersService.GetFeature Disallow: / Any assistance would be greatly appreciated. Thanks, Ruben
Technical SEO | | KempRugeLawGroup1 -
How About Moving My Site to Another Domain?
My website has been hit by a Google Penguin penalty. At this point, it seems very few websites have recovered from this penalty even after it was lifted. That said, I'm still wondering if I should start a new website or my current one to another domain. When I started my website, I registered the domain name http://www.thewebhostinghero.com A few years later, I bought webhostinghero.com as it became available. Actually, webhostinghero.com redirects traffic to thewebhostinghero.com at the registrar level. There are barely any links pointing to webhostinghero.com. TheWebHostingHero.com is quite a big website with over 1400 pages and some very useful tools (http://www.thewebhostinghero.com/free-tools) that took me so long to develop, I just can't believe I'd have to throw it all out the window. So that said, do you think it would be possible to move all or some of the content from thewebhostinghero.com to webhostinghero.com? Would that make sense? Would webhostinghero.com come out as a copy of thewebhostinghero.com since the domain is similar, etc.? Of course, moving the content from thewebhostinghero.com to webhostinghero.com would have to be done without any 301/302 redirects to avoid passing bad link juice. At the same time, I don't want to end up with duplicate content on both domains.
Technical SEO | | sbrault740 -
301 redirecting old content from one site to updated content on a different site
I have a client with two websites. Here are some details, sorry I can't be more specific! Their older site -- specific to one product -- has a very high DA and about 75K visits per month, 80% of which comes from search engines. Their newer site -- focused generally on the brand -- is their top priority. The content here is much better. The vast majority of visits are from referrals (mainly social channels and an email newsletter) and direct traffic. Search traffic is relatively low though. I really want to boost search traffic to site #2. And I'd like to piggy back off some of the search traffic from site #1. Here's my question: If a particular article on site #1 (that ranks very well) needs to be updated, what's the risk/reward of updating the content on site #2 instead and 301 redirecting the original post to the newer post on site #2? Part 2: There are dozens of posts on site #1 that can be improved and updated. Is there an extra risk (or diminishing returns) associated with doing this across many posts? Hope this makes sense. Thanks for your help!
Technical SEO | | djreich0 -
New Site Search Critique
Hi I am a huge fan of the SEOMOZ site and this great community which has helped me learn the current SEO skills I have now which are still very basic compared to the pros on the forum. I have tried to follow best practice regarding onsite and technical seo when developing my new site www.cheapfindergames.com and I would really appreciate it if experts on the forum could spare a minute to critique the site from a search perspective please This will give any elements of what onsite and technical SEO I done well and what aspects still need work. I am currently trying to build quality links and social mentions into the site which will take time, and the site has been designed around usability and conversions. Many Thanks Ian
Technical SEO | | ocelot0 -
Video submission sites
Hello, What are the top 5 sites for video submissions ? Any suggestions about which points should be taken into consideration when submitting videos ? Thanks
Technical SEO | | seoug_20050 -
Site Structure question
when deciding the Site structure for a e-commerce site Is it better to keep everything mysite.com/widget.html or use categories like mysite.com/Gifts/widget.html
Technical SEO | | DavidKonigsberg0