Googlebot Crawl Rate causing site slowdown
-
I am hearing from my IT department that Googlebot is causing a massive slowdown of our site, and even crashing it. We get 3.5 to 4 million pageviews a month and add 70-100 new articles to the website each day. We provide daily stock research and market analysis, so it's all high-quality, relevant content. Here are the crawl stats from WMT:
I have not worked with a lot of high-volume, high-traffic sites before, but these crawl stats do not seem to be out of line. My team is getting pressure from the sysadmins to slow down the crawl rate, or to block some or all of the site from Googlebot.
Do these crawl stats seem in line with those of comparable sites? Would slowing down the crawl rate have a big effect on rankings?
Thanks
-
Similar to Michael, my IT team is saying Googlebot is causing performance issues, specifically during peak hours.
It was suggested that we consider using Apache rewrite rules to serve Googlebot a 503 during our peak hours to limit the impact. I found the Stack Overflow thread (link below) in which John Mueller seems to suggest this approach, but has anyone tried this?
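To make the idea concrete, here is a minimal sketch of the same logic written as Python WSGI middleware rather than as Apache rewrite rules. The peak window (09:00 to 17:00 server time), the one-hour `Retry-After` value, and the class name `ThrottleGooglebot` are all assumptions for illustration, not anything taken from the Stack Overflow thread. It only matches on the user-agent string, which can be spoofed.

```python
from datetime import datetime, time

PEAK_START = time(9, 0)       # assumed start of the peak window (server local time)
PEAK_END = time(17, 0)        # assumed end of the peak window
RETRY_AFTER_SECONDS = "3600"  # assumed hint: ask the crawler to retry in an hour


def is_peak(now):
    """Return True when `now` falls inside the assumed peak window."""
    return PEAK_START <= now.time() < PEAK_END


class ThrottleGooglebot:
    """Hypothetical WSGI middleware: answer Googlebot with a 503 during peak hours."""

    def __init__(self, app, clock=datetime.now):
        self.app = app
        self.clock = clock  # injectable so the behaviour can be checked off-peak

    def __call__(self, environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "")
        if "googlebot" in agent.lower() and is_peak(self.clock()):
            body = b"Service temporarily unavailable\n"
            start_response("503 Service Unavailable", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
                ("Retry-After", RETRY_AFTER_SECONDS),
            ])
            return [body]
        # Everyone else, and Googlebot outside the peak window, is served normally.
        return self.app(environ, start_response)
```

The 503 is meant as a temporary signal. Serving it to Googlebot for long stretches risks pages being dropped from the index, so the window should be kept as short as the load problem allows.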
-
Blocking Googlebot is a quick and easy way to disappear from the index. It's not an option if you want Google to rank your site.
For smaller sites or ones with limited technology, I sometimes recommend using a crawl-delay directive in robots.txt, though Googlebot itself ignores that directive; its crawl rate is set in Webmaster Tools:
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=48620
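For anyone unfamiliar with the directive, here is a small sketch of what a crawl-delay rule looks like and how a crawler that honors it would read it, using Python's standard `urllib.robotparser`. The robots.txt content and the 10-second value are made up for illustration. Crawl-delay is a nonstandard directive: some crawlers respect it, but Googlebot is documented as ignoring it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the 10-second delay is an illustrative value only.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
"""


def crawl_delay_for(robots_txt, user_agent):
    """Return the Crawl-delay (in seconds) that applies to user_agent, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


# A crawler that respects the directive would wait this long between requests.
print(crawl_delay_for(ROBOTS_TXT, "bingbot"))
```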
But I agree with both Shane and Zachary: this doesn't seem like the long-term answer to your problems. Your crawl stats don't seem out of line for a site of your size, and perhaps a better hardware configuration could help things out.
With 70 new articles each day, I'd want Google crawling my site as much as they pleased.
-
Whatever Google's default is in GWT. It sets it for you.
You can change it, but that is not recommended without a specific reason (such as Michael Lewis's scenario). Even so, I am not completely sold that Googlebot is what is causing the "dealbreaking" overhead.
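One way to check that before touching the crawl rate is to measure Googlebot's share of requests and bytes per hour from the web server's access log. The sketch below assumes the logs are in the Apache/Nginx "combined" format. The function name `googlebot_share_by_hour` is hypothetical, and it identifies Googlebot by user-agent string only, so spoofed agents would be counted too.

```python
import re
from collections import defaultdict

# Matches the "combined" access log format (an assumption about the server's logs).
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<day>[^:]+):(?P<hour>\d{2}):[^\]]+\] '
    r'"[^"]*" \d{3} (?P<size>\d+|-) "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_share_by_hour(lines):
    """Return {hour: (googlebot_requests, total_requests, googlebot_bytes, total_bytes)}."""
    stats = defaultdict(lambda: [0, 0, 0, 0])
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        size = 0 if match["size"] == "-" else int(match["size"])
        row = stats[match["hour"]]
        row[1] += 1
        row[3] += size
        if "googlebot" in match["agent"].lower():
            row[0] += 1
            row[2] += size
    return {hour: tuple(row) for hour, row in sorted(stats.items())}
```

Run over a single day's log (for example `googlebot_share_by_hour(open("access.log"))`, where the file name is a placeholder), this shows whether Googlebot's requests actually line up with the hours when the site struggles. If the log spans several days, the same hour from each day is added together.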
-
What is the ideal setting for the crawler? I have been wondering about this for some time.
-
Hi,
Your admins saying that is like someone saying "We need to shut the site down, we are getting too much traffic!" It is a common sysadmin response (fix it somewhere else).
4GB a day downloaded is a lot of bot traffic, but it appears you are a "real-time" site that is probably helped by, and maybe even reliant on, your high crawl rate.
I would upgrade the hardware, or even look into some kind of off-site cloud redundancy for failover (hybrid).
I highly doubt that 4GB a day is a "dealbreaker", but of course that is based on just the one image, and your admins probably have resource monitors. Maybe Varnish is an answer for caching static content to lighten the load? Or a CDN for file hosting to lighten the bandwidth load?
Shane
-
We are hosting the site on our own hardware at a big colo. I know that we are upgrading servers but they will not be online until the end of July.
Thanks!
-
I wouldn't slow the crawl rate. A high crawl rate is good so that Google can keep their index of your website current.
The better solution is to reconsider your hardware and networking setup. Do you know how you are being hosted? From my own experience with a website of that size, a load balancer on two decent dedicated servers should handle the load without problems. Google crawling your pages shouldn't create noticeable overhead on the right setup.