Controlling crawl speed/delay through dynamic server code and 503s
-
Lately I'm experiencing performance trouble caused by bot traffic. Although Googlebot is not the worst offender (it's mainly bingbot and AhrefsBot), bots cause heavy server load from time to time. We run a lot of sites on one server, so heavy traffic on one site impacts the other sites' performance.
The problem is that 1) I want a centrally managed solution for all sites (per-site administration takes too much time), which 2) takes total server load into account instead of only one site's traffic, and 3) controls overall bot traffic instead of traffic for one bot. IMO user traffic should always be prioritized over bot traffic.
I tried "Crawl-delay:" in robots.txt, but Googlebot doesn't support that. Although my custom CMS system has a solution to centrally manage Robots.txt for all sites at once, it is read by bots per site and per bot, so it doesn't solve 2) and 3).
I also tried controlling crawl speed through Google Webmaster Tools, which works, but again it only controls Googlebot (not other bots) and is administered per site. Again, no solution to all three of my problems.
Now I've come up with a custom-coded solution to dynamically serve 503 HTTP status codes to a portion of the bot traffic. Which portion, and for which bots, can be calculated at runtime from the total server load at that moment. So if a bot makes too many requests within a certain period (or breaks whatever other rule I code up), some of its requests will be answered with a 503 while others will get content and a 200.
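To make the idea concrete, here is a minimal sketch as WSGI middleware in Python. The thresholds, bot pattern, and probability curve are placeholder assumptions of mine, not the actual CMS code:

```python
import os
import random
import re

# Placeholder thresholds -- these would need tuning for the actual hardware.
LOAD_SOFT_LIMIT = 4.0   # start throttling bots above this 1-minute load average
LOAD_HARD_LIMIT = 8.0   # shed almost all bot requests above this

# Only ever throttle known crawlers; real users always get normal responses.
BOT_PATTERN = re.compile(r"bingbot|googlebot|ahrefsbot", re.IGNORECASE)

def bot_rejection_probability():
    """Map the current 1-minute load average to a 0..1 chance of serving a 503."""
    load, _, _ = os.getloadavg()  # Unix-only; 1-, 5-, 15-minute load averages
    if load <= LOAD_SOFT_LIMIT:
        return 0.0
    if load >= LOAD_HARD_LIMIT:
        return 0.95  # always let a trickle of bot requests through
    return 0.95 * (load - LOAD_SOFT_LIMIT) / (LOAD_HARD_LIMIT - LOAD_SOFT_LIMIT)

def throttle_bots(app):
    """WSGI middleware: shed bot traffic in proportion to total server load."""
    def middleware(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "")
        if BOT_PATTERN.search(user_agent) and random.random() < bot_rejection_probability():
            start_response("503 Service Unavailable",
                           [("Content-Type", "text/plain"),
                            ("Retry-After", "120")])  # ask the bot to come back later
            return [b"Server busy; crawl rate temporarily reduced."]
        return app(environ, start_response)
    return middleware
```

Well-behaved crawlers should honor the Retry-After hint and back off, so the 503s themselves would do part of the throttling work.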
The remaining question is: will dynamically serving 503s have a negative impact on SEO? Sure, it will delay indexing, but slow server response times also have a negative impact on rankings, which is even worse than indexing latency.
I'm curious about your expert opinions...
-
Hi INU,
As a general rule, I always like to avoid things like 503s. There is almost certainly a better way to do it.
What about just using Google Webmaster Tools and Bing Webmaster Tools? Regarding AhrefsBot, it depends how much you rely on that tool. If you don't use it, then I'd be more inclined to just block that bot in robots.txt, and make sure Google and Bing are controlled using the appropriate settings in their respective webmaster tools.
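If you decide you can live without Ahrefs' link data, blocking its crawler is a two-line robots.txt entry (AhrefsBot does respect robots.txt):

```
User-agent: AhrefsBot
Disallow: /
```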
To answer your specific point about whether or not a 503 can hurt rankings: in general, no, as long as they are only short-term. A 503, like a 404 or any other response code, is a natural part of the web. However, Google has said in the past that repeated 503s can be treated as permanent rather than temporary, and in some cases can result in pages being removed from the index.
I hope this helps,
Craig
Related Questions
-
Google admits it can take up to a year to refresh/recover your site after it is revoked from Penguin!
I found myself in an impossible situation where I was getting information from various people who seemed to be know-it-alls, but everything in my heart was telling me they were wrong about the issues my site was having. I have been on a few Google Webmaster Hangouts and found many answers to questions I thought had caused my Penguin penalty. After taking much of the advice, I submitted my reconsideration request for the 9th time (might have been more) and finally got the "revoke" I was waiting for on the 28th of May. What was frustrating was that on May 22nd there was a Penguin refresh; as far as I knew, a refresh was what was needed to get your site back up in the organic SERPs. My disavow had been submitted in February, and only a handful of links were missing between that time and the time we received the revoke.

We patiently waited for the next Penguin refresh, assured by John Mueller from Google that we were heading in the right direction (BTW, John is a great guy and really tries to help where he can). The next update came on October 4th, and our rankings actually got worse! I spoke with John and he was a little surprised, but did not go into any detail. At this point you have to start to wonder WHAT exactly is wrong with the website. Is this where I should rank? Is there a much deeper Panda issue? We were on the verge of removing almost all content from the site, or even changing domains, despite the fact that it was our brand name.

I then created a tool that checked the last cached date of every link in our disavow file. The thought process was that Google had not re-crawled all the links, so they were not factored into the last refresh. This proved to be incorrect: all the links had been re-cached in August and September, nothing earlier, which would have indicated they had not been cached in time. I spoke to many so-called experts who all said the issue was that we had very few good links left, content issues, etc. Blah blah blah; I'd heard it all before and have been in this game since the late '90s. The site could not rank this badly unless there was an actual penalty, as spam sites ranked above us for most of our keywords.

So, just as we were about to demolish the site, I asked John Mueller one more time if he could take a look at it, and this time he actually took the time to investigate, which was very kind of him. He came back to me in a Google Hangout in late December, and what he said was both disturbing and a relief at the same time: the site STILL had a Penguin penalty, despite the disavow file having been submitted in February, over 10 months earlier, and the revoke in May.

I wrote this to give everyone here who has an authoritative site, or just an old one, hope that not all is lost just yet if you are still waiting to recover in Google. My site is 10 years old and is one of the leaders in its industry. Sites that are only a few years old and have had unnatural-link-building penalties have recovered much faster in this industry, which I find ridiculous, as most of the time the older authoritative sites are the big trustworthy brands. This explains why Google SERPs have been so poor for the last year: the big sites take much longer to recover from penalties, letting the smaller, less trustworthy sites prevail. I hope to see my site recover in the next Penguin refresh, with the comfort of knowing that it is currently still being held back by the Penguin penalty situation. Please feel free to comment below on anything you think is relevant.
White Hat / Black Hat SEO | | gazzerman10 -
If our site hasn't been hit with the Phantom Update, are we clear?
Our SEO provider created a bunch of "unique URL" websites that have exact-match domain names. The content is pretty much the same across over 130 websites (only the city name is different), and they all link directly to our main site. For me this was a huge red flag, but when I questioned them, they said it was fine. We haven't seen a drop in traffic, but I'm concerned that Google just hasn't gotten to us yet. The DA for each of these sites is 1 after several months. Should we be worried? I think yes, but I am an SEO newbie.
White Hat / Black Hat SEO | | Buddys0 -
Website Ranking Differently Based on IP/Data Center
I have a site which I thought was ranking well, but that doesn't seem to be the case. When I check the site from different IPs within the US, some show it on page 1, others show it on page 5, and for some keywords it isn't listed at all. The site was ranking well before, but I think Google dropped it when I was putting too much work into it (articles and press releases). It now seems to have recovered when I check from my own IP, but other data centers still show the pre-recovery results. It recovered after I stopped building links to it for a period of time; the data center I'm connected to shows it has moved back up, but other data centers still show the possibly penalized results. So the question is: why does it show as recovered in some data centers and not others, and how do I fix this? It's been about 2 months since it recovered on some data centers. Is the site still penalized, or what's going on? There are no warnings in Webmaster Tools. Any insights would be appreciated! This isn't an issue with the rank-tracking software; I've tested this on a multitude of IPs and seen the same discrepancies. Thanks!
White Hat / Black Hat SEO | | seomozzy0 -
Are directory listings still appropriate in 2013? Aren't they old-style SEO and Penguin-worthy?
We have been reviewing our off-page SEO strategy for clients, and as part of that process we are looking at a number of superb infographics on the subject. I see that some of the current ones still list "directories" as part of their off-page strategy. Aren't these directories mainly there for link-building purposes, providing users no real benefit? I don't think I've ever seen a directory that I would use, apart from SEO research. Surely Google's Penguin algorithm would see directories in the same way and give them less value, or even penalise websites that use them to try to boost PageRank? If I were to list my websites on directories, it wouldn't be to share my lovely content with people who use directories to find great sites; it would be to sneakily build PageRank. Am I missing the point? Thanks, Scott
White Hat / Black Hat SEO | | Crumpled_Dog0 -
What's the best way to set up 301s from an old off-site subdomain to a new off-site subdomain?
We are moving our online store to a new service and need to create 301s for all of the old product URLs. Given that the old store was hosted off-site, what is the best way to handle the 301 redirects? Thanks!
White Hat / Black Hat SEO | | VermilionDesignInteractive0 -
Google-backed sites' link profiles
Curious what you SEO people think of the link profiles of these (high-ranking) Google-backed UK sites: http://www.opensiteexplorer.org/domains?site=www.startupdonut.co.uk http://www.opensiteexplorer.org/domains?site=www.lawdonut.co.uk http://www.opensiteexplorer.org/domains?site=www.marketingdonut.co.uk http://www.opensiteexplorer.org/domains?site=www.itdonut.co.uk http://www.opensiteexplorer.org/domains?site=www.taxdonut.co.uk Each site has between 40k and 50k inlinks counted in OSE. However, there are relatively few linking root domains in each case: 273 for marketingdonut, 216 for startupdonut, 90 for lawdonut, 53 for itdonut, 16 for taxdonut. Is there something wrong with the OSE data here? Does this imply that the average root domain linking to the taxdonut site does so with 2857 links? The sites have no significant social media stats. The sites are heavily inter-linked. They are also linked from the operating business, BHP Information Solutions (tagline "Gain access to SMEs"). Is this what Google would think of as a "natural" link profile? Interestingly, they've managed to secure links on quite a few UK local authority resources pages - generally being the only commercial website on those pages.
White Hat / Black Hat SEO | | seqal0 -
Herbal Viagra page same DA/PA as UC Davis??
Either there is some amazingly good SEO work going on here, or Google has an amazingly large hole in its metrics. http://nottowait.com/ http://www.ucdavis.edu/index.html The "nottowait" page has a PA of 85?! and a DA of 82?! The page is HORRIBLE. The page itself is an image of another page. The nav bar does not function, nor do any of the "click here" links. At the bottom there is a paragraph of keywords and broken English. This page is pure junk and should simply not have any value at all with respect to DA or PA. It has a ton of incoming links from various sources, which seem to be the source of all this value, which it passes on to other pages. This page really is an affront to the "content is king" concept. I suppose I should ask a question, but all I can think of is: what is Matt Cutts' phone number? I want to ask him how this page has gotten away with being ranked so well for so long.
White Hat / Black Hat SEO | | RyanKent0 -
What does YouTube consider duplicate content, and will it affect my ranking/traffic?
What does YouTube consider duplicate content? If I have a PowerPoint-style video already on YouTube and I want to change the beginning and ending calls to action, would that be considered duplicate content? If yes, how would this affect my ranking/YouTube page? Will it make a difference if I embed it on my blog?
White Hat / Black Hat SEO | | christinarule0