Controlling crawl speed/delay through dynamic server code and 503s
-
Lately I've been having performance trouble caused by bot traffic. Googlebot is not the worst offender (it's mainly bingbot and AhrefsBot), but together the bots cause heavy server load from time to time. We run a lot of sites on one server, so heavy traffic on one site hurts the other sites' performance.
The problem is that 1) I want a centrally managed solution for all sites (per-site administration takes too much time), which 2) takes into account total server load instead of only one site's traffic and 3) controls overall bot traffic instead of controlling traffic for one bot. IMO user traffic should always be prioritized over bot traffic.
I tried "Crawl-delay:" in robots.txt, but Googlebot doesn't support that. Although my custom CMS can centrally manage robots.txt for all sites at once, the file is read by bots per site and per bot, so it doesn't solve 2) and 3).
I also tried controlling crawl-speed through Google Webmaster Tools, which works, but again it only controls Googlebot (and not other bots) and is administered per site. No solution to all three of my problems.
Now I've come up with a custom-coded solution that dynamically serves 503 HTTP status codes to a certain portion of the bot traffic. Which portion, and for which bots, can be calculated dynamically (at runtime) from the total server load at that moment. So if a bot makes too many requests within a certain period (or trips whatever other coded rule I invent), some of its requests will be answered with a 503 while others will get content and a 200.
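A minimal sketch of that idea in Python (the post does not name a language, so this is only an illustration). Everything named here is an assumption: the `BOT_TOKENS` list, the `low`/`high` load thresholds, the linear ramp between them, and the optional `Retry-After` header with its 120-second value. Reading the load average via `os.getloadavg()` only works on Unix-like systems.

```python
import os
import random

# Hypothetical list of user-agent substrings that mark a request as bot traffic.
BOT_TOKENS = ("googlebot", "bingbot", "ahrefsbot")


def is_bot(user_agent: str) -> bool:
    """Return True if the user-agent string contains a known bot token."""
    ua = user_agent.lower()
    return any(token in ua for token in BOT_TOKENS)


def current_load_per_cpu() -> float:
    """1-minute load average divided by CPU count (Unix-like systems only)."""
    return os.getloadavg()[0] / (os.cpu_count() or 1)


def reject_fraction(load_per_cpu: float, low: float = 0.7, high: float = 1.5) -> float:
    """Share of bot requests to refuse: 0 below `low`, 1 above `high`, linear in between.

    The thresholds are illustrative assumptions, not recommended values.
    """
    if load_per_cpu <= low:
        return 0.0
    if load_per_cpu >= high:
        return 1.0
    return (load_per_cpu - low) / (high - low)


def choose_status(user_agent: str, load_per_cpu: float, rng=random.random):
    """Return (status_code, extra_headers) for one incoming request.

    Human traffic always gets 200. Bot traffic gets 503 with a probability
    equal to reject_fraction(load_per_cpu).
    """
    if not is_bot(user_agent):
        return 200, {}
    if rng() < reject_fraction(load_per_cpu):
        # Retry-After is an optional hint (in seconds); the value is an assumption.
        return 503, {"Retry-After": "120"}
    return 200, {}
```

In a real setup, `choose_status(request_user_agent, current_load_per_cpu())` would run early in the shared request handler for all sites, so the decision depends on total server load rather than on any single site's traffic.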
The remaining question is: will dynamically serving 503s have a negative impact on SEO? OK, it will slow down indexing, but slow server response times do in fact have a negative impact on ranking, which is even worse than indexing latency.
I'm curious about your expert opinions...
-
Hi INU,
As a general rule, I always like to avoid using things like 503s. There is almost certainly a better way to do it.
What about just using Google Webmaster Tools and Bing Webmaster Tools? Regarding Ahrefs, it depends how much you rely on that tool. If you don't use it, then I'd be more likely to just block that bot in robots.txt and make sure Google and Bing are controlled using the appropriate settings in their respective webmaster tools.
To answer your specific point about whether or not 503s can hurt rankings: in general no, as long as they are only short-term. A 503, like a 404 or any other response code, is a natural part of the web. However, Google has said in the past that repeated 503s can be treated as permanent rather than temporary, and in some cases can result in the pages being removed from the index.
I hope this helps,
Craig