Controlling crawl speed/delay through dynamic server code and 503s
-
Lately I'm experiencing performance trouble caused by bot traffic. Googlebot is not the worst offender (it's mainly bingbot and ahrefsbot), but together they cause heavy server load from time to time. We run a lot of sites on one server, so heavy traffic on one site impacts the other sites' performance.
The problem is that I want a solution that 1) is centrally managed for all sites (per-site administration takes too much time), 2) takes into account total server load instead of only one site's traffic, and 3) controls overall bot traffic instead of traffic from a single bot. In my opinion, user traffic should always be prioritized over bot traffic.
I tried "Crawl-delay:" in robots.txt, but Googlebot doesn't support that. Although my custom CMS system has a solution to centrally manage Robots.txt for all sites at once, it is read by bots per site and per bot, so it doesn't solve 2) and 3).
I also tried controlling crawl speed through Google Webmaster Tools, which works, but again it only controls Googlebot (not other bots) and is administered per site, so it doesn't solve all three problems either.
Now I've come up with a custom-coded solution that dynamically serves 503 HTTP status codes to a portion of the bot traffic. Which portion, and for which bots, can be calculated at runtime from the total server load at that moment. So if a bot makes too many requests within a certain period (or whatever other rule I code up), some requests are answered with a 503 while the rest get content and a 200.
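Roughly, the idea would look like this — a minimal sketch as Python WSGI middleware, where the bot list, the load threshold, and the Retry-After value are placeholders to adapt to your own stack:

```python
# Minimal sketch: WSGI middleware that sheds bot traffic under load.
# The bot list, load threshold, and Retry-After value are placeholders.
import os

BOT_TOKENS = ("bingbot", "ahrefsbot", "googlebot")  # bots to throttle
MAX_LOAD = 4.0  # hypothetical 1-minute load-average threshold

def is_bot(environ):
    """Crude User-Agent check; swap in whatever detection you trust."""
    ua = environ.get("HTTP_USER_AGENT", "").lower()
    return any(token in ua for token in BOT_TOKENS)

def load_shedding(app):
    """Wrap a WSGI app; answer bots with 503 when the server is busy."""
    def middleware(environ, start_response):
        # os.getloadavg() is Unix-only; substitute your own server-load
        # metric elsewhere. Regular users are never throttled.
        if is_bot(environ) and os.getloadavg()[0] > MAX_LOAD:
            start_response("503 Service Unavailable", [
                ("Retry-After", "120"),  # ask the crawler to back off
                ("Content-Type", "text/plain; charset=utf-8"),
            ])
            return [b"Server busy; please retry later."]
        return app(environ, start_response)
    return middleware
```

Sending a Retry-After header with the 503 is the conventional way to signal that the condition is temporary; crawlers can treat it as a hint for when to come back.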
The remaining question is: will dynamically serving 503s have a negative impact on SEO? It will obviously add indexing latency, but slow server response times also hurt rankings, which is arguably worse than a delay in indexing.
I'm curious about your expert opinions...
-
Hi INU,
As a general rule, I like to avoid using things like 503s; there is almost certainly a better way to do it.
What about just using Google Webmaster Tools and Bing Webmaster Tools? Regarding Ahrefs, it depends how much you rely on that tool. If you don't use it, I'd be more inclined to simply block its bot in robots.txt and control Google and Bing through the crawl settings in their respective webmaster tools.
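If you do go the robots.txt route, a minimal block for Ahrefs would look like this (AhrefsBot is documented as obeying robots.txt):

```
User-agent: AhrefsBot
Disallow: /
```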
To answer your specific point about whether 503s can hurt rankings: in general, no, as long as they are only short-term. A 503, like a 404 or any other response code, is a natural part of the web. However, Google has said in the past that persistent 503s can be treated as permanent rather than temporary, and in some cases this can result in pages being removed from the index.
I hope this helps,
Craig