Controlling crawl speed/delay through dynamic server-code and 503's
-
Lately i'm experiencing performance trouble caused by bot traffic. Although Googlebot is not the worst (it's mainly bingbot and ahrefsbot), they cause heavy server load from time to time. We run a lot of sites on one server, so heavy traffic on one site impacts other site's performance.
Problem is that 1) I want a centrally managed solution for all sites (per site administration takes too much time), which 2) takes into account total server-load in stead of only 1 site's traffic and 3) controls overall bot-traffic in stead of controlling traffic for one bot. IMO user-traffic should always be prioritized higher than bot-traffic.
I tried "Crawl-delay:" in robots.txt, but Googlebot doesn't support that. Although my custom CMS system has a solution to centrally manage Robots.txt for all sites at once, it is read by bots per site and per bot, so it doesn't solve 2) and 3).
I also tried controlling crawl-speed through Google Webmaster Tools, which works, but again it only controls Googlebot (and not other bots) and is administered per site. No solution to all three of my problems.
Now i came up with a custom-coded solution to dynamically serve 503 http status codes to a certain portion of the bot traffic. What traffic-portion for which bots can be dynamically (runtime) calculated from total server load at that certain moment. So if a bot makes too much requests within a certain period (or whatever other coded rule i'll invent), some requests will be answered with a 503 while others will get content and a 200.
Remaining question is: Will dynamically serving 503's have a negative impact on SEO? OK, it will delay indexing speed/latency, but slow server-response-times do in fact have a negative impact on the ranking, which is even worse than indexing-latency.
I'm curious about your expert's opinions...
-
Hi INU,
I always like avoid using things like 503s as a general rule. There is almost certainly a better way to do it.
What about just using Google webmaster tools and Bing webmaster tools? Regarding HREFs it depends how much you rely on that tool. If you don't use it, then I'd more more likely to just block that bot in robots.txt and make sure Google and Bing are controlled using the appropriate tools in the respective webmaster tools.
To answer your specific point about whether or not 503 can hurt rankings. In general no as long as they are only short-term. A 503 like 404s or any other response code is a natural part of the web, however, Google has said in the past that repetitive 503s can be treated as permanent rather than temporary and in some cases can result in the pages being removed from the index.
I hope this helps,
Craig
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Changing Links to Spans with Robots.txt Blocked Redirects using Linkify/jQuery
Hi, I was recently penalized most likely because Google started following javascript links to bad neighborhoods that were not no-followed. The first thing I did was remove the Linkify plugin from my site so that all those links would disappear, but now I think I have a solution that works with Linkify without creating crawlable links. I did the following: I blocked access to the Linkify scripts using robots.txt so that Google won't execute the scripts that create the links. This has worked for me in the past with banner ads linking to other sites of mine. At least it appears to work because those sites did not get links from pages running those banners in search console. I created a /redirect/ directory that redirects all offsite URLs. I put a robots.txt block on this directory. I configured the Linkify plugin to parse URLs into span elements instead of a elements and add no follow attributes. They still have an href attribute, but the URLs in the href now point to the redirect directory and the span onclick event redirects the user. I have implemented this solution on another site of mine and I am hoping this will make it impossible for Google to categorize my pages as liking to any neighborhoods good or bad. Most of the content is UGC, so this should discourage link spam while giving users clickable URLs and still letting people post complaints about people that have profiles on adult websites. Here is a page where the solution has been implemented https://cyberbullyingreport.com/bully/predators-watch-owner-scott-breitenstein-of-dayton-ohio-5463.aspx, the Linkify plugin can be found at https://soapbox.github.io/linkifyjs/, and the custom jQuery is as follows: jQuery(document).ready(function ($) { 2 $('p').linkify({ tagName: 'span', attributes: { rel: 'nofollow' }, formatHref: function (href) { href = 'https://cyberbullyingreport.com/redirect/?url=' + href; return href; }, events:{ click: function (e) { var href = $(this).attr('href'); window.location.href = href; } } }); 3 });
White Hat / Black Hat SEO | | STDCarriers0 -
I'm seeing thousands of no-follow links on spam sites. Can you help figure it out?
I noticed that we are receiving thousands of links from many different sites that are obviously disguised as something else. The strange part is that some of them are legitimate sites when you go to the root. I would say 99% of the page titles read something like : 1 Hour Loan Approval No Credit Check Vermont, go cash advance - africanamericanadaa.com. Can someone please help me? Here are some of the URL's we are looking at: http://africanamericanadaa.com/genialt/100-dollar-loans-for-people-with-no-credit-colorado.html http://muratmakara.com/sickn/index.php?recipe-for-cone-06-crackle-glaze http://semtechblog.com/tacoa/index.php?chilis-blue-raspberry-margarita http://wesleygcook.com/rearc/guaranteed-personal-loans-oregon.html
White Hat / Black Hat SEO | | TicketCity0 -
Unique meta descriptions for 2/3 of it, but then identical ending?
I'm working on an eCommerce site and had a question about my meta descriptions. I'm creating unique meta descriptions for each category and subcategory, but I'm thinking of adding the same ending to it. For example: "Unique descriptions, blah blah blah. Free Overnight Shipping..". So the "Free Overnight Shipping..." ending would be on all the categories. It's an ongoing promo so I feel it's important to add and attract buyers, but don't want to screw up with duplicate content. Any suggestions? Thanks for your feedback!
White Hat / Black Hat SEO | | jeffbstratton0 -
Creating pages as exact match URL's - good or over-optimization indicator?
We all know that exact match domains are not getting the same results in the SERP's with the algo changes Google's been pushing through. Does anyone have any experience or know if that also applies to having an exact match URL page (not domain). Example:
White Hat / Black Hat SEO | | lidush
keyword: cars that start with A Which way to go is better when creating your pages on a non-exact domain match site: www.sample.com/cars-that-start-with-a/ that has "cars that start with A" as the or www.sample.com/starts-with-a/ again has "cars that start with A" as the Keep in mind that you'll add more pages that start the exact same way as you want to cover all the letters in the alphabet. So: www.sample.com/cars-that-start-with-a/
www.sample.com/cars-that-start-with-b/
www.sample.com/cars-that-start-with-C/ or www.sample.com/starts-with-a/
www.sample.com/starts-with-b/
www.sample.com/starts-with-c/ Hope someone here at the MOZ community can help out. Thanks so much0 -
Google admits it can take up to a year to refresh/recover your site after it is revoked from Penguin!
I found myself in an impossible situation where I was getting information from various people that seem to be "know it all's" but everything in my heart was telling me they were wrong when it came to the issues my site was having. I have been on a few Google Webmaster Hangouts and found many answers to questions I thought had caused my Penguin Penalty. After taking much of the advice, I submitted my Reconsideration Request for the 9th time (might have been more) and finally got the "revoke" I was waiting for on the 28th of MAY. What was frustrating was on May 22nd there was a Penguin refresh. This as far as I knew was what was needed to get your site back up in the organic SERPS. My Disavow had been submitted in February and only had a handful of links missing between this time and the time we received the revoke. We patiently waited for the next penguin refresh with the surety that we were heading in the right direction by John Mueller from Google (btw.. John is a great guy and really tries to help where he can). The next update came on October 4th and our rankings actually got worse! I spoke with John and he was a little surprised but did not go into any detail. At this point you have to start to wonder WHAT exactly is wrong with the website. Is this where I should rank? Is there a much deeper Panda issue. We were on the verge of removing almost all content from the site or even changing domains despite the fact that it was our brand name. I then created a tool that checked the dates of every last cached date of each link we had in our disavow file. The thought process was that Google had not re-crawled all the links and so they were not factored into the last refresh. This proved to be incorrect,all the links had been re-cached August and September. Nothing earlier than that,which would indicate a problem that they had not been cached in time. i spoke to many so called experts who all said the issue was that we had very few good links left,content issues etc.. Blah Blah Blah, heard it all before and been in this game since the late 90's, the site could not rank this badly unless there was an actual penalty as spam site ranked above us for most of our keywords. So just as we were about to demolish the site I asked John Mueller one more time if he could take a look at the site, this time he actually took the time to investigate,which was very kind of him. he came back to me in a Google Hangout in late December, what he said to me was both disturbing and a relief at the same time. the site STILL had a penguin penalty despite the disavow file being submitted in February over 10 months ago! And the revoke in May. I wrote this to give everyone here that has an authoritative site or just an old one, hope that not all is lots just yet if you are still waiting to recover in Google. My site is 10 years old and is one of the leaders in its industry. Sites that are only a few years old and have had unnatural link building penalties have recovered much faster in this industry which I find ridiculous as most of the time the older authoritative sites are the big trustworthy brands. This explains why Google SERPS have been so poor for the last year. The big sites take much longer to recover from penalties letting the smaller lest trustworthy sites prevail. I hope to see my site recover in the next Penguin refresh with the comfort of knowing that my site currently is still being held back by the Google Penguin Penalty refresh situation. Please feel free to comment below on anything you think is relevant.
White Hat / Black Hat SEO | | gazzerman10 -
What to do if you've been hacked.....
Just logged into our CMS system and it appears we have been hacked. All page titles have been hijacked adding a secondary title tag linking out to website http://emapaydayloans.com with anchor text pay day loans. Our Web Dev team are working on fixing the hack now. My concern is the potential knock on effect to SEO. This looks like a bad neighbourhood site: 3 pages indexed PR 0 And for I don't know how long we've had almost every page on all our domains linking out with the following page title including the same link and anchor text: payday loans I assume its a wait and see at this stage.
White Hat / Black Hat SEO | | RobertChapman0 -
Dramatic fall in SERP's for all keywords at end of March 2012?? Help!
Hi, Our website www.photoworld.co.uk has been improving it's SERP's for the last 12 months or so, achieving page 1 rankings for most of our key terms. Then suddenly, around the end of March, we suffered massive drops in nearly all of our key terms (see attached image for more info). Basically I wondered if anyone had any clues on what Google has suddenly taken a huge dislike to with our site and steps we can put in place to aid with rankings recovery ASAP. Thanks n8taO.jpg
White Hat / Black Hat SEO | | cewe0 -
Are paid reviews gray/black hat?
Are sites like ReviewMe or PayPerPost white hat? Are follow links allowed within the post? Should I use those aforementioned services, or cold contact high authority sites within my niche?
White Hat / Black Hat SEO | | 10JQKAs0