Controlling crawl speed/delay through dynamic server code and 503s
-
Lately I'm experiencing performance trouble caused by bot traffic. Although Googlebot is not the worst offender (it's mainly bingbot and ahrefsbot), these bots cause heavy server load from time to time. We run a lot of sites on one server, so heavy traffic on one site impacts the other sites' performance.
The problem is that 1) I want a centrally managed solution for all sites (per-site administration takes too much time), which 2) takes into account total server load instead of only one site's traffic, and 3) controls overall bot traffic instead of controlling traffic for one bot. IMO user traffic should always be prioritized higher than bot traffic.
I tried "Crawl-delay:" in robots.txt, but Googlebot doesn't support that. Although my custom CMS has a way to centrally manage robots.txt for all sites at once, the file is still read by bots per site and per bot, so it doesn't solve 2) or 3).
I also tried controlling crawl speed through Google Webmaster Tools, which works, but again it only controls Googlebot (not other bots) and is administered per site. So it's no solution to any of my three problems.
Now I've come up with a custom-coded solution to dynamically serve 503 HTTP status codes to a certain portion of the bot traffic. The traffic portion for each bot can be calculated dynamically (at runtime) from the total server load at that moment. So if a bot makes too many requests within a certain period (or breaks whatever other coded rule I'll invent), some requests will be answered with a 503 while others will get content and a 200.
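As a rough illustration of the idea, here is a minimal sketch of such load-based throttling. The bot pattern, load threshold, and Retry-After value are all assumptions to be tuned per server, not a definitive implementation:

```python
import os
import re

# Hypothetical bot detector and load threshold -- adjust for your traffic.
BOT_PATTERN = re.compile(r"bot|crawler|spider", re.IGNORECASE)
LOAD_THRESHOLD = 4.0  # 1-minute load average above which bots get throttled

def should_throttle(user_agent: str) -> bool:
    """Return True if this request is from a bot and the server is busy."""
    if not BOT_PATTERN.search(user_agent or ""):
        return False  # never throttle human traffic
    load_1min, _, _ = os.getloadavg()  # Unix-only; use another metric on Windows
    return load_1min > LOAD_THRESHOLD

def handle_request(user_agent: str):
    """Return (status_code, extra_headers) for an incoming request."""
    if should_throttle(user_agent):
        # Retry-After tells well-behaved crawlers when to come back.
        return 503, {"Retry-After": "120"}
    return 200, {}
```

The same check could of course live in a WSGI middleware, an nginx Lua block, or the CMS's front controller; the key points are that human traffic is never throttled and that the 503 carries a Retry-After header.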
The remaining question is: will dynamically serving 503s have a negative impact on SEO? OK, it will increase indexing latency, but slow server response times do in fact have a negative impact on rankings, which is even worse than indexing latency.
I'm curious about your expert opinions...
-
Hi INU,
As a general rule, I always like to avoid using things like 503s. There is almost certainly a better way to do it.
What about just using Google Webmaster Tools and Bing Webmaster Tools? Regarding Ahrefs, it depends how much you rely on that tool. If you don't use it, then I'd be more likely to just block that bot in robots.txt and make sure Google and Bing are controlled using the appropriate settings in their respective webmaster tools.
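For reference, a robots.txt along those lines might look like the fragment below: AhrefsBot is blocked entirely, and Bingbot is slowed with Crawl-delay (which Bing honors but Googlebot ignores, so Google's crawl rate still has to be managed in its webmaster tools). The 10-second delay is an arbitrary example value:

```
User-agent: AhrefsBot
Disallow: /

User-agent: bingbot
Crawl-delay: 10
```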
To answer your specific point about whether or not 503s can hurt rankings: in general no, as long as they are only short-term. A 503, like a 404 or any other response code, is a natural part of the web. However, Google has said in the past that repeated 503s can be treated as permanent rather than temporary, and in some cases can result in the pages being removed from the index.
I hope this helps,
Craig