Controlling crawl speed/delay through dynamic server code and 503s
-
Lately I'm experiencing performance trouble caused by bot traffic. Although Googlebot is not the worst offender (it's mainly bingbot and ahrefsbot), bots cause heavy server load from time to time. We run a lot of sites on one server, so heavy traffic on one site impacts the other sites' performance.
The problem is that I want a solution that 1) is centrally managed for all sites (per-site administration takes too much time), 2) takes total server load into account instead of only one site's traffic, and 3) controls overall bot traffic instead of traffic for a single bot. IMO, user traffic should always be prioritized above bot traffic.
I tried "Crawl-delay:" in robots.txt, but Googlebot doesn't support that. Although my custom CMS system has a solution to centrally manage Robots.txt for all sites at once, it is read by bots per site and per bot, so it doesn't solve 2) and 3).
I also tried controlling crawl speed through Google Webmaster Tools, which works, but again it only controls Googlebot (not other bots) and is administered per site. So it's no solution to all three of my problems either.
Now I've come up with a custom-coded solution that dynamically serves 503 HTTP status codes to a portion of the bot traffic. Which portion, and for which bots, can be calculated at runtime from the total server load at that moment. So if a bot makes too many requests within a certain period (or breaks whatever other rule I code), some of its requests will be answered with a 503 while others will get content and a 200. A sketch of the idea follows.
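To make this concrete, here is a minimal sketch of the kind of thing I mean, assuming a Python/Flask front controller; the load threshold, bot list, and shedding formula are illustrative placeholders, not production values:

```python
import os
import random

from flask import Flask, Response, request

app = Flask(__name__)

BOT_TOKENS = ("bingbot", "ahrefsbot", "googlebot")  # bots eligible for throttling
LOAD_THRESHOLD = 4.0  # 1-minute load average above which we start shedding

@app.before_request
def throttle_bots():
    ua = request.headers.get("User-Agent", "").lower()
    if not any(token in ua for token in BOT_TOKENS):
        return None  # user traffic is never throttled

    load1, _, _ = os.getloadavg()  # total server load (Unix only), not per-site
    if load1 <= LOAD_THRESHOLD:
        return None  # server is healthy: serve the bot normally

    # Shed a portion of bot requests proportional to how far the load
    # is over the threshold, capped at 90% so some crawling continues.
    shed_ratio = min((load1 - LOAD_THRESHOLD) / LOAD_THRESHOLD, 0.9)
    if random.random() < shed_ratio:
        resp = Response("Service temporarily unavailable", status=503)
        resp.headers["Retry-After"] = "120"  # ask well-behaved bots to back off
        return resp  # returning a response here short-circuits the request
    return None
```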
The remaining question is: will dynamically serving 503s have a negative impact on SEO? It will obviously add indexing latency, but slow server response times also hurt rankings, which is arguably worse than delayed indexing.
I'm curious to hear your expert opinions...
-
Hi INU,
As a general rule, I always like to avoid using things like 503s. There is almost certainly a better way to do it.
What about just using Google Webmaster Tools and Bing Webmaster Tools? Regarding Ahrefs, it depends how much you rely on that tool. If you don't use it, I'd be more inclined to simply block AhrefsBot in robots.txt (see the snippet below) and make sure Google and Bing are controlled using the appropriate settings in their respective webmaster tools.
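Blocking it is a two-line robots.txt entry per site (Ahrefs documents that AhrefsBot obeys robots.txt):

```
User-agent: AhrefsBot
Disallow: /
```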
To answer your specific point about whether 503s can hurt rankings: in general, no, as long as they are only short-term. A 503, like a 404 or any other response code, is a natural part of the web. However, Google has said in the past that repetitive 503s can be treated as permanent rather than temporary, and in some cases this can result in pages being removed from the index.
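So if you do go the 503 route, I'd build a hard cap into your shedding logic so that no bot is served 503s continuously for too long. A minimal sketch of such a guard — the window and cooldown lengths are illustrative values I've picked, not a Google-documented limit:

```python
import time
from typing import Dict, Optional

MAX_SHED_SECONDS = 1800   # shed a bot for at most 30 minutes in a row
COOLDOWN_SECONDS = 3600   # then serve it normally for at least an hour
_windows: Dict[str, float] = {}  # bot token -> start of its current shed window

def may_shed(bot: str, now: Optional[float] = None) -> bool:
    """Return True only while this bot is inside an allowed 503 window."""
    now = time.time() if now is None else now
    started = _windows.setdefault(bot, now)
    elapsed = now - started
    if elapsed <= MAX_SHED_SECONDS:
        return True   # still inside the shed window: 503s are OK
    if elapsed <= MAX_SHED_SECONDS + COOLDOWN_SECONDS:
        return False  # cooling down: answer with real content and 200s
    _windows[bot] = now  # cooldown over: a new shed window may begin
    return True
```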
I hope this helps,
Craig