Need Third Party Input. Our Web host blocked all bots including Google and myself because they believe SEO is slowing down their server.
-
I would like some third party input... partly for my sanity and also for my client.
I have a client who runs a large online bookstore. The bookstore runs in Magento and the developers are also apparently the web host. (They actually run the servers.. I do not know if they are sitting under someones desk or are actually in a data center)
Their server has been slowed down by local and foreign bots. They are under the impression my SEO services are sending spammer bots to crawl and slow down their site.
To fix the problem they disallowed all bots. Everything, Google, Yahoo, Bing. They also banned my access from the site. My clients organic traffic instantly took a HUGE hit. (almost 50% of their traffic is organic and over 50% is Organic + Adwords most everything from Google)
Their keyword rankings are taking a quick dive as well.
Could someone please verify the following as true to help me illustrate to my client that this is completely unacceptable behavior on part of the host.
I believe:
1.) You should never disavow ALL robots from your site as a solution for spam. As a matter of fact most of the bad bots ignore robots.txt anyways. It is a way to limit where Google searches (which is obviously a technique to be used)
2.) On site SEO work as well as link building, etc. is not responsible for foreign bots and scrappers putting a heavy load on the server.
3.) Their behavior will ultimately lead to a massive loss of rankings (already happening) and a huge loss of traffic (already happening) and ultimately since almost half the traffic is organic the client could expect to lose a large sum of revenue from purchases made by organic traffic since it will disappear.
Please give your input and thoughts. I really appreciate it!
-
Thanks so much for your response. Glad to hear that there was a fairly good ending to this, and thanks for following up!
-
Keri -
I was able to produce multiple reports that accomplished the following:
1.) Illustrate the quick/graphic drop in our Google rankings
2.) Illustrate that the majority of traffic comes from Organic Search
3.) Tie together the trend that was already happening of a dramatic drop in organic traffic as keywords were slipping.
4.) Bring a reality to the fact that this behavior will quickly result in such a steep financial hit it was an 'emergency'.
In this particular situation there is My Client, Their Developers, Me the SEO guy.
I found that by helping my client to understand the situation, financial impacts, and why we had to act helped to spur things along.
We had a big meeting with everybody involved. This was a great opportunity to understand what the developers (Who also serve as the host) were going through. Facing attacks from bots, trying to keep the server alive, etc. etc. Ultimately it was revealed that they had a bug in their code for Magento that was causing a lot of extra DB hits that was a main root cause of their issues.
We were able to work out the following ground rules:
1.) NEVER block all bots under any circumstance
2.) The majority of our Organic traffic is Google then Bing/Yahoo were a tiny fraction and everything else didn't matter. I crafted a good robots.txt that let in all of the major bots I wanted and excluded the rest. Ideally I'd like to include most all of them. (Since we're only blocking good bots because the bad ones will just ignore) However, I wanted to compromise and also help them with server traffic. (PLUS for us Google is it.) I did make sure my robots.txt allowed in all Google services, etc.
3.) We set up a system to make sure everybody was in the loop when a dramatic decision regarding the website was made. (that's way better than me finding out a few days later that Google was blocked and damage has already been done)
4.) We really brought into light that SEO has/had nothing to do with the situation.
In the end the developers are great people but like everything else... they almost need to see you in person and hear why they can't do stuff like that. In their world it makes total sense because the server is overloaded. However, there won't be an overloaded server if you block out Google and all the traffic it sends.
We were able to recover most of our rankings and our traffic returned back to normal. We aren't quite back to where we were but getting there. The keywords snapped back fairly quickly but the organic traffic didn't so it might be something else. I actually will throw in a screenshot of the incident down below.
Thanks for checking up on it!
-
Hi Joshua, I'm looking through some older threads, and wondering if you're able to give us any type of update as to what happened in this case (and if you have any hair left!). I've had some battles with developers before too, and have sympathy for your position.
-
Thanks for your answers and help everyone! I really appreciate all of the details. I see the power of this community and hope to be able to contribute instead of only take in the future. Thanks again!!
-
Drop them as a client. They're paying you for SEO help but they obviously don't trust/like it. Not worth your time.
-
Quite clearly, this is bonkers.
If you block access from search engine spiders how can they possibly index the content in good faith? You are hoping that they will not crawl the content to check what is there (or burn server resources) but they will still happily refer users of their search engine to these pages in good faith.
Additionally, it is highly unlikely that bots from the major engines are causing a measurable impact - Google for instance states they will only crawl one page every few seconds (1).
That said, there are a lot of parasites out there and crawlers that will eat up server time so there may also be some truth in what the host is saying. That said, there is still no excuse for this hatchet job of sorting things out.
The other angle here is that magento and ecommerce sites can often be a crawlers worst nightmare. As an example product comparison systems can often create thousands (I have seen millions) of crawlable URLs - now a sensible spider has a crawl budget and will give up but that's not saying all will. A simple crawl in screaming frog should give you an idea here (not that you will be able to do that) and in many cases where these problems exist this is enough to bring a server to it's knees.
In my mind you have a few things to do here
1. Convince the host that blocking all spiders is incorrect
Hopefully this thread and the references here should be more than enough to do this. Beyond that simply show them a fetch as Googlebot & the Crawl section in webmaster tools and you should be able to make your point quickly and easily.
2. Help the developer implement a more sensible list of what to block.
This article is a good start here:
http://searchenginewatch.com/article/2067357/Bye-bye-Crawler-Blocking-the-ParasitesRemember you can allow one (or more) robots and then disallow everything else:
User-agent: Google Disallow: User-agent: * Disallow: /
Other options also exist such as limiting the speed at which a crawler can crawl - well, requesting that they limit the speed at which they crawl.
Also remember that any truly parasitic bot or crawler will likely ignore robots.txt anyway so you may need to implement some more advanced blocking at a firewall or server level.
3. Help the developer identify the cause of the resources problem
As hinted at above, if a crawl is causing problems there are likely issues somewhere. Whether this is as simple as straight up server resources or is more due to problems with the site and crawlable URLs needs to be determined but let me give you some pointers.
- SEO Audit - at least a crawl / indexation audit - lets see how many pages we can crawl? How does this stack up against the amount of products / categories? You may well find some easy wins here and sections of the site that can be blocked off or variables you tell Google not to crawl in webmaster tools. Nofollow directives and URLs can be your friend here as well so you tackle it on both fronts.
- Magento Optimisation - it is easy with a system like Magento to create pages that have a heavy burden on the database with hundreds of queries. If these options are not really used (only by crawlers) then they can be audited and removed / improved.
- Server Resources - Magento can be a hungry beast
- Dig into the http access logs to identify who and what is crawling and from where and come up with a list of what you need to block and how.
Summary
Ultimately, blocking all spiders is daft and there is a good chance it won't resolve the issue anyway - that is unless it screws over the clients search visibility so badly that they don't do any traffic! There are likely issues though be that with the site itself or something else so a good way to couch this to them is as their friend and helper - someone who will help them identify and resolve the issues. If it gets combative then it will only be harder to resolve.
Alternatively, you could move to another host. Part of me would suggest doing this anyway as no host should be able to hold you to ransom like this. This one daft move could have potentially ruined the clients visibility in what is a key time of the year for most online businesses. Imagine if they did not have an SEO on board? If they did not have an automated crawl to highlight these issues?
There is certainly a worthwhile exercise here as the site likely has some problems (or at least areas that can be improved upon) so optimisations can be made but, I would still consider jumping ship and moving to an SEO savvy host in the long term if bridges can't be built.
Hope that helps!
MarcusReferences
-
Yeah, what he said...
And when you call, let them know that it was their slow ssa server that caused you to find another host.
-
I agree with your assessment.
This hosting service is being run by either noobs or stingy people or both.
I would get a new host right away. ASAP. Your rankings in search will die completely if you remain on this host.
In addition to what you have seen here they probably have other practices that are deadly.
I would install my site on new server, then change the DNS before informing the current host. Then call to cuss 'em out.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Plugins and SEO
I'd like some expert guidance. I've searched for a theme that does what I want and finally found something I like, but I'm wondering what you all think I should do to increase it's searchability. The plugin has all the listings and styling. All I need to do is past the code into the wordpress site ad voila! I have a page. Using the widget lets me allow upvotes and provide map etc. But it means the content is inside the widget instead of on the page. What would you modify if you wanted to keep the theme & widget to get the best results. http://best-of-sacramento.com/dentists This is my staging site.
Technical SEO | | julie-getonthemap1 -
What's going on with google index - javascript and google bot
Hi all, Weird issue with one of my websites. The website URL: http://www.athletictrainers.myindustrytracker.com/ Let's take 2 diffrenet article pages from this website: 1st: http://www.athletictrainers.myindustrytracker.com/en/article/71232/ As you can see the page is indexed correctly on google: http://webcache.googleusercontent.com/search?q=cache:dfbzhHkl5K4J:www.athletictrainers.myindustrytracker.com/en/article/71232/10-minute-core-and-cardio&hl=en&strip=1 (that the "text only" version, indexed on May 19th) 2nd: http://www.athletictrainers.myindustrytracker.com/en/article/69811 As you can see the page isn't indexed correctly on google: http://webcache.googleusercontent.com/search?q=cache:KeU6-oViFkgJ:www.athletictrainers.myindustrytracker.com/en/article/69811&hl=en&strip=1 (that the "text only" version, indexed on May 21th) They both have the same code, and about the dates, there are pages that indexed before the 19th and they also problematic. Google can't read the content, he can read it when he wants to. Can you think what is the problem with that? I know that google can read JS and crawl our pages correctly, but it happens only with few pages and not all of them (as you can see above).
Technical SEO | | cobano0 -
Why is Google not indexing my site?
I'm a bit confused as to why my site just isn't indexing on Google. Even if I type in my brand name, my social channels rank and there's no evidence of my website. I've followed all of the advice I've read and gone into webmaster tools and got the Wordpress yoast plug-in but nothing seems to be making a difference!One thing I've noticed, in Google Webmaster Tools it says "Couldn’t communicate with the DNS server." in site errors. I've called GoDaddy and they said that everything is fine. A bit frustrating. Trying to work out what my next steps should be but feeling a bit lost to be honest! Any help GREATLY appreciated!
Technical SEO | | j1066s0 -
Can the Hosting location of image files have a negative effect if 'off-site' such as on the devs own media server ?
Hi Can the Hosting location of image files have a negative effect if 'off-site' such as if they are on the developers own media server ? As opposed to on the actual websites server or file structure ? In the case i'm looking at the image files are hosted on a totally separate server (a media subdomain of the developers site server) from the subject sites dedicated server. Will engines still attribute the properties of files hosted in this manner to the main website (such as file name, alt attributes, etc etc) ? Or should they really be on the subject sites server own media folder ? Cheers Dan
Technical SEO | | Dan-Lawrence0 -
What needs to be done to tell google my site has moved /changed
Hi everyone, I have a site, which I have re-built on a temporary domain, so that my main ecommerce site can still run.. I have noticed that google has already crawled my temporary domain. The only problem is I now want to transfer the new site back onto its proper domain (www.ourbrand.com). I have changed some of the URL structures of the new site so realize I will need to do re-directs relating to the same domain, but will google get confused that another domain used to have my new website on? I don't plan on using the old temporary domain again and wondered if I need to tell google in some way it was used just to build my site on? Michelle
Technical SEO | | nutjobshell0 -
More than one web domain
Hi What is best when considering using more than one domain on a website, what the best policy ? it's a question I get asked a lot, usually because prior to any seo efforts the main domain name purchased is not keyword rich or an abbreviated company name etc. what impact do new domains have on SEO compared to older existing domains- is it worth changing a generic company named domain for a keyword rich domain? if having multiple domains pointing to the site is beneficial how best is it to configure? How do i inform Google? How do both domains get index when there is only one physical site? How should i monitor it, Analytic s, SEO moz? Will the domain compete against each other and effect SEO rank? Are they best used for marketing purposes on external sites, adverts driving traffic to the main site. I'm aware there are lots of questions above, any answers/ opinions , links to further info would be greatly appreciated. cheers
Technical SEO | | Bristolweb0 -
Basic SEO HTML
Hello Everyone, One place I am weak is coding for SEO. I need to get better. One question I do have is can anyone explain why it's important to place css and java script files in an external file? How do you do this and how do you know if it's already being done? If it has not been done on a site is it hard to go back and do? I understand this is important from a site load time issue Thanks, Bill P.S. Can anyone recommend a resource where I can learn proper html coding for SEO? Thank you!
Technical SEO | | wparlaman0 -
Fastest Hosting Company
Who has the fastest hosting company? what major provider has fastest service for page load times? Looking for affordability like godaddy.
Technical SEO | | bozzie3110