SeoMoz Crawler Shuts Down The Website Completely
-
I recently switched servers and was very happy with the outcome. However, every Friday my site shuts down (not very cool when you are getting 700 unique visitors per day). Naturally I was very worried and dug deep to see what was causing it. Unfortunately, the direct answer was that it was coming from "rogerbot" (see sample below).
Today (Aug 5) the same thing happened, but this time the site was down for about 7 hours, which did a lot of damage in terms of SEO. I am inclined to shut down the SEOmoz service if I can't resolve this immediately.
I guess my question is: is there a way to make sure the site doesn't go down or time out like that because of rogerbot? Please let me know if anyone has an answer. I use your service a lot and I really need it.
Here are the log lines that point to the cause:
216.244.72.12 - - [29/Jul/2011:09:10:39 -0700] "GET /pregnancy/14-weeks-pregnant/ HTTP/1.1" 200 354 "-" "Mozilla/5.0 (compatible; rogerBot/1.0; UrlCrawler; http://www.seomoz.org/dp/rogerbot)"
216.244.72.11 - - [29/Jul/2011:09:10:37 -0700] "GET /pregnancy/17-weeks-pregnant/ HTTP/1.1" 200 51582 "-" "Mozilla/5.0 (compatible; rogerBot/1.0; UrlCrawler; http://www.seomoz.org/dp/rogerbot)"
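For anyone who wants to check their own logs the same way, here is a minimal sketch (not something I ran on the server at the time) that counts how many requests per minute a given crawler makes. It assumes the Apache combined log format shown in the lines above; the function name `count_hits_per_minute` and the regular expression are my own illustration, so adjust them if your log format differs.

```python
import re
from collections import Counter

# Matches one line of the Apache "combined" log format (assumed format).
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def count_hits_per_minute(lines, agent_substring):
    """Count requests per minute whose user agent contains agent_substring."""
    hits = Counter()
    needle = agent_substring.lower()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        if needle not in match.group("agent").lower():
            continue  # a different visitor or crawler
        # "29/Jul/2011:09:10:39 -0700" -> "29/Jul/2011:09:10"
        minute = match.group("time")[:17]
        hits[minute] += 1
    return hits
```

Feeding it the open access log, for example `count_hits_per_minute(open(path), "rogerbot")` with `path` pointing at your own log file, gives a per-minute tally that makes bursts from one bot easy to spot.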
-
After much research and installing a ton of extra scripts on my Apache server to track it down, I can confirm the bots did cause the shutdown. In case you ever have a problem of this nature, this is how I resolved it.
The fix comes from an excellent article about how to implement a script that restarts Apache immediately once all of its available threads are exhausted and it crashes. The script basically checks the Apache server status every 5 minutes and, in the event that it has crashed, automatically restarts it and sends you an email notification. I'd say that's a pretty good deal: you risk being offline for only 5 minutes if anything major happens. I am also running a cron job every morning at 1am to restart Apache. Please note that you need some knowledge of SSH commands to set this up. And OMG I am talking like a geek... All the best to you...
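The article's script isn't reproduced here, so what follows is only a rough Python sketch of the same idea, not the script I actually installed. Everything specific in it is an assumption: the probe URL, the `apachectl restart` command (some hosts use a different service command), the email addresses, the local mail relay, and the names `apache_is_up`, `restart_apache`, `send_alert` and `check_and_restart`.

```python
import smtplib
import subprocess
import sys
import urllib.error
import urllib.request
from email.message import EmailMessage

# Assumptions for illustration only: change these for your own server.
CHECK_URL = "http://localhost/"          # hypothetical page to probe
RESTART_CMD = ["apachectl", "restart"]   # restart command may differ per host
ALERT_FROM = "watchdog@example.com"      # hypothetical addresses
ALERT_TO = "admin@example.com"


def apache_is_up(url=CHECK_URL, timeout=10):
    """Return True if the web server answers without a 5xx error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError as err:
        # The server replied, so it is running; only 5xx counts as down.
        return err.code < 500
    except OSError:
        # Connection refused, timeout, DNS failure and similar.
        return False


def restart_apache():
    """Run the restart command and raise if it fails."""
    subprocess.run(RESTART_CMD, check=True)


def send_alert(body):
    """Email a notification through a mail relay assumed to run on localhost."""
    msg = EmailMessage()
    msg["Subject"] = "Apache was restarted"
    msg["From"] = ALERT_FROM
    msg["To"] = ALERT_TO
    msg.set_content(body)
    with smtplib.SMTP("localhost") as smtp:
        smtp.send_message(msg)


def check_and_restart(is_up=apache_is_up, restart=restart_apache, notify=send_alert):
    """Restart and notify only when the server is down. Returns True if restarted."""
    if is_up():
        return False
    restart()
    notify("Apache did not respond and was restarted automatically.")
    return True


# Example crontab entries (the script path is hypothetical):
#   */5 * * * * /usr/bin/python3 /root/apache_watchdog.py --run
#   0 1 * * *   apachectl restart
if __name__ == "__main__" and "--run" in sys.argv[1:]:
    check_and_restart()
```

The check, restart and notify steps are passed in as functions so the logic can be tried out with harmless stand-ins before it is pointed at a live server.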
-
Wow Randy, what a story, man. Actually, the funny part is that one of the jobs I do is monitoring for things like that - but I would not go so far as to actually shut someone's site down, precisely because I know what that could do. It is great to know that you still preserved your rankings after 5 days down. That makes me feel so much better. I am keeping to a rule of 1 dedicated server per 2 domains (both related). In this case we are talking about a domain called babylifetime.com. I am about to embark on custom development of a site similar to squarespace.com but with many more add-ons, so I need this thing to work properly. I think I have organic SEO down pretty well, but issues like the one in this thread are what keep me on my toes.
-
Googlebot would have to be indexing your site at the very moment it was down for anything to happen, and even if it's down for half a day, in my experience rankings are unaffected.
However, there's a small side effect. If visitors coming from engine X, Y or Z visit your site, hit a 404 or server error, and click the back button or get the "Google Can't Find This" page, it can increase your bounce rate for that period of time. If the originating click starts at, say, Google and the clicker then goes back to Google, it tells Google that the page wasn't what they were looking for in relation to the term they used, or that it didn't load, or that there is a problem with it. Basically, any reason that can be tied to bounce rate.
As alarming as that may sound, I don't believe that it would affect your rankings.
The easiest way to see if Google noticed is to log in to your Google Webmaster Tools account and check for errors. If they list any errors such as 404 or "server unavailable" (I'm not sure they have that one) for any pages that you know are usually live and well, then you'll know they noticed.
But again, I don't believe that it will affect your rankings much. I've read in Google's own words that they do go back to sites that were unavailable or down and try to continue indexing.
As for your server being down for 12 hours: that's a long time. I can't even imagine it. You might want to check your hosting capabilities. You should be back up and running in minutes, not hours.
Just to give you some peace of mind: I have a plethora of affiliate sites that make a small income for me. I once registered a domain name that a very large corporation didn't appreciate. It had a trademarked word in the domain. Long story short, my domain info was set to private, so they legally got the server shut down. I didn't know for days because everything was on auto-pilot and I wasn't checking my related email addresses. When that server was shut down, 100+ websites on that server went down too, because that one (partially) trademarked domain was on the same server and same hosting package. The sites were down for about 5 or 6 days while I sorted through the legal paperwork. After I made an agreement to give the big company the domain, minus the 20K in damages that they originally wanted, the hosting company turned the server and hosting package back on.
Not a single one of the domains lost ranking. Not even 1 spot! Today, they still rank in the top 2 to 3 of their biggest terms. So my words are truly from experience and are from a worst-case scenario. I think you'll be fine.
Finally, to clear the air. I didn't do anything bad, nor would I ever do anything bad with a domain name (other than keep it in my portfolio). The big company was upset that I got the domain before they did. All I had on the index page was their description of their product that was named in the domain. That was enough to be taken down for copyright and trademark infringement.
In the end, that company was actually very cool about it. And it's a Fortune 10 company! I was surprised!
-
EGOL, thanks for your reply.
A) My latest thought is also that unusual activity is triggering a block. But then again, it is a dedicated server and should be capable of handling it on its own. We are talking about the SEOmoz bot and the highest-tier dedicated GoDaddy server, without anything specifically installed to interfere with the Apache server.
B) RAM, bandwidth, space, PHP memory and other memory limits are all at under 20% of actual use.
-
I am willing to bet that the root issue is with the host and one of these situations is occurring: A) the host is throttling your processor resources and shutting your domain down after unusual activity occurs on your site, or B) total activity on the server (your site and other sites) exceeds a certain level and the server limits the resources available for processing.
I would be looking for a new host.
-
Randy, thanks for the response. There is definitely something going on with the server that is related directly to rogerbot. I have different crawlers running at all times and nothing ever happens. This particular problem coincides with the SEOmoz bots starting their job (Fridays) and traces back to that specific bot. As for the delay, I tried different values up to 20, but the same problem persists.
At the moment I have a tech team reviewing the Apache server to see the specifics of this. I will also post it here for others to see when I find out.
But it is weird and now I don't know when the site will shut down. Driving me crazy, man!
As an additional question for this thread: when your site goes down for, let's say, 12 hours and you have many high-ranking organic Google listings, does that have a huge impact, and how much downtime is acceptable?
-
Jury,
I'm not sure if rogerBot is doing anything to your site, but I do know a way to slow rogerBot and any other robot / crawler that takes directions from the robots.txt file that should be on your site.
Basically, just add the two lines of code that are represented below to your robots.txt file. With this addition, you are telling the useragent (rogerBot) to take 10 seconds between pages. You can change that number to anything you want. The more seconds you add, the slower it goes. And this of course is if rogerBot takes directions. I'm fairly sure it does!
NON-AGENT SPECIFIC EXAMPLE
User-Agent: *
Crawl-Delay: 10
EXAMPLE FOR ROGERBOT
User-Agent: rogerBot
Crawl-Delay: 10
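If you want to double-check that the file reads the way you expect, here is a small sketch using Python's standard `urllib.robotparser` module. It only shows how Python's own parser interprets the rules, which is not necessarily how rogerBot reads them. The helper name `delay_for` is hypothetical, and the two delay values are different on purpose so you can see which rule applies.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt content: a general rule plus a rogerBot-specific one.
ROBOTS_TXT = """\
User-Agent: *
Crawl-Delay: 5

User-Agent: rogerBot
Crawl-Delay: 10
"""


def delay_for(robots_txt, agent):
    """Return the Crawl-Delay (in seconds) that applies to the given agent name."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(agent)
```

Pass the short agent name (such as `rogerBot`) and not the full user-agent string from the log, since this parser matches on the name before the first slash.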
Good Luck,
Randy
-
Thanks Lewis...I will do that and see if they have any suggestions...!
-
Hi Jury
If you haven't already, I would recommend raising the issue through the help email address: help@seomoz.org
On the Q&A forum we can pass on thoughts or suggestions, but the support team at SEOmoz will be best placed to answer this.