Is there a whitelist of the RogerBot IP Addresses?
-
I'm all for letting Roger crawl my site, but it's not uncommon for malicious spiders to spoof the User-Agent string. Having a whitelist of Roger's IP addresses would be immensely useful!
-
Samantha (of the Moz team) suggested I have my client whitelist Rogerbot - so you are saying simply whitelist Rogerbot as a useragent? Is there any other information I need to provide?
-
Gotcha, thanks for the response, Aaron.
-
Hey Kalen! Rogerbot is the crawler we use to gather data on websites for Moz Analytics and the Mozscape link index. Here's his info: http://moz.com/help/pro/what-is-rogerbot-.
I wish I could give you IP addresses, but they change all the time since we host Roger in the cloud. There's not even a reliable range of IPs to give you. You can totally whitelist the useragent rogerbot, but that's the only reliable information about the crawler you can go off of. I hope that helps but let me know if there's any other solution you can think of. Thank you!
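For anyone wiring this into a filter, here is a minimal sketch of user-agent whitelisting. It assumes the token "rogerbot" appears in the User-Agent header; the function names, the `limit` default, and the rate-limit hook are hypothetical and not part of any Moz tooling. Since the header can be spoofed, this only tells you what a client claims to be.

```python
import re

# Assumption: the token "rogerbot" appears somewhere in the User-Agent header.
# Matching is case-insensitive and anchored on word boundaries.
ROGERBOT_PATTERN = re.compile(r"\brogerbot\b", re.IGNORECASE)


def is_rogerbot(user_agent: str) -> bool:
    """Return True if the User-Agent header claims to be rogerbot."""
    return bool(ROGERBOT_PATTERN.search(user_agent or ""))


def should_rate_limit(user_agent: str, requests_last_minute: int, limit: int = 60) -> bool:
    """Hypothetical filter hook: exempt rogerbot, throttle anyone else over the limit."""
    if is_rogerbot(user_agent):
        return False
    return requests_last_minute > limit
```

A request-handling layer could call `should_rate_limit(headers.get("User-Agent", ""), count)` before deciding whether to throttle the client.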
-
Hi Aaron,
I'm not totally sure what RogerBot is, but I was also interested in a list of IPs to whitelist. We just completed a search crawl and are checking out the Crawl Diagnostics. It hit some 503 errors because it's triggering our DoS filter.
Is there a way to get the IP addresses behind this crawl in order to whitelist them?
Thanks,
Kalen
-
Hey there Outside!
I totally understand your concerns, but unfortunately we don't have a static IP we can give you for Rogerbot. He's crawling from the cloud, so his IP address changes all the time! As you know, you can allow him in robots.txt, but that's the only way to do it for now. We have a recent post about why this may be risky business: http://www.seomoz.org/blog/restricting-robot-access-for-improved-seo
Hope that helps!
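If it helps, here is a rough way to confirm that a robots.txt file actually allows rogerbot, using Python's standard `urllib.robotparser`. The robots.txt content, the example.com URLs, and the `can_crawl` helper are made up for illustration; swap in your own file. Keep in mind that robots.txt is only honored by well-behaved crawlers, so it does nothing against a spider that ignores it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: rogerbot may crawl everything,
# every other bot is kept out of /private/.
ROBOTS_TXT = """\
User-agent: rogerbot
Disallow:

User-agent: *
Disallow: /private/
"""


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

For example, `can_crawl(ROBOTS_TXT, "rogerbot", "https://example.com/private/page")` checks whether the rogerbot rules let him into a path that is closed to other bots.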
-
Personally, I've run across spiders that search for entry points and exploits in common CMS, e-commerce, and CRM web applications. For example, there was a recent WordPress bug that could be exploited to serve malicious content (read: virus) to visiting users.
Spoofing the User-Agent string is elementary at best, and wouldn't fool any sys admin worth their salt. All you have to do is run a WHOIS on the requesting IP to help identify its origin.
I'm a bit of a data geek, so I like to grep through log files to see things that won't show up in Analytics, which requires JavaScript.
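A rough sketch of that kind of log digging, in Python rather than grep: it counts requests per client IP for any User-Agent containing a given token, which gives you a short list of IPs to run WHOIS on. It assumes the common Apache/Nginx "combined" log format (client IP first, User-Agent as the last quoted field); the function name and the regex are my own, so adjust them if your logs look different.

```python
import re
from collections import Counter

# Assumption: "combined" log format, where the client IP is the first field
# and the User-Agent is the last double-quoted field on the line.
LOG_LINE = re.compile(r'^(?P<ip>\S+) .*"(?P<agent>[^"]*)"\s*$')


def ips_claiming_agent(log_lines, token="rogerbot"):
    """Count requests per client IP whose User-Agent contains the given token."""
    counts = Counter()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match and token.lower() in match.group("agent").lower():
            counts[match.group("ip")] += 1
    return counts
```

Feeding it an open log file, e.g. `ips_claiming_agent(open("access.log"))` with a hypothetical path, returns a `Counter` keyed by IP; anything in there that resolves to an unexpected owner is a candidate spoofer.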
-
Out of curiosity (and because I don't know), what is the advantage for a malicious spider to spoof the User-Agent string? I mean, I understand this hides its identity, but why does a spider need to hide its identity? And what can a malicious spider do that a browsing human can't do? I haven't taken any action to prevent robots from anything on my site. Should I?