Welcome to the Q&A Forum

ChiarynMiranda

Hey Liam,

Thanks for following up. Unfortunately, we run our crawler on thousands of dynamic IPs through Amazon Web Services, so the IP changes from crawl to crawl. We don't even have a set range for the IPs we use through AWS.

As for throttling, we don't have a set throttle. We space out our requests enough to avoid bringing down the server, but we hit it as often as necessary to crawl the full site (or reach the crawl limit) in a reasonable amount of time. It's a balance between hitting the site too hard and having extremely long crawl times. If the devs are worried about how often we hit the server, they can add a crawl delay of 10 to the robots.txt file to throttle the crawler. We will respect that delay.
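For reference, here is a rough sketch of what that robots.txt entry might look like, checked with Python's standard-library parser. Scoping the rule to the `rogerbot` user agent is an assumption on my part (a `User-agent: *` block would apply the delay to every crawler that honors the directive), and the parser only shows how the file reads, not how our crawler actually schedules requests.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: a crawl delay of 10 scoped to rogerbot.
ROBOTS_TXT = """\
User-agent: rogerbot
Crawl-delay: 10
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Delay that applies to rogerbot, and to a crawler with no matching rule.
rogerbot_delay = parser.crawl_delay("rogerbot")
other_delay = parser.crawl_delay("somebot")

# A crawl delay only slows the crawler down; it does not block any URLs.
rogerbot_allowed = parser.can_fetch("rogerbot", "/")

print(rogerbot_delay, other_delay, rogerbot_allowed)
```

If the delay should apply to every crawler, the same `Crawl-delay` line can go under a `User-agent: *` block instead.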

If the devs use Moz as well, they would also get a 403 on their crawl, because the server is blocking our user agent specifically. The server returns the same status code regardless of who set up the campaign.

I'm sorry this information isn't more specific. Please let me know if you need any other assistance.

Chiaryn

ChiarynMiranda

Hey there,

The robots.txt file shouldn't really affect 403s; you would actually get a "blocked by robots.txt" error if that were the cause. Your server is basically telling us that we are not authorized to access your site. I agree with Mat that we are most likely being blocked in the .htaccess file. It may be that your server is flagging our crawler and Xenu's crawler as troll crawlers or something along those lines. I ran a test on your URL using a non-existent crawler, Rogerbot with a capital R, and got a 200 status code back, but when I ran the same test with our real crawler, rogerbot with a lowercase r, I got the 403 error (http://screencast.com/t/Sv9cozvY2f01). This tells me that the server is specifically blocking our crawler, not crawlers in general.
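If you want to repeat that kind of check yourself, a minimal sketch along these lines should work. It is not the tool used in the screencast, the function names are made up for illustration, and the bare `Rogerbot` / `rogerbot` strings stand in for full User-Agent headers, which may be longer in practice.

```python
import urllib.error
import urllib.request


def status_for_agent(url, user_agent, timeout=10.0):
    """Return the HTTP status code the server sends for this User-Agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the code is what we want to report.
        return err.code


def compare_agents(url, agents):
    """Map each User-Agent string to the status code it receives."""
    return {agent: status_for_agent(url, agent) for agent in agents}


def blocked_agents(statuses):
    """List agents that got a 403 while at least one other agent got a 200.

    A mix of 200s and 403s suggests the server is filtering on the
    User-Agent header rather than refusing every request.
    """
    if 200 not in statuses.values():
        return []
    return [agent for agent, code in statuses.items() if code == 403]
```

Calling `compare_agents("http://www.example.com/", ["Rogerbot", "rogerbot"])` with your own URL in place of the placeholder, then passing the result to `blocked_agents`, shows whether only the lowercase agent is being refused.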

I hope this helps. Let me know if you have any other questions.

Chiaryn
Help Team Ninja

ChiarynMiranda

Hey David,

Great question! Unfortunately, there isn't a way to pull this data through Followerwonk. I'm sorry about that!

I would recommend submitting a request for this data to our Feature Request forum. Keep in mind that we would only be able to track this information for accounts that are already being tracked in Followerwonk at the time one user starts following the other. So, if someone started following you before you were tracking your Twitter account in Followerwonk, we wouldn't be able to tell you when they started following you.

Here's the forum we use to collect feature ideas:
http://seomoz.zendesk.com/forums/293194-seomoz-pro-feature-requests

I hope this helps. Please let me know if you have any other questions.

Chiaryn
Help Team Ninja

