The Moz Q&A Forum

    Moz Q&A is closed.

    After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.

    Moz & Xenu Link Sleuth unable to crawl a website (403 error)

    Moz Pro
    • ZaddleMarketing

      It could be that I am missing something really obvious, but we are getting the following error when we try to use the Moz tool on a client website. (I have read through a few posts on 403 errors, but none appear to describe the same problem as this.)

      Moz Result

      Title: 403 : Error
      Meta Description: 403 Forbidden
      Meta Robots: Not present/empty
      Meta Refresh: Not present/empty

      Xenu Link Sleuth Result

      Broken links, ordered by link:

      error code: 403 (forbidden request), linked from page(s):
      
      Thanks in advance!
      
      • ChiarynMiranda @ZaddleMarketing

        Hey Liam,

        Thanks for following up. Unfortunately, we use thousands of dynamic IPs through Amazon Web Services to run our crawler, and the IP changes from crawl to crawl. We don't even have a set range for the IPs we use through AWS.

        As for throttling, we don't have a set throttle. We try to space out the server hits enough to not bring down the server, while still hitting it as often as necessary to crawl the full site, or reach the crawl limit, in a reasonable amount of time. It is a balance between hitting the site too hard and having extremely long crawl times. If the devs are worried about how often we hit the server, they can add a crawl delay of 10 to the robots.txt to throttle the crawler. We will respect that delay.
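For anyone wanting to check what such a rule would look like, here is a minimal sketch in Python. It assumes the crawler's robots.txt token is `rogerbot` (the user agent named elsewhere in this thread) and that the delay is set with the widely used, non-standard `Crawl-delay` directive. The robots.txt fragment and the `crawl_delay_for` helper are illustrative only, not taken from the client's site or from Moz.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt fragment: one group for rogerbot with a delay of 10.
ROBOTS_TXT = """\
User-agent: rogerbot
Crawl-delay: 10
"""


def crawl_delay_for(robots_txt: str, user_agent: str):
    """Return the Crawl-delay value that applies to user_agent, or None."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)
```

Calling `crawl_delay_for(ROBOTS_TXT, "rogerbot")` reads the delay back from the fragment, which is a quick way to confirm the rule is written in a form a parser will pick up before handing it to the developers.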

        If the devs use Moz as well, they would also be getting a 403 on their crawl, because the server is blocking our user agent specifically. The server would give the same status code regardless of who has set up the campaign.

        I'm sorry this information isn't more specific. Please let me know if you need any other assistance.

        Chiaryn

        • ZaddleMarketing @ChiarynMiranda

          Hi Chiaryn

          The saga continues... This is the response my client got back from the developers. Please could you let me have the answers to the two questions?

          Apparently, as part of their 'SAF' (?) protocols, if the IT director sees a big spike in 3rd-party products trawling the site, he will block them! They did say that they use Moz too. What they've asked me to get from Moz is:

          • Moz IP address/range
          • Level of throttling they will use

          I would question why they need these answers if THEY USE MOZ themselves, but if I go back with that I will be going around in circles. Any chance of letting me know the answer(s)?

          Thanks in advance.

          Liam

          • ZaddleMarketing @ChiarynMiranda

            Awesome - thank you.

            Kind Regards

            Liam

            • ChiarynMiranda

              Hey There,

              The robots.txt shouldn't really affect 403s; you would actually get a "blocked by robots.txt" error if that were the cause. Your server is basically telling us that we are not authorized to access your site. I agree with Mat that we are most likely being blocked in the .htaccess file. It may be that your server is flagging our crawler and Xenu's crawler as troll crawlers or something along those lines. I ran a test on your URL using a non-existent crawler, Rogerbot with a capital R, and got a 200 status code back, but when I ran the test with our real crawler, rogerbot with a lowercase r, I got the 403 error (http://screencast.com/t/Sv9cozvY2f01). This tells me that the server is specifically blocking our crawler, but not all crawlers in general.
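A test along these lines can be reproduced with a few lines of Python. This is only a sketch of the idea, not the tool used for the screencast: it sends the same request with different `User-Agent` headers and records the status code for each. The helper names are made up for illustration, and the bare `rogerbot` / `Rogerbot` strings are simplified stand-ins rather than the crawler's full user-agent string.

```python
import urllib.error
import urllib.request


def status_for_agent(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code the server sends for this User-Agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the code is still what we want.
        return err.code


def compare_agents(url: str, agents: list[str]) -> dict[str, int]:
    """Map each User-Agent string to the status code it receives."""
    return {agent: status_for_agent(url, agent) for agent in agents}
```

For example, `compare_agents(url, ["rogerbot", "Rogerbot", "Mozilla/5.0"])` against the affected site would show whether only the lowercase agent is refused. If the codes differ by user agent alone, the block is keyed on the user agent rather than on the IP.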

              I hope this helps. Let me know if you have any other questions.

              Chiaryn
              Help Team Ninja

              • ZaddleMarketing

                Hi Mat

                Thanks for the reply - robots.txt file is as follows:

                ## The following are infinitely deep trees
                User-agent: *
                Disallow: /cgi-bin
                Disallow: /cms/events
                Disallow: /cms/latest
                Disallow: /cms/cookieprivacy
                Disallow: /cms/help
                Disallow: /site/services/megamenu/
                Disallow: /site/mobile/
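
As a sanity check, the rules above can be fed to Python's standard robots.txt parser to see what they allow. This is a rough sketch: `is_allowed` is a made-up helper, `rogerbot` is assumed to be the crawler's token, and `/cms/events/2014` is a hypothetical path used only to show a disallowed case.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rules quoted in this post.
ROBOTS_TXT = """\
## The following are infinitely deep trees
User-agent: *
Disallow: /cgi-bin
Disallow: /cms/events
Disallow: /cms/latest
Disallow: /cms/cookieprivacy
Disallow: /cms/help
Disallow: /site/services/megamenu/
Disallow: /site/mobile/
"""


def is_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if the rules let user_agent fetch path."""
    parser = RobotFileParser()
    # parse() works on the text directly, so nothing is fetched.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)
```

With these rules, `is_allowed(ROBOTS_TXT, "rogerbot", "/")` is true, so the home page is not excluded by robots.txt, while a path under `/cms/events` is. That fits the view that the 403 is coming from somewhere other than this file.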
                
                I can't get access to the .htaccess file at present (we're not the developers)
                
                Does anyone else have any thoughts? Weirdly, I can get Screaming Frog info back on the site :-/
                
                • matbennett

                  403s are tricky to diagnose because they, by their very nature, don't tell you much.  They're sort of the server equivalent of just shouting "NO!".

                  You say Moz & Xenu are receiving the 403. I assume that it loads properly from a browser.

                  I'd start by looking at the .htaccess. Any odd deny statements in there? It could be that an IP range or user agent is blocked. Some people like to block common crawlers (not calling Roger names there). Check the robots.txt whilst you are there, although that shouldn't really return a 403.
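
Once someone does have the .htaccess in hand, a quick scan like the sketch below can surface the lines worth reading first. The patterns cover a few common Apache directives used for IP and user-agent blocks (`Deny from`, `Require not`, `SetEnvIf User-Agent`, `BrowserMatch`, and `RewriteCond` on the user agent or remote address). The list is illustrative and far from exhaustive, and the sample rules are hypothetical, not taken from the site in question.

```python
import re

# Directive patterns that often appear in IP or user-agent blocks.
BLOCKING_PATTERNS = [
    re.compile(r"^\s*deny\s+from\b", re.IGNORECASE),
    re.compile(r"^\s*require\s+not\b", re.IGNORECASE),
    re.compile(
        r"^\s*(setenvif(nocase)?\s+user-agent|browsermatch(nocase)?)\b",
        re.IGNORECASE,
    ),
    re.compile(r"^\s*rewritecond\s+%\{(HTTP_USER_AGENT|REMOTE_ADDR)\}", re.IGNORECASE),
]

# Hypothetical .htaccess content, for illustration only.
SAMPLE_HTACCESS = """\
# block bad bots
SetEnvIfNoCase User-Agent "rogerbot" bad_bot
Order Allow,Deny
Allow from all
Deny from env=bad_bot
RewriteEngine On
"""


def find_blocking_rules(htaccess_text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs that look like IP or user-agent blocks."""
    hits = []
    for number, line in enumerate(htaccess_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # comments cannot block anything
        if any(pattern.search(line) for pattern in BLOCKING_PATTERNS):
            hits.append((number, stripped))
    return hits
```

Running `find_blocking_rules(SAMPLE_HTACCESS)` flags the `SetEnvIfNoCase` and `Deny from` lines. A hit is only a pointer to a rule that deserves a closer look, since blocks can also live in the main server config or a firewall.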
