The Moz Q&A Forum

    Moz Q&A is closed.

    After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.

    Exclude status codes in Screaming Frog

    Technical SEO
    • DonnaDuncan
      DonnaDuncan last edited by

      I have a very large ecommerce site I'm trying to spider using Screaming Frog. The problem is that the crawl keeps hanging, even though I have turned off the high memory safeguard under Configuration.

      The site has approximately 190,000 pages according to the results of a Google site: command.

      • The site architecture is almost completely flat. Limiting the search by depth is a possibility, but it will take quite a bit of manual labor as there are literally hundreds of directories one level below the root.
      • There are many, many duplicate pages. I've been able to exclude some of them from being crawled using the exclude configuration parameters.
      • There are thousands of redirects. I haven't been able to exclude those from the spider because they don't have a distinguishing character string in their URLs.
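For readers following along, here is a minimal sketch of how URL-based exclude rules of the kind mentioned above could be sanity-checked before a crawl. It assumes the exclude patterns are regular expressions that have to match the whole URL; the domain, the `/print/` directory and the `sort` parameter are hypothetical and not taken from the site in question.

```python
import re

# Hypothetical exclude patterns; the domain, directory and parameter
# names are made up for illustration only.
EXCLUDE_PATTERNS = [
    r"https?://www\.example\.com/print/.*",  # everything under /print/
    r".*\?sort=.*",                          # any URL carrying a sort parameter
]


def is_excluded(url, patterns=EXCLUDE_PATTERNS):
    """Return True if any pattern matches the whole URL."""
    return any(re.fullmatch(pattern, url) for pattern in patterns)


urls = [
    "http://www.example.com/print/lamp.html",
    "http://www.example.com/lamps/lamp.html",
    "http://www.example.com/lamps?sort=price",
]

# Keep only the URLs that no exclude pattern matches.
kept = [url for url in urls if not is_excluded(url)]
print(kept)
```

Testing patterns against a handful of known URLs like this only helps with duplicates that share a recognisable string; it does nothing for redirects whose URLs look like any other page.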

      Does anyone know how to exclude files using status codes? I know that would help.

      If it helps, the site is kodylighting.com.

      Thanks in advance for any guidance you can provide.

      • CHAD215
        CHAD215 last edited by

        Thanks for your help. It was simply that the exclude had to be set before the crawl began and could not be changed during the crawl. Hopefully this changes, because sometimes during a crawl you find things you want to exclude that you did not know existed beforehand.

        • MickEdwards
          MickEdwards @CHAD215 last edited by

          Are you sure it's just on Mac? Have you tried on PC? Do you have any other rules in include, or perhaps a conflicting rule in exclude? Try running a single exclude rule, and also try it on another small site to test.

          Also from support if failing on all fronts:

          • Mac version, please make sure you have the most up to date version of the OS which will update Java.
          • Please uninstall, then reinstall the spider ensuring you are using the latest version and try again.

          To be sure - http://www.youtube.com/watch?v=eOQ1DC0CBNs

          • CHAD215
            CHAD215 last edited by

            Does the exclude function work on Mac? I have tried every possible way to exclude folders and have not been successful while running an analysis.

            • DonnaDuncan
              DonnaDuncan @MickEdwards last edited by

              That's exactly the problem: the redirects are dispersed randomly throughout the site. Although the job is still running, it now appears as though there's almost a one-to-one correlation between pages and redirects on the site.

              I also heard from Dan Sharp via Twitter. He said "You can't, as we'd have to crawl a URL to see the status code 😉 You can right click and remove after though!"
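Since the status code is only known once a URL has been crawled, the filtering has to happen afterwards. Besides right-clicking and removing rows in the tool, one option is to filter an exported crawl file. The sketch below assumes a CSV export with `Address` and `Status Code` columns; those column names and the sample rows are assumptions for illustration, so check them against your own export.

```python
import csv
import io


def drop_redirects(rows, status_field="Status Code"):
    """Yield only the rows whose status code is outside the 3xx range."""
    for row in rows:
        code = (row.get(status_field) or "").strip()
        if code.isdigit() and 300 <= int(code) < 400:
            continue  # skip redirected URLs
        yield row


# Made-up sample standing in for an exported crawl file.
SAMPLE = """Address,Status Code
http://www.example.com/,200
http://www.example.com/old-page,301
http://www.example.com/missing,404
"""

kept = list(drop_redirects(csv.DictReader(io.StringIO(SAMPLE))))
print([row["Address"] for row in kept])
```

To run it on a real export, swap the `io.StringIO(SAMPLE)` object for an opened file. Note that this trims the inventory only; the crawler still has to request every redirected URL.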

              Thanks again, Michael. Your thoroughness and follow-through are appreciated.

              • MickEdwards
                MickEdwards @DonnaDuncan last edited by

                Took another look, also looked at documentation/online and don't see any way to exclude URLs from crawl based on response codes.  As I see it you would only want to exclude on name or directory as response code is likely to be random throughout a site and impede a thorough crawl.

                • DonnaDuncan
                  DonnaDuncan @MickEdwards last edited by

                  Thank you Michael.

                  You're right. I was on a 64-bit machine running a 32-bit version of Java. I updated it and the scan has been running for more than 24 hours now without hanging. So thank you.

                  If anyone else knows of a way to exclude files using status codes I'd still like to learn about it. So far the scan is showing me 20,000 redirected files which I'd just as soon not inventory.

                  • MickEdwards
                    MickEdwards last edited by

                    I don't think you can filter out on response codes.

                    However, first I would ensure you are running the right version of Java if you are on a 64-bit machine. The 32-bit version functions, but you cannot increase the memory allocation, which is why you could be running into problems. Take a look at http://www.screamingfrog.co.uk/seo-spider/user-guide/general/ under Memory.
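One rough way to check which Java is installed is to look at the banner printed by `java -version`. The sketch below is a heuristic only: it assumes a 64-bit JVM mentions "64-Bit" in that banner (as HotSpot builds typically do) and that `java` is on the PATH. The function names are made up for illustration.

```python
import subprocess


def is_64bit_jvm(version_text):
    """Heuristic: 64-bit JVM banners usually contain '64-Bit'."""
    return "64-bit" in version_text.lower()


def java_version_text():
    """Return the banner from `java -version` (normally written to stderr)."""
    result = subprocess.run(
        ["java", "-version"], capture_output=True, text=True
    )
    return result.stderr + result.stdout


if __name__ == "__main__":
    try:
        banner = java_version_text()
    except FileNotFoundError:
        print("No java executable found on the PATH.")
    else:
        kind = "64-bit" if is_64bit_jvm(banner) else "probably 32-bit"
        print(f"Installed Java looks {kind}.")
```

If the banner does not mention 64-bit on a 64-bit machine, installing the 64-bit Java build is the first thing to try before raising the memory allocation.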

                    © 2021 - 2026 SEOMoz, Inc., a Ziff Davis company. All rights reserved. Moz is a registered trademark of SEOMoz, Inc.