The Moz Q&A Forum

    Moz Q&A is closed.

    After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content (many posts will still be possible to view), we have locked both new posts and new replies.

    Would you rate-control Googlebot? How much crawling is too much crawling?

    Intermediate & Advanced SEO
    • lzhao

      One of our sites is very large: over 500M pages. Google has indexed 1/8th of the site and tends to crawl between 800k and 1M pages per day.

      A few times a year, Google significantly increases its crawl rate, hitting 2M pages per day or more overnight. This creates big problems for us, because at 1M pages per day Google consumes 70% of our API capacity, and the API overall is at 90% capacity. At 2M pages per day, 20% of our page requests return 500 errors.

      I've lobbied for an investment in, or overhaul of, the API configuration to allow for more Googlebot bandwidth without compromising user experience. My tech team counters that it's a wasted investment, as Google will crawl to our capacity whatever that capacity is.
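
To put the capacity figures above in one place, here is a rough back-of-the-envelope sketch. It assumes API load scales linearly with crawl rate and that non-Google traffic stays fixed; neither assumption comes from the post, and the function names are made up for illustration. It does not try to reproduce the observed 20% error rate, only to show how far 2M pages/day overshoots.

```python
# Figures quoted in the post above.
BASELINE_CRAWL = 1_000_000   # pages/day
GOOGLE_SHARE = 0.70          # share of API capacity used by Googlebot at the baseline
TOTAL_UTILIZATION = 0.90     # overall API utilization at the baseline


def demand_at(pages_per_day: float) -> float:
    """Total API demand as a fraction of capacity (assumes linear scaling)."""
    other_traffic = TOTAL_UTILIZATION - GOOGLE_SHARE  # non-Google load, held fixed
    google = GOOGLE_SHARE * pages_per_day / BASELINE_CRAWL
    return other_traffic + google


def max_crawl_within_capacity() -> float:
    """Largest crawl rate (pages/day) that keeps total demand at or below 100%."""
    other_traffic = TOTAL_UTILIZATION - GOOGLE_SHARE
    return BASELINE_CRAWL * (1.0 - other_traffic) / GOOGLE_SHARE


if __name__ == "__main__":
    print(f"Demand at 2M pages/day: {demand_at(2_000_000):.0%} of capacity")
    print(f"Crawl rate that fills capacity: {max_crawl_within_capacity():,.0f} pages/day")
```

Under these assumptions, capacity is exhausted at roughly 1.14M pages/day, so even a modest bump over the usual 1M leaves no headroom.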

      Questions to Enterprise SEOs:

      • Is there any validity to the tech team's claim? I thought Google's crawl rate was based on a combination of PageRank and the frequency of page updates. That suggests there is some upper limit, which we perhaps haven't reached yet, at which the crawl rate would stabilize.

      • We've asked Google to rate-limit our crawl rate in the past. Is that harmful? I've always looked at a robust crawl rate as a good problem to have.

      • Is 1.5M Googlebot API calls a day desirable, or something any reasonable Enterprise SEO would seek to throttle back?

      • What about setting a longer refresh rate in the sitemaps? Would that reduce the daily crawl demand? We could increase it to a month, but at 500M pages Google could still have a ball at the 2M pages/day rate.
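
The last question can be checked with simple division. The sketch below uses only the 500M and 2M figures from the post, plus a 30-day month as an assumption; the helper names are hypothetical. Whether Google honors a sitemap refresh hint at all is a separate question that this arithmetic does not settle.

```python
def days_for_full_pass(total_pages: int, pages_per_day: int) -> float:
    """Days needed to fetch every page once at a steady crawl rate."""
    return total_pages / pages_per_day


def rate_for_refresh(total_pages: int, refresh_days: int) -> float:
    """Pages/day a crawler would need to revisit every page once per interval."""
    return total_pages / refresh_days


if __name__ == "__main__":
    total = 500_000_000
    print(f"Full pass at 2M/day: {days_for_full_pass(total, 2_000_000):.0f} days")
    print(f"Rate implied by a 30-day refresh: {rate_for_refresh(total, 30):,.0f} pages/day")
```

A monthly refresh of all 500M pages would imply far more than 2M fetches a day, so a longer interval on its own would not cap demand below the rate that already causes trouble.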

      Thanks

      • CraigBradford

        I agree with Matt that the number of pages can probably be reduced, but that aside, how much of an issue this is comes down to which pages aren't being indexed. It's hard to advise without seeing the site; are you able to share the domain? If the site has been around for a long time, that seems a low level of indexation. Is this a site where the age of the content matters, like Craigslist?

        Craig

        • lzhao
          lzhao @MattAntonino last edited by

          Thanks for your response. I get where you're going with that (ecomm store gone bad). It's not actually an ecomm site, FWIW. And I do restrict parameters; the list is about a page and a half long. It's a legitimately large site.

          You're correct: I don't want Google to crawl the full 500M. But I do want them to crawl 100M. At the current crawl rate we limit them to, it's going to take Google more than 3 months to get to each page a single time. I'd actually like to let them crawl 3M pages a day. Is that an insane amount of Googlebot bandwidth? Does anyone else have a similar situation?
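
For a sense of scale, the numbers above can be turned into days per pass and an average request rate. This is only a sketch: the 1M pages/day value for the current limit is an assumption taken from the top of the range in the original question, the reply itself does not state it, and real crawling is bursty rather than evenly spread over the day.

```python
SECONDS_PER_DAY = 86_400


def days_to_crawl(pages: int, pages_per_day: int) -> float:
    """Days for a single pass over `pages` at a steady crawl rate."""
    return pages / pages_per_day


def average_requests_per_second(pages_per_day: int) -> float:
    """Mean request rate if the daily crawl were spread evenly over 24 hours."""
    return pages_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    target = 100_000_000
    for rate in (1_000_000, 3_000_000):  # assumed current limit vs. desired rate
        print(
            f"{rate:,} pages/day: {days_to_crawl(target, rate):.1f} days per pass, "
            f"about {average_requests_per_second(rate):.1f} requests/second on average"
        )
```

At the assumed 1M pages/day, one pass over 100M pages takes 100 days, which matches the "more than 3 months" estimate; at 3M pages/day it drops to about a month.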

          • MattAntonino

            Gosh, that's a HUGE site. Are you having Google crawl parameter pages with that? If so, that's a bigger issue.

            I can't imagine the crawl issues with 500M pages. A site:amazon.com search only returns 200M results, and ebay.com returns 800M, so your site is somewhere in between those two? (I understand both probably have a lot more pages that just aren't returned as indexed.)

            You always WANT a full site crawl, but your techs do have a point. Unless there's an absolutely necessary reason to have 500M indexed pages, I'd also seek to cut that down to what you want indexed. That sounds like a nightmare ecommerce store gone bad.
