Skip to content
    Moz logo Menu open Menu close
    • Products
      • Moz Pro
      • Moz Pro Home
      • Moz Local
      • Moz Local Home
      • STAT
      • Moz API
      • Moz API Home
      • Compare SEO Products
      • Moz Data
    • Free SEO Tools
      • Domain Analysis
      • Keyword Explorer
      • Link Explorer
      • Competitive Research
      • MozBar
      • More Free SEO Tools
    • Learn SEO
      • Beginner's Guide to SEO
      • SEO Learning Center
      • Moz Academy
      • MozCon
      • Webinars, Whitepapers, & Guides
    • Blog
    • Why Moz
      • Digital Marketers
      • Agency Solutions
      • Enterprise Solutions
      • Small Business Solutions
      • The Moz Story
      • New Releases
    • Log in
    • Log out
    • Products
      • Moz Pro

        Your all-in-one suite of SEO essentials.

      • Moz Local

        Raise your local SEO visibility with complete local SEO management.

      • STAT

        SERP tracking and analytics for enterprise SEO experts.

      • Moz API

        Power your SEO with our index of over 44 trillion links.

      • Compare SEO Products

        See which Moz SEO solution best meets your business needs.

      • Moz Data

        Power your SEO strategy & AI models with custom data solutions.

      Track AI Overviews in Keyword Research
      Moz Pro

      Track AI Overviews in Keyword Research

      Try it free!
    • Free SEO Tools
      • Domain Analysis

        Get top competitive SEO metrics like DA, top pages and more.

      • Keyword Explorer

        Find traffic-driving keywords with our 1.25 billion+ keyword index.

      • Link Explorer

        Explore over 40 trillion links for powerful backlink data.

      • Competitive Research

        Uncover valuable insights on your organic search competitors.

      • MozBar

        See top SEO metrics for free as you browse the web.

      • More Free SEO Tools

        Explore all the free SEO tools Moz has to offer.

      NEW Keyword Suggestions by Topic
      Moz Pro

      NEW Keyword Suggestions by Topic

      Learn more
    • Learn SEO
      • Beginner's Guide to SEO

        The #1 most popular introduction to SEO, trusted by millions.

      • SEO Learning Center

        Broaden your knowledge with SEO resources for all skill levels.

      • On-Demand Webinars

        Learn modern SEO best practices from industry experts.

      • How-To Guides

        Step-by-step guides to search success from the authority on SEO.

      • Moz Academy

        Upskill and get certified with on-demand courses & certifications.

      • MozCon

        Save on Early Bird tickets and join us in London or New York City

      Access 20 years of data with flexible pricing
      Moz API

      Access 20 years of data with flexible pricing

      Find your plan
    • Blog
    • Why Moz
      • Digital Marketers

        Simplify SEO tasks to save time and grow your traffic.

      • Small Business Solutions

        Uncover insights to make smarter marketing decisions in less time.

      • Agency Solutions

        Earn & keep valuable clients with unparalleled data & insights.

      • Enterprise Solutions

        Gain a competitive edge in the ever-changing world of search.

      • The Moz Story

        Moz was the first & remains the most trusted SEO company.

      • New Releases

        Get the scoop on the latest and greatest from Moz.

      Surface actionable competitive intel
      New Feature

      Surface actionable competitive intel

      Learn More
    • Log in
      • Moz Pro
      • Moz Local
      • Moz Local Dashboard
      • Moz API
      • Moz API Dashboard
      • Moz Academy
    • Avatar
      • Moz Home
      • Notifications
      • Account & Billing
      • Manage Users
      • Community Profile
      • My Q&A
      • My Videos
      • Log Out

    The Moz Q&A Forum

    • Forum
    • Questions
    • Users
    • Ask the Community

    Welcome to the Q&A Forum

    Browse the forum for helpful insights and fresh discussions about all things SEO.

    1. Home
    2. SEO Tactics
    3. Intermediate & Advanced SEO
    4. XML Sitemap Index Percentage (Large Sites)

    Moz Q&A is closed.

    After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.

    XML Sitemap Index Percentage (Large Sites)

    Intermediate & Advanced SEO
    2
    2
    1397
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as question
    Log in to reply
    This topic has been deleted. Only users with question management privileges can see it.
    • danng
      danng last edited by

      Hi all

      I'm wanting to find out from those who have experience dealing with large sites (10s/100s of millions of pages).

      What's a typical (or highest) percentage of indexed pages vs. submitted pages you've seen? This information can be found in webmaster tools where Google shows you the pages submitted & indexed for each of your sitemap.

      I'm trying to figure out whether,

      • The average index % out there
      • There is a ceiling (i.e. will never reach 100%)
      • It's possible to improve the indexing percentage further

      Just to give you some background, sitemap index files (according to schema.org) have been implemented to improve crawl efficiency and I'm wanting to find out other ways to improve this further.

      I've been thinking about looking at the URL parameters to exclude as there are hundreds (e-commerce site) to help Google improve crawl efficiency and utilise the daily crawl quote more effectively to discover pages that have not been discovered yet.

      However, I'm not sure yet whether this is the best path to take or I'm just flogging a dead horse if there is such a ceiling or if I'm already at the average ballpark for large sites.

      Any suggestions/insights would be appreciated. Thanks.

      1 Reply Last reply Reply Quote 0
      • Sam_McRoberts
        Sam_McRoberts last edited by

        I've worked on a site that was ~100 million pages, and I've seen indexation percentages ranging from 8% to 95%. When dealing with sites this size, there are so, so many issues at play, and there are so few sites of this size that finding an average probably won't do you much good.

        Rather than focusing on whether or not you have enough pages indexed based on averages, you should focus on two key questions: "do my sitemaps only include pages that would make great search engine entry pages" and "have I done everything possible to eliminate junk pages that are wasting crawl bandwidth."

        Of course, making sure you don't have any duplicate content, thin content, or poor on-site optimization issues should also be a focus.

        I guess what I'm trying to say is, I believe any site can have 100% of it's search entry worthy pages indexed, but sites of that size rarely have ALL of their pages indexed since sites that large often have a ton of pages that don't make great search results.

        1 Reply Last reply Reply Quote 0
        • 1 / 1
        • First post
          Last post

        Got a burning SEO question?

        Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.


        Start my free trial


        Browse Questions

        Explore more categories

        • Moz Tools

          Chat with the community about the Moz tools.

        • SEO Tactics

          Discuss the SEO process with fellow marketers

        • Community

          Discuss industry events, jobs, and news!

        • Digital Marketing

          Chat about tactics outside of SEO

        • Research & Trends

          Dive into research and trends in the search industry.

        • Support

          Connect on product support and feature requests.

        • See all categories

        Related Questions

        • TyEl

          XML sitemap generator only crawling 20% of my site

          Hi guys, I am trying to submit the most recent XML sitemap but the sitemap generator tools are only crawling about 20% of my site.  The site carries around 150 pages and only 37 show up on tools like xml-sitemaps.com. My goal is to get all the important URLs we care about into the XML sitemap. How should I go about this? Thanks

          Intermediate & Advanced SEO | | TyEl
          0
        • cuarto715

          Google Is Indexing my 301 Redirects to Other sites

          Long story but now i have a few links from my site 301 redirecting to youtube videos or eCommerce stores. They carry a considerable amount of traffic that i benefit from so i can't take them down, and that traffic is people from other websites, so basically i have backlinks from places that i don't own, to my redirect urls (Ex. http://example.com/redirect) My problem is that google is indexing them and doesn't let them go, i have tried blocking that url from robots.txt but google is still indexing it uncrawled, i have also tried allowing google to crawl it and adding noindex from robots.txt, i have tried removing it from GWT but it pops back again after a few days. Any ideas? Thanks!

          Intermediate & Advanced SEO | | cuarto715
          0
        • McTaggart

          Why do people put xml sitemaps in subfolders? Why not just the root? What's the best solution?

          Just read this: "The location of a Sitemap file determines the set of URLs that can be included in that Sitemap. A Sitemap file located at http://example.com/catalog/sitemap.xml can include any URLs starting with http://example.com/catalog/ but can not include URLs starting with http://example.com/images/." here: http://www.sitemaps.org/protocol.html#location Yet surely it's better to put the sitemaps at the root so you have:
          (a) http://example.com/sitemap.xml 
          http://example.com/sitemap-chocolatecakes.xml
          http://example.com/sitemap-spongecakes.xml 
          and so on... OR this kind of approach - 
          (b) http://example/com/sitemap.xml
          http://example.com/sitemap/chocolatecakes.xml and 
          http://example.com/sitemap/spongecakes.xml I would tend towards (a) rather than (b) - which is the best option? Also, can I keep the structure the same for sitemaps that are subcategories of other sitemaps - for example - for a subcategory of http://example.com/sitemap-chocolatecakes.xml I might create http://example.com/sitemap-chocolatecakes-cherryicing.xml - or should I add a sub folder to turn it into http://example.com/sitemap-chocolatecakes/cherryicing.xml Look forward to reading your comments - Luke

          Intermediate & Advanced SEO | | McTaggart
          0
        • ang

          URL structure change and xml sitemap

          At the end of April we changed the url structure of most of our pages and 301 redirected the old pages to the new ones. The xml sitemaps were also updated at that point to reflect the new url structure. Since then Google has not indexed the new urls from our xml sitemaps and I am unsure of why. We are at 4 weeks since the change, so I would have thought they would have indexed the pages by now. Any ideas on what I should check to make sure pages are indexed?

          Intermediate & Advanced SEO | | ang
          0
        • Allie_Williams

          Priority Attribute in XML Sitemaps - Still Valid?

          Is the priority value (scale of 0-1) used for each URL in an XML sitemap still a valid way of communicating to search engines which content you (the webmaster) believe is more important relative to other content on your site? I recall hearing that this was no longer used, but can't find a source. If it is no longer used, what are the easiest ways to communicate our preferences to search engines? Specifically, I'm looking to preference the most version version of a product's documentation (version 9) over the previous version (version 8). Thanks!

          Intermediate & Advanced SEO | | Allie_Williams
          0
        • MedGroupMedia

          Removing index.php

          I have question for the community and whether or not this is a good or bad idea. I currently have a Joomla site that displays www.domain.com/index.php in all the URLs with the exception of the home page.  I have read that it's better to not have index.php showing in the URL at all.  Does it really matter if I have index.php in my URL?  I've read that it is a bad practice. I am thinking about installing the sh404SEF component on my site and removing the index.php.  However, I rank pretty high for the keywords I want in Google, Bing and Yahoo.  All of the URLs that show up in the searches have index.php as part of the URL. Has anyone ever used sh404SEF to remove the index.php and how did you overcome not loosing your search engine links?  I don't want an existing search showing www.domain.com/index.php/sales and it not linking to the correct page which would now be www.domain.com/sales.  I guess I could insert the proper redirects in the htaccess file.  But I was hoping to avoid having every page of my site in the htaccess file for redirecting. Any help or advice appreciated.

          Intermediate & Advanced SEO | | MedGroupMedia
          0
        • Lakshdeep

          XML Sitemap index within a XML sitemaps index

          We have a similar problem to http://www.seomoz.org/q/can-a-xml-sitemap-index-point-to-other-sitemaps-indexes Can a XML sitemap index point to other sitemaps indexes? According to the "Unique Doll Clothing" example on this link, it seems possible http://www.seomoz.org/blog/multiple-xml-sitemaps-increased-indexation-and-traffic Can someone share an XML Sitemap index within a XML sitemaps index example? We are looking for the format to implement the same on our website.

          Intermediate & Advanced SEO | | Lakshdeep
          0
        • James77

          Is it possible to Spoof Analytics to give false Unique Visitor Data for Site A to Site B

          Hi, We are working as a middle man between our client (website A) and another website (website B) where, website B is going to host a section around websites A products etc. The deal is that Website A (our client) will pay Website B based on the number of unique visitors they send them. As the middle man we are in charge of monitoring the number of Unique visitors sent though and are going to do this by monitoring Website A's analytics account and checking the number of Unique visitors sent. The deal is worth quite a lot of money, and as the middle man we are responsible for making sure that no funny business goes on (IE false visitors etc). So to make sure we have things covered - What I would like to know is 1/. Is it actually possible to fool analytics into reporting falsely high unique visitors from Webpage A to Site B (And if so how could they do it). 2/. What could we do to spot any potential abuse (IE is there an easy way to spot that these are spoofed visitors). Many thanks in advance

          Intermediate & Advanced SEO | | James77
          0

        Get started with Moz Pro!

        Unlock the power of advanced SEO tools and data-driven insights.

        Start my free trial
        Products
        • Moz Pro
        • Moz Local
        • Moz API
        • Moz Data
        • STAT
        • Product Updates
        Moz Solutions
        • SMB Solutions
        • Agency Solutions
        • Enterprise Solutions
        • Digital Marketers
        Free SEO Tools
        • Domain Authority Checker
        • Link Explorer
        • Keyword Explorer
        • Competitive Research
        • Brand Authority Checker
        • Local Citation Checker
        • MozBar Extension
        • MozCast
        Resources
        • Blog
        • SEO Learning Center
        • Help Hub
        • Beginner's Guide to SEO
        • How-to Guides
        • Moz Academy
        • API Docs
        About Moz
        • About
        • Team
        • Careers
        • Contact
        Why Moz
        • Case Studies
        • Testimonials
        Get Involved
        • Become an Affiliate
        • MozCon
        • Webinars
        • Practical Marketer Series
        • MozPod
        Connect with us

        Contact the Help team

        Join our newsletter
        Moz logo
        © 2021 - 2025 SEOMoz, Inc., a Ziff Davis company. All rights reserved. Moz is a registered trademark of SEOMoz, Inc.
        • Accessibility
        • Terms of Use
        • Privacy

        Looks like your connection to Moz was lost, please wait while we try to reconnect.