The Moz Q&A Forum


    Moz Q&A is closed.

    After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.

    How Do I Generate a Sitemap for a Large WordPress Site?

    Intermediate & Advanced SEO
    • alloydigital

      Hello Everyone!

      I am working with a WordPress site that is in Google News (i.e., every day we have about 30 new URLs to add to our sitemap). The site has years of articles, resulting in about 200,000 pages. Our strategy so far has been to use a sitemap plugin that only generates the last few months of posts; however, we want to improve our SEO and submit all of the site's URLs to search engines.

      The issue is that the plugins we've looked at generate the sitemap on the fly: when you request the sitemap, the plugin builds it dynamically. Our site is so large that even a single request for our sitemap.xml ties up a lot of server resources and takes an extremely long time to generate (if the request doesn't time out in the process).

      Does anyone have a solution?

      Thanks,

      Aaron

      • FedeEinhorn @ThompsonPaul

        In my case, xml-sitemaps works extremely well. I fully understand that a DB solution would avoid the need to crawl, but the features that I get from xml-sitemaps are worth it.

        I am running my website on a powerful dedicated server with SSDs, so perhaps that's why I'm not having any problems. I also set limits on the generator's memory consumption and activated the feature that saves temp files in case the generation fails.

        • ThompsonPaul @FedeEinhorn

          My concern with recommending xml-sitemaps was that I've always had problems getting good, complete maps of extremely large sites. An internal CMS-based tool grabs pages straight from the database instead of having to crawl for them.

          You've found that it gets you a pretty complete crawl of your 5K-page site, Federico?

          • FedeEinhorn

            I would go with the paid solution of xml-sitemaps.

            You can set how much of the server's resources it may use, and it will store data in temp files to avoid excessive consumption.

            It also offers settings to create large sitemaps using a sitemap_index, and you could get plugins that create the news sitemap automatically by looking for changes since the last sitemap generation.

            I have it running on my site with 5K pages (excluding tag pages), and it takes 10 minutes to crawl.

            Then you also have plugins that create the sitemaps dynamically, like SEO by Yoast, Google XML Sitemaps, etc.

            • ThompsonPaul

              I think the solution to your server resource issue is to create multiple sitemaps, Aaron. Given that the sitemap protocol allows a maximum of 50,000 URLs per sitemap and Google News sitemaps can't be over 1,000 URLs, this was going to be a necessity anyway, so you may as well use these limitations to your advantage.

              There's a feature available for sitemaps called a sitemap index. It lists all the sitemap files you've created, so the search engines can find and crawl them. You put it at the root of the site and then link to it in robots.txt just like a regular sitemap. (You can also submit it in GWT, Google Webmaster Tools.) In fact, Yoast's SEO plugin sitemaps and others already use this feature for their News add-on.
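As a rough illustration of the sitemap index described above, here is a minimal Python sketch that builds one from a list of child sitemap URLs. The function name, the example.com URLs, and the file names are assumptions made for illustration; they do not come from any plugin mentioned in this thread.

```python
from xml.etree import ElementTree as ET

# Namespace defined by the sitemap protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls, lastmod=None):
    """Return a sitemap index document (as a string) listing each child sitemap.

    sitemap_urls: absolute URLs of the individual sitemap files.
    lastmod: optional date string (e.g. "2024-01-10") applied to every entry.
    """
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = url
        if lastmod:
            ET.SubElement(entry, "lastmod").text = lastmod
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical file names: one dynamic News sitemap plus static archives.
    print(build_sitemap_index([
        "https://example.com/sitemap-news.xml",
        "https://example.com/sitemap-archive-1.xml",
        "https://example.com/sitemap-archive-2.xml",
    ]))
```

The output would be saved at the site root (for example as sitemap_index.xml, a hypothetical name) and referenced from robots.txt with a line such as `Sitemap: https://example.com/sitemap_index.xml`.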

              In your case, you could build the News sitemap dynamically to meet its special requirements (up to 1,000 URLs, covering only posts from the last 2 days) and to ensure it's up-to-the-minute accurate, as is critical for news sites.
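A minimal sketch of that selection rule, assuming the posts are available as (url, published_at) pairs with timezone-aware datetimes. The function and constant names are hypothetical, and a real setup would pull the posts from the WordPress database rather than from a Python list.

```python
from datetime import datetime, timedelta, timezone

NEWS_MAX_URLS = 1000                # cap mentioned above for News sitemaps
NEWS_WINDOW = timedelta(days=2)     # only posts from the last 2 days


def select_news_posts(posts, now=None):
    """Pick the posts that belong in the News sitemap.

    posts: iterable of (url, published_at) tuples, published_at timezone-aware.
    Returns the newest posts first, limited to NEWS_MAX_URLS entries.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - NEWS_WINDOW
    recent = [p for p in posts if p[1] >= cutoff]
    recent.sort(key=lambda p: p[1], reverse=True)
    return recent[:NEWS_MAX_URLS]
```

At roughly 30 new URLs a day, the two-day window would hold about 60 entries, so this sitemap stays small and cheap to generate on every request.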

              Then, separately, you would build additional, segmented sitemaps for the existing 200,000 pages. Since these are historical pages, you could easily serve them from static files, as they wouldn't need to update once created. With static files, there'd be no server load to generate them on each request, only the load to generate the current News sitemap. (I'd actually recommend you keep each static sitemap to around 25,000 pages to ensure search engines can crawl them easily.)
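The segmenting step could be sketched along these lines in Python, assuming the full list of archive URLs has already been exported (for example from the database). The function name and the sitemap-archive-N.xml naming are hypothetical choices, not anything a specific plugin produces.

```python
from pathlib import Path
from xml.sax.saxutils import escape


def write_static_sitemaps(urls, out_dir, chunk_size=25000):
    """Split urls into chunks and write each chunk as a static sitemap file.

    Returns the list of file paths written, in order.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for start in range(0, len(urls), chunk_size):
        chunk = urls[start:start + chunk_size]
        # Hypothetical naming scheme: sitemap-archive-1.xml, -2.xml, ...
        path = out_dir / f"sitemap-archive-{start // chunk_size + 1}.xml"
        lines = [
            '<?xml version="1.0" encoding="UTF-8"?>',
            '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
        ]
        # escape() handles &, < and > so URLs with query strings stay valid XML.
        lines += [f"  <url><loc>{escape(u)}</loc></url>" for u in chunk]
        lines.append("</urlset>")
        path.write_text("\n".join(lines) + "\n", encoding="utf-8")
        written.append(path)
    return written
```

With 200,000 URLs and the 25,000-per-file suggestion, this would produce eight static files, which the web server can then serve directly without touching WordPress.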

              This approach would involve a bit of fiddling to initially set up, as you'd need to generate the "archive" sitemaps then convert them to static versions, but once set up, the News sitemap would take care of itself and once a month (or whatever you decide) you'd need to add the "expiring" pages from the News sitemap to the most recent "archive" segment. A smart programmer might even be able to automate that process.

              Does this approach sound like it might solve your problem?

              Paul

              P.S. Since you'd already have the sitemap index capability, you could also add video and image sitemaps to your site if appropriate.

              • jesse-landry

                Have you ever tried using a web-based sitemap generator? Not sure how it would respond to your site but at least it would be running on someone else's server, right?

                Not sure what else to say honestly.
