Moz Q&A is closed.
After more than 13 years and tens of thousands of questions, Moz Q&A closed on 12th December 2024. We are not removing the content completely, and many posts remain viewable, but both new posts and new replies are locked.
XML sitemap generator only crawling 20% of my site
-
Hi guys,
I am trying to submit the most recent XML sitemap but the sitemap generator tools are only crawling about 20% of my site. The site carries around 150 pages and only 37 show up on tools like xml-sitemaps.com. My goal is to get all the important URLs we care about into the XML sitemap.
How should I go about this?
Thanks
-
I believe it's not a significant issue if the sitemap covers the core framework of your website. As long as the sitemap is well organized, omitting a few internal pages is acceptable, since Googlebot can still reach those pages by following internal links from the pages that are listed. Take a look at the example page (https://convowear.in), which also excludes some pages from its sitemap without affecting how the site is crawled.
-
Yes, Yoast on WordPress works fine for sitemap generation. I would also recommend it; I use it on all of my blog sites.
-
If you are using WordPress, I would recommend the Yoast plugin. It generates the sitemap automatically and regularly. I also use it on my blog.
-
I'm using the Yoast SEO plugin for my website. It generates the sitemap automatically.
-
My new waterproof tent reviews blog is facing the crawling problem. How can I fix that?
-
Use Yoast or Rank Math to fix it.
SEO training in Isfahan: https://faneseo.com/seo-training-in-isfahan/
-
Patrick wrote a list of reasons why Screaming Frog might not be crawling certain pages here: https://moz.com/community/q/screamingfrog-won-t-crawl-my-site#reply_300029.
Hopefully that list can help you figure out your site's specific issue.
-
This doesn't really answer my question: why am I not able to get all of my links into the XML sitemap when using XML sitemap generators?
-
I think it's not a big deal if the sitemap covers the main structure of your site. If your sitemap is well structured, then missing some internal pages is acceptable, because Googlebot can still reach those pages by following internal links from the pages that are listed. You can see the following page, which also doesn't cover all of its pages, yet this has no effect on how the site is crawled.
-
Thanks Boyd, but unfortunately I am still missing a good chunk of URLs here and I am wondering why. Do these generators rely on internal links in order to find the pages?
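On the internal-links question: crawler-based generators typically start at one URL and follow the links they find, so a page that nothing links to is never discovered. The sketch below illustrates that behaviour under stated assumptions. The `discover` function, the injected `fetch` callable, and the three-page `SITE` dictionary are all made up for illustration; this is not how xml-sitemaps.com or any specific tool is implemented.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def discover(start_url, fetch):
    """Breadth-first crawl from start_url, following same-host links only.

    fetch(url) is a hypothetical callable that returns the page HTML,
    or None when the page cannot be retrieved.
    Returns the set of URLs the crawl discovered.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        parser = LinkCollector()
        parser.feed(html)
        for href in parser.hrefs:
            # Resolve relative links and drop any #fragment.
            link, _fragment = urldefrag(urljoin(url, href))
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return seen


# Made-up three-page site: /orphan exists but nothing links to it.
SITE = {
    "https://example.com/": '<a href="/about">About</a>',
    "https://example.com/about": '<a href="/">Home</a>',
    "https://example.com/orphan": "<p>No inbound links</p>",
}

found = discover("https://example.com/", SITE.get)
missed = sorted(set(SITE) - found)
print(missed)
```

In this toy site the crawl reaches the home page and /about but never sees /orphan. Comparing a generator's output against a full list of the pages you expect is one way to spot pages that lack internal links.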
-
Use Screaming Frog to crawl your site. It is free to download the software and you can use the free version to crawl up to 500 URLs.
After it crawls your site you can click on the Sitemaps tab and generate an XML sitemap file to use.
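If a crawler still misses pages, another option is to build the sitemap file from a list of URLs you already have, for example an export from your CMS. The following is a minimal sketch using only the Python standard library. The `build_sitemap` function name and the example URLs are assumptions for illustration; the namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as bytes) with one <url><loc> entry per unique URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in sorted(set(urls)):
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


# Hypothetical list of the pages you want indexed.
pages = [
    "https://example.com/",
    "https://example.com/about",
    "https://example.com/orphan",
]

xml_bytes = build_sitemap(pages)
print(xml_bytes.decode("utf-8"))
```

Saving the output as sitemap.xml and submitting it lets you list pages a crawler cannot reach, although adding internal links to those pages is still worthwhile.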