Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
XML sitemap generator only crawling 20% of my site
-
Hi guys,
I am trying to submit the most recent XML sitemap but the sitemap generator tools are only crawling about 20% of my site. The site carries around 150 pages and only 37 show up on tools like xml-sitemaps.com. My goal is to get all the important URLs we care about into the XML sitemap.
How should I go about this?
Thanks
-
I believe it's not a significant issue if the sitemap encompasses the core framework of your website. As long as the sitemap is well-organized, omitting a few internal pages is acceptable since Googlebot will crawl all pages based on the sitemap. Take a look at the <a href="https://convowear.in">example page</a> that also excludes some pages, yet it doesn't impact the site crawler's functionality.
-
Yes Yoast on WordPress works fine for sitemap generation. I would also recommend that. Using on all of my blog sites.
-
If you are using WordPress then I would recommend to use Yoast plugin. It generates sitemap automatically regularly. I am also using it on my blog.
-
I'm using Yoast SEO plugin for my website. It generates the Sitemap automatically.
-
My new waterproof tent reviews blog facing the crawling problem. How can I fix that?
-
use Yoast or rankmath ot fix it
آموزش سئو در اصفهان https://faneseo.com/seo-training-in-isfahan/
-
Patrick wrote a list of reasons why Screaming Frog might not be crawling certain pages here: https://moz.com/community/q/screamingfrog-won-t-crawl-my-site#reply_300029.
Hopefully that list can help you figure out your site's specific issue.
-
This doesn't really answer my question of why I am not able to get all links into the XML sitemap when using xml sitemap generators.
-
I think it's not a big deal if the sitemap covers the main structure of your site. If your sitemap is constructed in a really decent structure, then missing some internal pages are acceptable because Googlebot will crawl all of your pages based on your site map. You can see the following page which also doesn't cover all of its pages, but there's no influence in terms of site crawler.
-
Thanks Boyd but unfortunately I am still missing a good chunk of URLs here and I am wondering why? Do those check on internal links in order to find these pages?
-
Use Screaming Frog to crawl your site. It is free to download the software and you can use the free version to crawl up to 500 URLs.
After it crawls your site you can click on the Sitemaps tab and generate an XML sitemap file to use.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Spotify XML Sitemap
All, Working on an SEO work up for a Spotify site. Looks like they are using a sitemap that links to additional pages. A problem, none of the links are actually linked within the sitemap. This feels like a strong error. https://lubricitylabs.com/sitemap.xml Thoughts?
Intermediate & Advanced SEO | | dmaher0 -
This url is not allowed for a Sitemap at this location error using pro-sitemaps.com
Hey, guys, We are using the pro-sitemaps.com tool to automate our sitemaps on our properties, but some of them give this error "This url is not allowed for a Sitemap at this location" for all the urls. Strange thing is that not all of them are with the error and most have all the urls indexed already. Do you have any experience with the tool and what is your opinion? Thanks
Intermediate & Advanced SEO | | lgrozeva0 -
Why do people put xml sitemaps in subfolders? Why not just the root? What's the best solution?
Just read this: "The location of a Sitemap file determines the set of URLs that can be included in that Sitemap. A Sitemap file located at http://example.com/catalog/sitemap.xml can include any URLs starting with http://example.com/catalog/ but can not include URLs starting with http://example.com/images/." here: http://www.sitemaps.org/protocol.html#location Yet surely it's better to put the sitemaps at the root so you have:
Intermediate & Advanced SEO | | McTaggart
(a) http://example.com/sitemap.xml
http://example.com/sitemap-chocolatecakes.xml
http://example.com/sitemap-spongecakes.xml
and so on... OR this kind of approach -
(b) http://example/com/sitemap.xml
http://example.com/sitemap/chocolatecakes.xml and
http://example.com/sitemap/spongecakes.xml I would tend towards (a) rather than (b) - which is the best option? Also, can I keep the structure the same for sitemaps that are subcategories of other sitemaps - for example - for a subcategory of http://example.com/sitemap-chocolatecakes.xml I might create http://example.com/sitemap-chocolatecakes-cherryicing.xml - or should I add a sub folder to turn it into http://example.com/sitemap-chocolatecakes/cherryicing.xml Look forward to reading your comments - Luke0 -
Robots.txt - Do I block Bots from crawling the non-www version if I use www.site.com ?
my site uses is set up at http://www.site.com I have my site redirected from non- www to the www in htacess file. My question is... what should my robots.txt file look like for the non-www site? Do you block robots from crawling the site like this? Or do you leave it blank? User-agent: * Disallow: / Sitemap: http://www.morganlindsayphotography.com/sitemap.xml Sitemap: http://www.morganlindsayphotography.com/video-sitemap.xml
Intermediate & Advanced SEO | | morg454540 -
Where is the best place to put a sitemap for a site with local content?
I have a simple site that has cities as subdirectories (so URL is root/cityname). All of my content is localized for the city. My "root" page simply links to other cities. I very specifically want to rank for "topic" pages for each city and I'm trying to figure out where to put the sitemap so Google crawls everything most efficiently. I'm debating the following options, which one is better? Put the sitemap on the footer of "root" and link to all popular pages across cities. The advantage here is obviously that the links are one less click away from root. Put the sitemap on the footer of "city root" (e.g. root/cityname) and include all topics for that city. This is how Yelp does it. The advantage here is that the content is "localized" but the disadvantage is it's further away from the root. Put the sitemap on the footer of "city root" and include all topics across all cities. That way wherever Google comes into the site they'll be close to all topics I want to rank for. Thoughts? Thanks!
Intermediate & Advanced SEO | | jcgoodrich0 -
XML Sitemap for classifieds
I have seeon some trends for sites which do not even use XML sitemp and robots e.g. see this site. How do you see if sitemap is not used. Also for classified websites, should ad pages be included in sitemap because after certain duration those ads will be deleted and google might not be able to crawl. What do you suggest about XML sitemap for classified website.
Intermediate & Advanced SEO | | MozAddict0 -
Is it possible to Spoof Analytics to give false Unique Visitor Data for Site A to Site B
Hi, We are working as a middle man between our client (website A) and another website (website B) where, website B is going to host a section around websites A products etc. The deal is that Website A (our client) will pay Website B based on the number of unique visitors they send them. As the middle man we are in charge of monitoring the number of Unique visitors sent though and are going to do this by monitoring Website A's analytics account and checking the number of Unique visitors sent. The deal is worth quite a lot of money, and as the middle man we are responsible for making sure that no funny business goes on (IE false visitors etc). So to make sure we have things covered - What I would like to know is 1/. Is it actually possible to fool analytics into reporting falsely high unique visitors from Webpage A to Site B (And if so how could they do it). 2/. What could we do to spot any potential abuse (IE is there an easy way to spot that these are spoofed visitors). Many thanks in advance
Intermediate & Advanced SEO | | James770 -
Online Sitemap Generator
I have a site that has around 5,000 pages now. Are there any recommened online free/paid tools to generate a sitemap for me?
Intermediate & Advanced SEO | | rhysmaster0