Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
XML sitemap generator only crawling 20% of my site
-
Hi guys,
I am trying to submit the most recent XML sitemap but the sitemap generator tools are only crawling about 20% of my site. The site carries around 150 pages and only 37 show up on tools like xml-sitemaps.com. My goal is to get all the important URLs we care about into the XML sitemap.
How should I go about this?
Thanks
-
I believe it's not a significant issue if the sitemap encompasses the core framework of your website. As long as the sitemap is well-organized, omitting a few internal pages is acceptable since Googlebot will crawl all pages based on the sitemap. Take a look at the <a href="https://convowear.in">example page</a> that also excludes some pages, yet it doesn't impact the site crawler's functionality.
-
Yes Yoast on WordPress works fine for sitemap generation. I would also recommend that. Using on all of my blog sites.
-
If you are using WordPress then I would recommend to use Yoast plugin. It generates sitemap automatically regularly. I am also using it on my blog.
-
I'm using Yoast SEO plugin for my website. It generates the Sitemap automatically.
-
My new waterproof tent reviews blog facing the crawling problem. How can I fix that?
-
use Yoast or rankmath ot fix it
آموزش سئو در اصفهان https://faneseo.com/seo-training-in-isfahan/
-
Patrick wrote a list of reasons why Screaming Frog might not be crawling certain pages here: https://moz.com/community/q/screamingfrog-won-t-crawl-my-site#reply_300029.
Hopefully that list can help you figure out your site's specific issue.
-
This doesn't really answer my question of why I am not able to get all links into the XML sitemap when using xml sitemap generators.
-
I think it's not a big deal if the sitemap covers the main structure of your site. If your sitemap is constructed in a really decent structure, then missing some internal pages are acceptable because Googlebot will crawl all of your pages based on your site map. You can see the following page which also doesn't cover all of its pages, but there's no influence in terms of site crawler.
-
Thanks Boyd but unfortunately I am still missing a good chunk of URLs here and I am wondering why? Do those check on internal links in order to find these pages?
-
Use Screaming Frog to crawl your site. It is free to download the software and you can use the free version to crawl up to 500 URLs.
After it crawls your site you can click on the Sitemaps tab and generate an XML sitemap file to use.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
My last site crawl shows over 700 404 errors all with void(0 added to the ends of my posts/pages.
Hello, My last site crawl shows over 700 404 errors all with void(0 added to the ends of my posts/pages. I have contacted my theme company but not sure what could have done this. Any ideas? The original posts/pages are still correct and working it just looks like it did duplicates and added void(0 to the end of each post/page. Questions: There is no way to undo this correct? Do I have to do a redirect on each of these? Will this hurt my rankings and domain authority? Any suggestions would be appreciated. Thanks, Wade
Intermediate & Advanced SEO | | neverenoughmusic.com0 -
This url is not allowed for a Sitemap at this location error using pro-sitemaps.com
Hey, guys, We are using the pro-sitemaps.com tool to automate our sitemaps on our properties, but some of them give this error "This url is not allowed for a Sitemap at this location" for all the urls. Strange thing is that not all of them are with the error and most have all the urls indexed already. Do you have any experience with the tool and what is your opinion? Thanks
Intermediate & Advanced SEO | | lgrozeva0 -
Google cache is showing my UK homepage site instead of the US homepage and ranking the UK site in US
Hi There, When I check the cache of the US website (www.us.allsaints.com) Google returns the UK website. This is also reflected in the US Google Search Results when the UK site ranks for our brand name instead of the US site. The homepage has hreflang tags only on the homepage and the domains have been pointed correctly to the right territories via Google Webmaster Console.This has happened before in 26th July 2015 and was wondering if any had any idea why this is happening or if any one has experienced the same issueFDGjldR
Intermediate & Advanced SEO | | adzhass0 -
Where is the best place to put a sitemap for a site with local content?
I have a simple site that has cities as subdirectories (so URL is root/cityname). All of my content is localized for the city. My "root" page simply links to other cities. I very specifically want to rank for "topic" pages for each city and I'm trying to figure out where to put the sitemap so Google crawls everything most efficiently. I'm debating the following options, which one is better? Put the sitemap on the footer of "root" and link to all popular pages across cities. The advantage here is obviously that the links are one less click away from root. Put the sitemap on the footer of "city root" (e.g. root/cityname) and include all topics for that city. This is how Yelp does it. The advantage here is that the content is "localized" but the disadvantage is it's further away from the root. Put the sitemap on the footer of "city root" and include all topics across all cities. That way wherever Google comes into the site they'll be close to all topics I want to rank for. Thoughts? Thanks!
Intermediate & Advanced SEO | | jcgoodrich0 -
Should I noindex the site search page? It is generating 4% of my organic traffic.
I read about some recommendations to noindex the URL of the site search.
Intermediate & Advanced SEO | | lcourse
Checked in analytics that site search URL generated about 4% of my total organic search traffic (<2% of sales). My reasoning is that site search may generate duplicated content issues and may prevent the more relevant product or category pages from showing up instead. Would you noindex this page or not? Any thoughts?0 -
Where to link to HTML Sitemap?
After searching this morning and finding unclear answers I decided to ask my SEOmoz friends a few questions. Should you have an HTML sitemap? If so, where should you link to the HTML sitemap from? Should you use a noindex, follow tag? Thank you
Intermediate & Advanced SEO | | cprodigy290 -
Franchise sites on subdomains
I've been asked by a client to optimise a a webpage for a location i.e. London. Turns out that the location is actually a franchise of the main company. When the company launch a new franchise, so far they have simply added a new page to the main site, for example: mysite.co.uk/sub-folder/london They have so far done this for 10 or so franchises and task someone with optimising that page for their main keyword + location. I think I know the answer to this, but would like to get a back up / additional info on it in terms of ranking / seo benefits. I am going to suggest the idea of using a subdomain for each location, example: london.mysite.co.uk Would this be the correct approach. If you think yes, why? Many thanks,
Intermediate & Advanced SEO | | Webrevolve0 -
Does Google crawl the pages which are generated via the site's search box queries?
For example, if I search for an 'x' item in a site's search box and if the site displays a list of results based on the query, would that page be crawled? I am asking this question because this would be a URL that is non existent on the site and hence am confused as to whether Google bots would be able to find it.
Intermediate & Advanced SEO | | pulseseo0