XML sitemap generator only crawling 20% of my site
-
Hi guys,
I am trying to submit the most recent XML sitemap but the sitemap generator tools are only crawling about 20% of my site. The site carries around 150 pages and only 37 show up on tools like xml-sitemaps.com. My goal is to get all the important URLs we care about into the XML sitemap.
How should I go about this?
Thanks
-
I believe it's not a significant issue if the sitemap encompasses the core framework of your website. As long as the sitemap is well-organized, omitting a few internal pages is acceptable since Googlebot will crawl all pages based on the sitemap. Take a look at the <a href="https://convowear.in">example page</a> that also excludes some pages, yet it doesn't impact the site crawler's functionality.
-
Yes Yoast on WordPress works fine for sitemap generation. I would also recommend that. Using on all of my blog sites.
-
If you are using WordPress then I would recommend to use Yoast plugin. It generates sitemap automatically regularly. I am also using it on my blog.
-
I'm using Yoast SEO plugin for my website. It generates the Sitemap automatically.
-
My new waterproof tent reviews blog facing the crawling problem. How can I fix that?
-
use Yoast or rankmath ot fix it
آموزش سئو در اصفهان https://faneseo.com/seo-training-in-isfahan/
-
Patrick wrote a list of reasons why Screaming Frog might not be crawling certain pages here: https://moz.com/community/q/screamingfrog-won-t-crawl-my-site#reply_300029.
Hopefully that list can help you figure out your site's specific issue.
-
This doesn't really answer my question of why I am not able to get all links into the XML sitemap when using xml sitemap generators.
-
I think it's not a big deal if the sitemap covers the main structure of your site. If your sitemap is constructed in a really decent structure, then missing some internal pages are acceptable because Googlebot will crawl all of your pages based on your site map. You can see the following page which also doesn't cover all of its pages, but there's no influence in terms of site crawler.
-
Thanks Boyd but unfortunately I am still missing a good chunk of URLs here and I am wondering why? Do those check on internal links in order to find these pages?
-
Use Screaming Frog to crawl your site. It is free to download the software and you can use the free version to crawl up to 500 URLs.
After it crawls your site you can click on the Sitemaps tab and generate an XML sitemap file to use.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Mirror from my site
hi to all i find 2 site they do mirror my site and send back link to all my pages. do you thing its bad for my seo ?? my site is https://android-apk.org mirror sites | Who links the most | fryeboysent.com | 1,342,613 |
Intermediate & Advanced SEO | | moztabliq1
| ficyexp.cl | 934,654 | |0 -
Best Sitemap Generator XML
Hello Everyone, Can Anyone Suggest best Site map Generator Software??
Intermediate & Advanced SEO | | ieplnupur0 -
Transferring Domain and redirecting old site to new site and Having Issues - Please help
I have just completed a site redesign under a different domain and new wordpress woo commerce platform. The typical protocol is to just submit all the redirects via the .htaccess file on the current site and thereby tell google the new home of all your current pages on the new site so you maintain your link juice. This problem is my current site is hosted with network solutions and they do not allow access to the .htaccess file and there is no way to redirect the pages they say other than a script they can employ to push all pages of the old site to the new home page of the new site. This is of course bad for seo so not a solution. They did mention they could also write a script for the home page to redirect just it to the new home page then place a script of every individual page redirecting each of those. Does this sound like something plausible? Noone at network solutions has really been able to give me a straight answer. That being said i have discussed with a few developers and they mentioned a workaround process to avoid the above: “The only thing I can think of is.. point both domains (www.islesurfboards.com & www.islesurfandsup.com) to the new store, and 301 there? If you kept WooCommerce, Wordpress has plugins to 301 pages. So maybe use A record or CName for the old URL to the new URL/IP, then use htaccess to redirect the old domain to the new domain, then when that comes through to the new store, setup 301's there for pages? Example ... http://www.islesurfboards.com points to http://www.islesurfandsup.com ... then when the site sees http://www.islesurfboards.com, htaccess 301's to http://www.islesurfandsup.com.. then wordpress uses 301 plugin for the pages? Not 100% sure if this is the best way... but might work." Can anyone confirm this process will work or suggest anything else to redirect my current site on network solutions to my new site withe new domain and maintain the redirects and seo power. My domain www.islesurfboards.com has been around for 10 years so dont just want to flush the link juice down the toilet and want to redirect everything correctly.
Intermediate & Advanced SEO | | isle_surf0 -
B2B site targeting 20,000 companies with 20,000 dedicated "target company pages" on own website.
An energy company I'm working with has decided to target 20,000 odd companies on their own b2b website, by producing a new dedicated page per target company on their website - each page including unique copy and a sales proposition (20,000 odd new pages to optimize! Yikes!). I've never come across such an approach before... what might be the SEO pitfalls (other than that's a helluva number of pages to optimize!). Any thoughts would be very welcome.
Intermediate & Advanced SEO | | McTaggart0 -
Linking from a corporate site to a brand site.
Is there an SEO impact to a large corporation linking from a corporate and/or a divisional site to a specific brand site with it's own top level domain? We would like to keep the traffic coming, but not if it will be seen as a black hat tactic. My guess is that Google will be smart enough to see that the corporation owns the brand and at least not penalize us, but I am wondering if anyone else has this experience? Google Analytics is calling it self-referral.
Intermediate & Advanced SEO | | mrbobland0 -
Google Processing but Not Indexing XML Sitemap
Like it says above, Google is processing but not indexing our latest XML sitemap. I noticed this Monday afternoon - Indexed status was still Pending - and didn't think anything of it. But when it still said Pending on Tuesday, it seemed strange. I deleted and resubmitted our XML sitemap on Tuesday. It now shows that it was processed on Tuesday, but the Indexed status is still Pending. I've never seen this much of a lag, hence the concern. Our site IS indexed in Google - it shows up with a site:xxxx.com search with the same number of pages as it always has. The only thing I can see that triggered this is Sunday the site failed verification via Google, but we quickly fixed that and re-verified via WMT Monday morning. Anyone know what's going on?
Intermediate & Advanced SEO | | Kingof50 -
Troubled QA Platform - Site Map vs Site Structure
I'm running a Q&A forum that was built prioritizing UX over SEO. This decision has cause a bit of a headache as we're 6 months into the project with 2278 Q&A pages with extremely minimal traffic coming from search engines. The structure has the following hiccups: A. The category navigation from the main Q&A page is entirely javascript and only navigable by users. B. We identify Google bots and send them to another version of the Q&A platform w/o javascript. Category links don't exist in this google bot version of the main Q&A page. On this Google version of the main Q&A page, the Pinterest-like tiles displaying individual Q&As are capped at 10. This means that the only way google bot can identify link juice being passed down to individual QAs (after we've directed them to this page) is through 10 random Q&As. C. All 2278 of the QAs are currently indexed in search. They are just indexed very very poorly in SERPs. My personal assumption, is that Google can't pass link juice to any of the Q&As (poor SERP) but registers them from the site map so it gets included in Google's index. My dilemma has me struggling between two different decisions: 1. Update the navigation in the header to remove the javascript and fundamentally change the look and feel of the Q&A platform. This will allow Google bot to navigate through Expert category links to pass link juice to all Q&As. or 2. Update the redirected main Q&A page to include hard coded category links with 100s of hard coded Q&As under each category page. Make it similar, ugly, flat and efficient for the crawling bots. Any suggestions would be greatly appreciated. I need to find a solution as soon as possible.
Intermediate & Advanced SEO | | TQContent0 -
Ranking a site in the USA
I'm UK based and looking at setting up a site to rank in the USA. As I understand it a .com TLD is best but these are used worldwide so do I simply need to set the geotargeting to USA in webmaster tools? Or is there a better domain to use? With hosting the site in US and on page content related to US cities (I plan to create a page for each US city I operate in the the city name in the H1 tag) will that be enough for google to understand that the page should rank in the US version of google. Also how can I view Google USA search results - when I go to google.com it automatically redirects to google.co.uk and I can only change the location on the left hand side to UK cities. Any help much appreciated!
Intermediate & Advanced SEO | | SamCUK0