XML sitemap generator only crawling 20% of my site
-
Hi guys,
I am trying to submit the most recent XML sitemap but the sitemap generator tools are only crawling about 20% of my site. The site carries around 150 pages and only 37 show up on tools like xml-sitemaps.com. My goal is to get all the important URLs we care about into the XML sitemap.
How should I go about this?
Thanks
-
I believe it's not a significant issue if the sitemap encompasses the core framework of your website. As long as the sitemap is well-organized, omitting a few internal pages is acceptable since Googlebot will crawl all pages based on the sitemap. Take a look at this example page (https://convowear.in), which also excludes some pages, yet it doesn't impact the site crawler's functionality.
-
Yes, Yoast on WordPress works fine for sitemap generation. I would also recommend it. I'm using it on all of my blog sites.
-
If you are using WordPress, then I would recommend using the Yoast plugin. It generates the sitemap automatically and regenerates it regularly. I am also using it on my blog.
-
I'm using the Yoast SEO plugin for my website. It generates the sitemap automatically.
-
My new waterproof tent reviews blog is facing the same crawling problem. How can I fix that?
-
Use Yoast or Rank Math to fix it.
SEO training in Isfahan: https://faneseo.com/seo-training-in-isfahan/
-
Patrick wrote a list of reasons why Screaming Frog might not be crawling certain pages here: https://moz.com/community/q/screamingfrog-won-t-crawl-my-site#reply_300029.
Hopefully that list can help you figure out your site's specific issue.
-
This doesn't really answer my question of why I am not able to get all links into the XML sitemap when using xml sitemap generators.
-
I think it's not a big deal if the sitemap covers the main structure of your site. If your sitemap is well structured, then missing some internal pages is acceptable because Googlebot will crawl all of your pages based on your sitemap. You can see the following page, which also doesn't cover all of its pages, yet that has no effect on how the site is crawled.
-
Thanks Boyd, but unfortunately I am still missing a good chunk of URLs here and I am wondering why. Do these generators rely on internal links in order to find pages?
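On the internal-links question: crawler-based generators typically start from one URL and follow the links they find, so a page that nothing links to is never reached. Here is a minimal sketch of that behaviour, not the actual code of any tool mentioned in this thread. The `discover` function, the `fetch` callable and the three-page `SITE` dictionary are all made up for illustration, and no network access is involved.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href values of <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def discover(start_url, fetch):
    """Breadth-first crawl from start_url, following same-host links only.

    `fetch` is a hypothetical callable: URL -> HTML string, or None
    when the page cannot be retrieved.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}          # URLs already queued, to avoid loops
    queue = deque([start_url])
    pages = []                  # URLs that were actually fetched
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.hrefs:
            # Resolve relative links and drop any #fragment part.
            link, _fragment = urldefrag(urljoin(url, href))
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return sorted(pages)


# Made-up site: /orphan exists, but no other page links to it.
SITE = {
    "https://example.com/": '<a href="/about">About</a>',
    "https://example.com/about": '<a href="/">Home</a>',
    "https://example.com/orphan": "<p>No inbound links</p>",
}

found = discover("https://example.com/", SITE.get)
missed = sorted(set(SITE) - set(found))
print("found:", found)
print("missed:", missed)
```

In this toy setup the crawl reaches the homepage and /about but never /orphan. If the missing URLs on a real site are only reachable through site search, forms or script-generated links, a link-following generator could miss them in the same way.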
-
Use Screaming Frog to crawl your site. It is free to download the software and you can use the free version to crawl up to 500 URLs.
After it crawls your site you can click on the Sitemaps tab and generate an XML sitemap file to use.
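If a crawler still leaves out URLs you know exist, the sitemap file can also be written from your own list of URLs. A minimal sketch using only the Python standard library follows; `build_sitemap` and the example.com URLs are hypothetical, while the namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace required by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML as a string, with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in sorted(set(urls)):  # de-duplicate and keep a stable order
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url  # special characters are escaped
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical URL list, e.g. a crawl export merged with pages you added by hand.
xml_text = build_sitemap([
    "https://example.com/",
    "https://example.com/about",
    "https://example.com/orphan",
    "https://example.com/about",
])

# Save as UTF-8 to match the XML declaration, then upload and submit the file.
with open("sitemap.xml", "w", encoding="utf-8") as handle:
    handle.write(xml_text)
```

Building the file from a list keeps the sitemap independent of what any single crawl happened to reach, at the cost of having to maintain that list yourself.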