XML sitemap generator only crawling 20% of my site
-
Hi guys,
I am trying to submit the most recent XML sitemap but the sitemap generator tools are only crawling about 20% of my site. The site carries around 150 pages and only 37 show up on tools like xml-sitemaps.com. My goal is to get all the important URLs we care about into the XML sitemap.
How should I go about this?
Thanks
-
I believe it's not a significant issue if the sitemap encompasses the core framework of your website. As long as the sitemap is well-organized, omitting a few internal pages is acceptable since Googlebot will crawl all pages based on the sitemap. Take a look at the <a href="https://convowear.in">example page</a> that also excludes some pages, yet it doesn't impact the site crawler's functionality.
-
Yes, Yoast on WordPress works fine for sitemap generation. I would also recommend it; I use it on all of my blog sites.
-
If you are using WordPress, I would recommend the Yoast plugin. It generates the sitemap automatically and keeps it updated. I also use it on my blog.
-
I'm using the Yoast SEO plugin for my website. It generates the sitemap automatically.
-
My new waterproof tent reviews blog is facing the same crawling problem. How can I fix that?
-
Use Yoast or Rank Math to fix it.
SEO training in Isfahan https://faneseo.com/seo-training-in-isfahan/
-
Patrick wrote a list of reasons why Screaming Frog might not be crawling certain pages here: https://moz.com/community/q/screamingfrog-won-t-crawl-my-site#reply_300029.
Hopefully that list can help you figure out your site's specific issue.
-
This doesn't really answer my question of why I am not able to get all links into the XML sitemap when using xml sitemap generators.
-
I think it's not a big deal if the sitemap covers the main structure of your site. If your sitemap is well structured, then missing some internal pages is acceptable, because Googlebot will crawl your pages based on your sitemap. You can see the following page, which also doesn't cover all of its pages, yet there is no effect on how the site is crawled.
-
Thanks Boyd, but unfortunately I am still missing a good chunk of URLs here and I am wondering why. Do these generators rely on internal links in order to find pages?
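For what it's worth, crawler-based generators generally discover pages by starting at one URL and following the links they find, so a page that nothing links to is never reached. Below is a minimal sketch of that idea. The link graph `site_links`, the paths in it, and the function name `reachable_pages` are all made up for illustration; this is not how xml-sitemaps.com or any specific tool is implemented.

```python
from collections import deque


def reachable_pages(links, start):
    """Return the set of pages reachable from `start` by following links.

    `links` maps each page to the list of pages it links to.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical site: /orphan exists, but no other page links to it.
site_links = {
    "/": ["/about", "/products"],
    "/about": ["/"],
    "/products": ["/products/tent"],
    "/products/tent": [],
    "/orphan": ["/"],
}

found = reachable_pages(site_links, "/")
missed = sorted(set(site_links) - found)
print(missed)  # ['/orphan']
```

If the missing URLs on a real site turn out to be pages like `/orphan` here, adding internal links to them (or listing them in the sitemap by hand) is the usual way to get them picked up.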
-
Use Screaming Frog to crawl your site. It is free to download the software and you can use the free version to crawl up to 500 URLs.
After it crawls your site you can click on the Sitemaps tab and generate an XML sitemap file to use.
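If a crawler still misses pages, another route is to build the file from a URL list you already have (for example, an export from your CMS). The following is a rough sketch using only Python's standard library. The function name `build_sitemap` and the example.com URLs are placeholders, and only the required `loc` element of the sitemap protocol is written.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        url_el = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as "&" in the URL text.
        ET.SubElement(url_el, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Placeholder URLs; replace with the pages you actually want indexed.
pages = [
    "https://www.example.com/",
    "https://www.example.com/products?type=tent&color=green",
]
sitemap_xml = build_sitemap(pages)

with open("sitemap.xml", "w", encoding="utf-8") as fh:
    fh.write(sitemap_xml)
```

Because the list is supplied by you rather than discovered by a crawl, every URL you care about ends up in the file regardless of how the site is linked internally.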
Related Questions
-
Site Migration Question
Hi guys, I am preparing for a pretty standard site migration: a small business website moving to a new domain, new branding and a new CMS. Pretty much a perfect storm. Right now the new website is being designed and will need another month; however, the client is pretty antsy to get her new brand out over the web. We cannot change the current site, which has the old branding. She wants to start passing out business cards and hanging banners with the new domain and brand. However, I don't want to be messing with any redirects and potentially screw up a clean migration from the old site to the new. To be specific, she wants to redirect the new domain to the current domain and then, when the new site launches, flip the redirect. I'm a little apprehensive about that because a site migration from the current to the new is already so intricate, and I don't want to leave any possibility of error. I'm trying to figure out the best solution. These are the 2 options I am thinking of:
1. DO NOT market the new domain. Reprint all marketing material and wait until the new domain is up, then start marketing it. (At cost to client)
2. Create a one-pager on the new domain saying the site is being built, with a nofollow link to the current site. No redirects added, just the nofollow link.
I'd like option 2 so that the client could start passing out material, but my number one concern is messing with any part of the migration. We are about to submit a sitemap index to Google Search Console for the current site, so we are just starting the site migration. What do you guys think?
Intermediate & Advanced SEO | | Khoo0 -
Sitemap generator which only includes canonical urls
Does anyone know of a 3rd party sitemap generator that will only include the canonical url's? Creating a sitemap with geo and sorting based parameters isn't the most ideal way to generate sitemaps. Please let me know if anyone has any ideas. Mind you we have hundreds of thousands of indexed url's and this can't be done with a simple text editor.
Intermediate & Advanced SEO | | recbrands0 -
International Site Migration
Hi guys, We are in the process of launching an international ecommerce site (Magento CMS) for two different countries (Australia and the US), and will later expand to other countries like the UK, Canada, etc. The plan is for each country to have its own sub-folder, e.g. www.domain.com/us, www.domain.com.au/au, www.domain.com.au/uk. A lot of the content between these English-speaking countries is the same, e.g. the same product descriptions. So in order to prevent duplication, from what I've read we will need to add hreflang tags to every single page on the site? So for: Australian pages: United States pages: Just wanted to make sure this is the correct strategy (will hreflang prevent duplicate content issues?) and whether there is anything else I should be considering. Thank you, Chris
Intermediate & Advanced SEO | | jayoliverwright0 -
Only the mobile version of the site is being indexed
We've got an interesting situation going on at the moment where a recently on-boarded client's site is being indexed and displayed, but it's the mobile version of the site that is showing in SERPs. A quick rundown of the situation: Retail shopping center with approximately 200 URLs. Mobile version of the site is www.mydomain.com/m/. XML sitemap submitted to Google with 202 URLs, 3 URLs indexed. Doing site:www.mydomain.com in a Google search brings up the home page (desktop version) and then everything else is /m/ versions. There is no rel="canonical" on mobile site pages pointing to their desktop counterparts (working on fixing that). We have limited CMS access, but developers are open to working with us on whatever is needed. Within the desktop site source code, there are no "noindex, nofollow, etc" issues on the pages. No manual actions, link issues, etc. Has anyone ever encountered this before? Any input or thoughts are appreciated. Thanks
Intermediate & Advanced SEO | | GregWalt0 -
SEO site Review
Does anyone have suggestions on places that provide in depth site / analytics reviews for SEO?
Intermediate & Advanced SEO | | Gordian0 -
XML question - not finding all of the pages
When I run http://www.xml-sitemaps.com/ on my site, it doesn't find all of my pages. The pages do not have any no follows in them (I thought that was the original problem). Has this happened to anyone else? What is the solution?
Intermediate & Advanced SEO | | digitalops0 -
Site dancing
Hi guys, I have a site whose rankings are dancing. I mean one day it is at position 20; if I add more backlinks it falls, and afterwards it rises again. I don't know what is going on. The site is 2 years old, PR 2, authority 35. Why is this happening? Usually when it appears again it ranks higher, but today it disappeared totally from the rankings. Maybe it will return tomorrow? But anyway, why is it dancing? Thanks
Intermediate & Advanced SEO | | nyanainc0 -
Working out exactly how Google is crawling my site if I have loooots of pages
I am trying to work out exactly how Google is crawling my site, including entry points and its path from there. The site has millions of pages and hundreds of thousands indexed. I have simple log files with a time stamp and the URL that Googlebot was on. Unfortunately there are hundreds of thousands of entries even for one day, and as it is a massive site I am finding it hard to work out the spider's paths. Is there any way, using the log files and Excel or other tools, to work this out simply? Also, I was expecting the bot to go almost instantaneously through each level, e.g. main page --> category page --> subcategory page (expecting the same time stamp), but this does not appear to be the case. Does the bot follow a path right through to the deepest level it can reach or is allowed to for that crawl, and then return to the higher-level category pages at a later time? Any help would be appreciated. Cheers
Intermediate & Advanced SEO | | soeren.hofmayer0