Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Should XML sitemaps include *all* pages or just the deeper ones?
-
Hi guys,
Ok this is a bit of a sitemap 101 question but I cant find a definitive answer:
When we're running out XML sitemaps for google to chew on (we're talking ecommerce and directory sites with many pages inside sub-categories here) is there any point in mentioning the homepage or even the second level pages? We know google is crawling and indexing those and we're thinking we should trim the fat and just send a map of the bottom level pages.
What do you think?
-
It is correct that DA, PA, depth of pages, etc. are all factors in determining which pages get indexed. If your site offers good navigation, reasonable backlinks, anchor text, etc then you can get close to all pages indexed even on a very large site.
Your site map should naturally include a date on every link which indicates when content was added or changed. Even if you submit a 10k list of links, Google can evaluate the dates on each link and determine which content has been added or modified since your site was last crawled.
-
Well yes, that's kinda my point. We do have a sensible, crawlable navigation so there will be no problems there, so then the sitemap really becomes an indicator of what needs to be crawled (new and updated pages), but then the same question stands...
With other sites we've managed with thousands of pages we've found it detrimental to give Google hundreds of pages to crawl on a sitemap that we don't feel are important. We're pretty sure (and SEOmoz staff have supported this) that domain authority and the number of pages you can get into the index are closely related.
-
Tim,
We always index ALL pages...the help tip on Google XML also suggests including all pages of your site in the XML sitemap.
-
Your sitemap should include every page of your site that you wish to be indexed.
The idea is that if your site does not provide crawlable navigation, Google can use your sitemap to crawl your site. There are some sites that use flash and when a crawler lands on a page there is absolutely no where for the crawler to go.
If your site navigation is solid then a sitemap doesn't offer any value to Google other then an indicator of when content is updated or added.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
.xml sitemap showing in SERP
Our sitemap is showing in Google's SERP. While it's only for very specific queries that don't seem to have much value (it's a healthcare website and when a doctor who isn't with us is search with the brand name so 'John Smith Brand,' it shows if there's a first or last name that matches the query), is there a way to not make the sitemap indexed so it's not showing in the SERP. I've seen the "x-robots-tag: noindex" as a possible option, but before taking any action wanted to see if this was still true and if it would work.
Technical SEO | | Kyleroe950 -
Getting high priority issue for our xxx.com and xxx.com/home as duplicate pages and duplicate page titles can't seem to find anything that needs to be corrected, what might I be missing?
I am getting high priority issue for our xxx.com and xxx.com/home as reporting both duplicate pages and duplicate page titles on crawl results, I can't seem to find anything that needs to be corrected, what am I be missing? Has anyone else had a similar issue, how was it corrected?
Technical SEO | | tgwebmaster0 -
When creating parent and child pages should key words be repeated in url and page title?
We are in the direct mail advertising business: PrintLabelAndMail.com Example: Parent:
Technical SEO | | JimDirectMailCoach
Postcard Direct Mail Children:
Postcard Mailings
Postcard Design
Postcard Samples
Postcard Pricing
Postcard Advantages should "postcard" be repeated in the URL and Page Title? and in this example should each of the 5 children link back directly to the parent or would it be better to "daisy chain" them using each as parent for the next?0 -
Can you noindex a page, but still index an image on that page?
If a blog is centered around visual images, and we have specific pages with high quality content that we plan to index and drive our traffic, but we have many pages with our images...what is the best way to go about getting these images indexed? We want to noindex all the pages with just images because they are thin content... Can you noindex,follow a page, but still index the images on that page? Please explain how to go about this concept.....
Technical SEO | | WebServiceConsulting.com0 -
Is there a way for me to automatically download a website's sitemap.xml every month?
From now on we want to store all our sitemap.xml over the next years. Its a nice archive to have that allows us to analyse how many pages we have on our website and which ones were removed/redirected. Any suggestions? Thanks
Technical SEO | | DeptAgency0 -
Two different canonical tags on one page
Due to an error, some of my pages now have two canonical tags on them. One is correct and the other goes to a nonsense URL (404 page). I know I should ideally remove the incorrect ones, but it's a big manual job. Are they doing any harm? Can I just leave them there and let Google figure it out? The correct ones are higher up in the code. Will this make a difference? Any help appreciated.
Technical SEO | | ShearingsGroup0 -
Do we need to manually submit a sitemap every time, or can we host it on our site as /sitemap and Google will see & crawl it?
I realized we don't have a sitemap in place, so we're going to get one built. Once we do, I'll submit it manually to Google via Webmaster tools. However, we have a very dynamic site with content constantly being added. Will I need to keep manually re-submitting the sitemap to Google? Or could we have the continually updating sitemap live on our site at /sitemap and the crawlers will just pick it up from there? I noticed this is what SEOmoz does at http://www.seomoz.org/sitemap.
Technical SEO | | askotzko0 -
Sitmap Page - HTML and XML
Hi there I have a domain which has a sitemap in html for regular users and a sitemap in xml for the spiders. I have a warning via seomoz saying that i have too many links on the html version. What do i do here? regards Stef
Technical SEO | | stefanok0