Sitemaps: Best Practice
-
What should and what shouldn't go in the sitemap?
In particular, pages like subscribe to our newsletter/ unsubscribe to our newsletter? Is there really any benefit in highlighting those pages to the SEs?
Thanks for any advice/ anecdotes
-
So, sometimes, people think adding a sitemap to their company website, is something thats very difficult to do.
for example, they may think they need a web designer to do this for them, yet often you can do it yourself, its very simple.
so if your business has a WordPress website, then it can be a piece of cake to add a site map.
If you use Yoast, its a free plugin, , you can add a site map very easily to your website, which you can then send to your site map to Google Search Console for indexing .
We did this for a large garden room company within the city of Bristol, and what happens is that it makes sure every single page and blog post is indexed.
-
Pages that I like to call 'core' site URLs should go in your sitemap. Basically, unique (canonical) pages which are not highly duplicate, which Google would wish to rank
I would include core addresses
I wouldn't include uploaded documents, installers, archives, resources (images, JS modules, CSS sheets, SWF objects), pagination URLs or parameter based children of canonical pages (e.g: example.com/some-page is ok to rank, but not example.com/some-page?tab=tab3). Parameters are additional funky stuff added to URLs following "?" or "&".
There are exceptions to these rules, some sites use parameters to render their on-page content - even for canonical addresses. Those old architecture types are fast dying out, though. If you're on WordPress I would index categories, but not tags which are non-hierarchical and messy (they really clutter up your SERPs)
Try crawling your site using Screaming Frog. Export all the URLs (or a large sample of them) into an Excel file. Filter the file, see which types of addresses exist on your site and which technologies are being used. Feed Google the unique, high-value pages that you know it should be ranking
I have said not to feed pagination URLs to Google, that doesn't mean they should be completely de-indexed. I just think that XML sitemaps should be pretty lean and streamlined. You can allow things which aren't in your XML sitemap to have a chance of indexation, but if you have used something like a Meta no-index tag or a robots.txt edit to block access to a page - **do not **then feed it to Google in your XML. Try to keep **all **of your indexation modules in line with each other!
No page which points to another, separate address via a canonical tag (thus calling itself 'non-canonical') should be in your XML sitemap. No page that is blocked via Meta no-index or Robots.txt should be in your sitemap.XML either
If you end up with too many pages, think about creating a sitemap XML index instead, which links through to other, separate sitemap files
Hope that helps!
-
To further on from this, we have some parameter urls in our sitemap which make me uneasy. should url.com/blah.html?option=1 be in the sitemap? If so, what benefit is that giving us?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
URL indexed but not submitted in sitemap, however the URL is in the sitemap
Dear Community, I have the following problem and would be super helpful if you guys would be able to help. Cheers Symptoms : On the search console, Google says that some of our old URLs are indexed but not submitted in sitemap However, those URLs are in the sitemap Also the sitemap as been successfully submitted. No error message Potential explanation : We have an automatic cache clearing process within the company once a day. In the sitemap, we use this as last modification date. Let's imagine url www.example.com/hello was modified last time in 2017. But because the cache is cleared daily, in the sitemap we will have last modified : yesterday, even if the content of the page did not changed since 2017. We have a Z after sitemap time, can it be that the bot does not understands the time format ? We have in the sitemap only http URL. And our HTTPS URLs are not in the sitemap What do you think?
Intermediate & Advanced SEO | | ZozoMe0 -
Best Format to Index a Large Data Set
Hello Moz, I've been working on a piece of content that has 2 large data sets I have organized into a table that I would like indexed and want to know the best way to code the data for search engines while still providing a good visual experience for users. I actually created the piece 3 times and am deciding on which format to go with and I would love your professional opinions. 1. HTML5 - all the data is coded using tags and contains all the data on page in the . This is the most straight forward method and I know this will get indexed; however, it is also the ugliest looking table and least functional. 2. Java - I used google charts and loaded all the data into a
Intermediate & Advanced SEO | | jwalker880 -
How Should We Best List Events Pages?
Hi everyone! Luke here from CHARGED.fm hoping that a brilliant mind could help me with another annoying (at least for me) technical seo question. It's about how we list the events on our ticketing site. Here's the rundown: We currently list tickets by event id, but our competitors keep the event page in the same silo and use the venue name and date of event in the url. So we do this: http://www.charged.fm/kinky-boots-tickets (disregard redirect for now) List the events where you can choose from these: http://www.charged.fm/event/tickets/2518362/kinky-boots
Intermediate & Advanced SEO | | keL.A.xT.o
http://www.charged.fm/event/tickets/2511448/kinky-boots Moz lists these as duplicate content, so we're wondering how to resolve this. We're also wondering if it would be benficial to keep the events page in the same silo like our competitors: http://www.vividseats.com/theatre/kinky-boots-tickets/kinky-boots-9-20-1537274.html (notice how they go /theatre/kinky-boots-tickets/event/) Would it be beneficial to list like this? Is it inconsequential? Could we leave things the way that they are or should we at least add the venue and date to the events page URL? Thanks a lot for any help,
Luke0 -
Best software or online tool to create Videos
Kindly let me know best software or online tool to create videos for website and youtube. I am also looking for animation videos etc.
Intermediate & Advanced SEO | | AlexanderWhite0 -
What are partial urls and why this is causing a sitemap error?
Hi mozzers, I have a client that recorded 7 errors when generating Xml sitemap. One of the errors appear to be coming from partial urls and apparently I would need to exclude them from sitemap. What are they exactly and why would they cause an error in the sitemap. Thanks!
Intermediate & Advanced SEO | | Ideas-Money-Art0 -
Urls missing from product_cat sitemap
I'm using Yoast SEO plugin to generate XML sitemaps on my e-commerce site (woocommerce). I recently changed the category structure and now only 25 of about 75 product categories are included. Is there a way to manually include urls or what is the best way to have them all indexed in the sitemap?
Intermediate & Advanced SEO | | kisen0 -
How to create XML sitemap for larger website?
We need to create XML sitemap for a website that has more than 2 million pages. Please suggest me the best software to create XML sitemap for the website. Since there are different strategies that larger websites submit sitemaps, let me know the best way to submit this sitemap for website of this size. Or Is there any tool provided by SEOmoz for XML sitemap generation for larger websites?
Intermediate & Advanced SEO | | DCISEO0 -
What is the best way to embed PDF documents for SEO?
I have been using SCRIBD to embed PDF documents on my site but until recently I did not include the link back to SCRIBD. Will my site get credit for this content or will it go to SCRIBD? Is there a better way to embed PDF documents for SEO?
Intermediate & Advanced SEO | | casper4340