Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Sitemaps. When compressed do you use the .gz file format or the (untidy looking, IMHO) .xml.gz format?
-
When submitting compressed sitemaps to Google I normally use the a file named sitemap.gz
A customer is banging on that his web guy says that sitemap.xml.gz is a better format.
Google spiders sitemap.gz just fine and in Webmaster Tools everything looks OK...
Interested to know other SEOmoz Pro's preferences here and also to check I haven't made an error that is going to bite me in the ass soon!
Over to you.
-
Thanks Big Bazza... I like the 'better' vs 'accepted' reasoning. Not too confrontational

-
Generally the .xml.gz format is the one stated in examples there are a few references to this here : http://www.sitemaps.org/protocol.php#index
Most sitemap generators that create both compressed and uncompressed sitemap files name them sitemap.xml and sitemap.xml.gz respectively. It also makes it clearer what the content of the zipped file is. I don't believe it is essential however, as you will direct tools such as google.com/webmasters to your xml sitemap - rather than expect it to find it of its own accord.
I always use the .xml.gz format when compressing. I would argue that (if both formats work) neither one is 'BETTER' than the other, rather one is more ACCEPTED than the other.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How important is the file extension in the URL for images?
I know that descriptive image file names are important for SEO. But how important is it to include .png, .jpg, .gif (or whatever file extension) in the url path? i.e. https://example.com/images/golden-retriever vs. https://example.com/images/golden-retriever.jpg Furthermore, since you can set the filename in the Content-Disposition response header, is there any need to include the descriptive filename in the URL path? Since I'm pulling most of our images from a database, it'd be much simpler to not care about simulating a filename, and just reference an image id in my templates. Example: 1. Browser requests GET /images/123456
Intermediate & Advanced SEO | | dsbud
2. Server responds with image setting both Content-Disposition, and Link (canonical) headers Content-Disposition: inline; filename="golden-retriever"
Link: <https: 123456="" example.com="" images="">; rel="canonical"</https:>1 -
Should I use meta noindex and robots.txt disallow?
Hi, we have an alternate "list view" version of every one of our search results pages The list view has its own URL, indicated by a URL parameter I'm concerned about wasting our crawl budget on all these list view pages, which effectively doubles the amount of pages that need crawling When they were first launched, I had the noindex meta tag be placed on all list view pages, but I'm concerned that they are still being crawled Should I therefore go ahead and also apply a robots.txt disallow on that parameter to ensure that no crawling occurs? Or, will Googlebot/Bingbot also stop crawling that page over time? I assume that noindex still means "crawl"... Thanks 🙂
Intermediate & Advanced SEO | | ntcma0 -
Using disavow tool for 404s
Hey Community, Got a question about the disavow tool for you. My site is getting thousands of 404 errors from old blog/coupon/you name it sites linking to our old URL structure (which used underscores and ended in .jsp). It seems like the webmasters of these sites aren't answering back or haven't updated their sites in ages so it's returning 404 errors. If I disavow these domains and/or links will it clear out these 404 errors in Google? I read the GWT help page on it, but it didn't seem to answer this question. Feel free to ask any questions that may help you understand the issue more. Thanks for your help,
Intermediate & Advanced SEO | | IceIcebaby
-Reed0 -
Best server-side sitemap generators
I've been looking into sitemap generators recently and have got a good knowledge of what creating a sitemap for a small website of below 500 URLs involves. I have successfully generated a sitemap for a very small site, but I’m trying to work out the best way of crawling a large site with millions of URLs. I’ve decided that the best way to crawl such a large number of URLs is to use a server side sitemap, but this is an area that doesn’t seem to be covered in detail on SEO blogs / forums. Could anyone recommend a good server side sitemap generator? What do you think of the automated offerings from Google and Bing? I’ve found a list of server side sitemap generators from Google, but I can’t see any way to choose between them. I realise that a lot will depend on the type of technologies we use server side, but I'm afraid that I don't know them at this time.
Intermediate & Advanced SEO | | RG_SEO0 -
Should canonical links be included or excluded in a sitemap?
Our company is in the process of updating our sitemap. Should we include or exclude canonical links.
Intermediate & Advanced SEO | | WebRiverGroup0 -
When using ALT tags - are spaces, hyphens or underscores preferred by Google when using multiple words?
when plugging ALT tags into images, does Google prefer spaces, hyphens, or underscores? I know with filenames, hyphens or underscores are preferred and spaces are replaced with %20. Thoughts? Thanks!
Intermediate & Advanced SEO | | BrooklynCruiser3 -
Paging. is it better to use noindex, follow
Is it better to use the robots meta noindex, follow tag for paging, (page 2, page 3) of Category Pages which lists items within each category or just let Google index these pages Before Panda I was not using noindex because I figured if page 2 is in Google's index then the items on page 2 are more likely to be in Google's index. Also then each item has an internal link So after I got hit by panda, I'm thinking well page 2 has no unique content only a list of links with a short excerpt from each item which can be found on each items page so it's not unique content, maybe that contributed to Panda penalty. So I place the meta tag noindex, follow on every page 2,3 for each category page. Page 1 of each category page has a short introduction so i hope that it is enough to make it "thick" content (is that a word :-)) My visitors don't want long introductions, it hurts bounce rate and time on site. Now I'm wondering if that is common practice and if items on page 2 are less likely to be indexed since they have no internal links from an indexed page Thanks!
Intermediate & Advanced SEO | | donthe0 -
All page files in root? Or to use directories?
We have thousands of pages on our website; news articles, forum topics, download pages... etc - and at present they all reside in the root of the domain /. For example: /aosta-valley-i6816.html
Intermediate & Advanced SEO | | Peter264
/flight-sim-concorde-d1101.html
/what-is-best-addon-t3360.html We are considering moving over to a new URL system where we use directories. For example, the above URLs would be the following: /images/aosta-valley-i6816.html
/downloads/flight-sim-concorde-d1101.html
/forums/what-is-best-addon-t3360.html Would we have any benefit in using directories for SEO purposes? Would our current system perhaps mean too many files in the root / flagging as spammy? Would it be even better to use the following system which removes file endings completely and suggests each page is a directory: /images/aosta-valley/6816/
/downloads/flight-sim-concorde/1101/
/forums/what-is-best-addon/3360/ If so, what would be better: /images/aosta-valley/6816/ or /images/6816/aosta-valley/ Just looking for some clarity to our problem! Thank you for your help guys!0