Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Should sitemap include https pages?
-
Hi guys,
Trying to figure out some onsite issues I've been having. Would appreciate any feedback on the following 2 questions:
My homepage (http://mysite.com) is a 301 redirect to https://mysite.com, which is under SSL. Only 2 pages of my site are https, the rest are http.
-
Should my sitemap be located at https://mysite.com/sitemap.xml, or should it stay on http (even though the homepage redirects to https)?
-
Should my sitemap include the https pages (only 2 pages) as well as the http?
Thanks,
G
-
-
Hi Frederico,
On the Google Sitemaps Errors help page, they include the following information:
"You should also check that the URLs all begin with the same domain as your Sitemap location. For instance, if your Sitemap is listed under http://www.example.com/sitemap.xml, the following URLs are not valid for that Sitemap:
http://www.google.com - it's in the google.com domain rather than the example.com domain
http://example.com/ - it's missing the initial www
www.example.com/ - it's missing the protocol (http), and will generate an Invalid URL warning
https://www.example.com/ - it's using a different protocol (https rather than http)
Any URLs in the Sitemap that are not denied are processed normally."
This leads me to understand that Google doesn't want you to put http URLs in an https sitemap, or vice versa. What makes you believe otherwise?
Hoping to get to the bottom of this - thanks for the ongoing feedback
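To make the quoted rule concrete, here is a minimal Python sketch that flags URLs whose protocol or host differs from the sitemap's own location. The function name `mismatches` and the reason labels are made up for illustration, and the check only mirrors the wording of the help page quoted above; it says nothing about how Google treats such URLs today.

```python
from urllib.parse import urlsplit


def mismatches(sitemap_url, page_urls):
    """Return (url, reason) pairs for URLs that differ from the sitemap's scheme or host.

    Hypothetical helper: it encodes only the rule as quoted from the help page.
    """
    sitemap = urlsplit(sitemap_url)
    problems = []
    for url in page_urls:
        parts = urlsplit(url)
        if not parts.scheme:
            # e.g. "www.example.com/" has no http:// or https:// prefix
            problems.append((url, "missing protocol"))
        elif parts.netloc.lower() != sitemap.netloc.lower():
            # covers both a different domain and a missing "www"
            problems.append((url, "different host"))
        elif parts.scheme != sitemap.scheme:
            problems.append((url, "different protocol"))
    return problems


if __name__ == "__main__":
    for url, reason in mismatches(
        "http://www.example.com/sitemap.xml",
        [
            "http://www.google.com",
            "http://example.com/",
            "www.example.com/",
            "https://www.example.com/",
        ],
    ):
        print(url, "->", reason)
```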
-
Those suggesting not to add the SSL pages to the HTTP sitemap are relying on information from back in 2007, when Google did indeed show an error on sitemaps listing both HTTP and HTTPS pages, because they were recognized as different domains. Those days are long gone. Google has evolved and can now handle sitemaps with both HTTP and HTTPS pages just fine.
-
Thanks for the input Frederico. I've been receiving various different answers to this question.
Most responses have said that we should submit 2 sitemaps: 1 sitemap listed under http that only includes the http pages of the site (which means we wouldn't include our homepage since it's under https!!!).
And 1 sitemap listed on the https version which only includes the https pages (which is only 2 pages!).
To be honest, I still don't know what to do here. Really frustrating that there is no clear cut answer to our situation, which I can't believe is even that unique.
-
G,
It wouldn't make any difference whether you serve the sitemap over HTTP or HTTPS. As for having http and https pages within the same sitemap, that isn't a problem either.
The only reason I can find for creating multiple sitemaps is for HTML pages, images or videos that do require separate sitemaps.
Does your site use PHP? If yes, I suggest you test xml-sitemaps.com, which will create the full sitemap for you. If you have a dynamic site, then I suggest getting their commercial version. I've been using it for over 7 years, I think, and I always get a copy for each site I create. They also offer lots of extras in case you need them (news sitemaps, etc).
-
Hey Federico,
Thanks again for the insight - much appreciated.
So there's no problem for us to create a sitemap that has the https homepage and then the rest of the pages in http? From reading previous Q&As on this topic it seems as though people felt you shouldn't have https and http pages under the same sitemap - I am a novice here so that's why I'm just looking for advice.
Is there any reason why we would need to have the two sitemaps available - as in, why wouldn't we just remove the old http sitemap (that didn't include the https homepage) and just go with the https homepage sitemap?
I just wanted to make sure I understood your response before we take action.
Cheers,
-G
-
Hey G!
You can serve your sitemap in both versions; that won't be a problem and won't trigger a duplicate content issue. So you are safe both ways.
As for the second question: Yes, you should, unless you don't want your pages indexed (whether HTTP or HTTPS). I think I saw your site before, and if I remember correctly you had your homepage and login script under SSL, right? Then you should definitely include your homepage in the sitemap, but you can leave the login script out, as you don't need it indexed and Google won't index it anyway.
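As a rough illustration of that advice, the Python sketch below builds one sitemap that lists the https homepage alongside http pages and skips the login script. The function `build_sitemap`, the `/about` page, and the `/login.php` path are placeholders invented for the example; only `mysite.com` comes from the question.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls, exclude=()):
    """Return sitemap XML listing every URL in `urls` that is not in `exclude`."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        if url in exclude:
            continue  # e.g. a login script that does not need indexing
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    pages = [
        "https://mysite.com/",           # homepage, served over SSL
        "http://mysite.com/about",       # hypothetical http page
        "https://mysite.com/login.php",  # hypothetical login script
    ]
    print(build_sitemap(pages, exclude={"https://mysite.com/login.php"}))
```

A real sitemap file would normally also start with an XML declaration, which this sketch leaves out.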
Once you have your sitemap ready, consider adding its location to your robots.txt file, like this:
User-agent: *
Sitemap: http://[your website address here]/sitemap.xml
Hope that helps!
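One way to confirm that the Sitemap line is picked up is to parse the robots file with Python's standard `urllib.robotparser` module. This is only a sketch: `sitemaps_in_robots` is a made-up helper name, the robots.txt text is passed in as a string rather than fetched from the site, and `site_maps()` needs Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_in_robots(robots_txt):
    """Return the sitemap URLs declared in the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    example = "User-agent: *\nSitemap: http://mysite.com/sitemap.xml\n"
    print(sitemaps_in_robots(example))
```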