Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Should sitemap include https pages?
-
Hi guys,
Trying to figure out some onsite issues I've been having. Would appreciate any feedback on the following 2 questions:
My homepage (http://mysite.com) is a 301 redirect to https://mysite.com, which is under SSL. Only 2 pages of my site are https, the rest are http.
-
Should the directory of my sitemap be https://mysite.com/sitemap.xml or should it be kept with http (even though the redirected homepage is to https)?
-
Should my sitemap include the https pages (only 2 pages) as well as the http?
Thanks,
G
-
-
Hi Frederico,
On the google Sitemaps Errors help page, they include the following information:
"You should also check that the URLs all begin with the same domain as your Sitemap location. For instance, if your Sitemap is listed under http://www.example.com/sitemap.xml, the following URLs are not valid for that Sitemap:
http://www.google.com
— it's in the google.com domain rather than the example.com domainhttp://example.com/
— it's missing the initialwww
www.example.com/
— it's missing the protocol (http), and will generate an Invalid URL warninghttps://www.example.com/
— it's using a different protocol (https
rather thanhttp
)
Any URLs in the Sitemap that are not denied are processed normally."
This leads me to understand that Google don't want you to put http urls in an https sitemap and also vice-versa. What makes you believe otherwise??
Hoping to get to the bottom of this - thanks for the ongoing feedback
-
Those suggesting not to add the SSL pages to the HTTP sitemap are using data back from 2007, when indeed Google showed an error on those sitemaps listing both HTTP and HTTPS pages as they were being recognized as different domains. Those days are long gone. Google had evolved and can now handle sitemaps with both HTTP and HTTPS pages just fine.
-
Thanks for the input Frederico. I've been receiving various different answers to this question.
Most responses have said that we should submit 2 sitemaps: 1 sitemap listed under http that only includes the http pages of the site (which means we wouldn't include our homepage since it's under https!!!).
And 1 sitemap listed on the https version which only includes the https pages (which is only 2 pages!).
To be honest, I still don't know what to do here. Really frustrating that there is no clear cut answer to our situation, which I can't believe is even that unique.
-
G,
It wouldn't do any difference to serve the sitemap over HTTP or HTTPS. As for the http and https pages within the same sitemap, it isn't a problem either.
The only reason I can find for creating multiple sitemaps is for HTML pages, images or videos that do require separate sitemaps.
Does you site uses PHP? If yes, I suggest you test xml-sitemaps.com and it will create the full sitemap for you. If you have a dynamic site, then I suggest getting their commercial version. I've been using it for over 7 years I think and I always get a copy for each site I create. And they offer lots of extras in case you need them (news sitemaps, etc).
-
Hey Federico,
Thanks again for the insight - much appreciated.
So there's no problem for us to create a sitemap that has the https homepage and then the rest of the pages in http? From reading previous Q&As on this topic it seems as though people felt you shouldn't have https and http pages under the same sitemap - I am a novice here so that's why I'm just looking for advice.
Is there any reason why we would need to have the two sitemaps available - as in, why wouldn't we just remove the old http sitemap (that didn't include the https homepage) and just go with the https homepage sitemap?
I just wanted to make sure I understood your response before we take action.
Cheers,
-G
-
Hey G!
You can serve your sitemap in both versions, that won't be any problem and won't trigger the duplicate content issue. So you are safe both ways.
As for the second question: Yes, you should, unless you don't want your pages indexed (any HTTP or HTTPS). I think I saw your site before, and if I remember correctly you had your homepage and login script under SSL, right? Then you should definitely include your homepage in the sitemap but you can leave the login script file out as you don't need that indexed nor google will index it either.
Once you have your sitemap ready, consider including a path in the robots file, like this:
User-agent: *
Sitemap: http://[your website address here]/sitemap.xmlHope that helps!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Should I apply Canonical Links from my Landing Pages to Core Website Pages?
I am working on an SEO project for the website: https://wave.com.au/ There are some core website pages, which we want to target for organic traffic, like this one: https://wave.com.au/doctors/medical-specialties/anaesthetist-jobs/ Then we have basically have another version that is set up as a landing page and used for CPC campaigns. https://wave.com.au/anaesthetists/ Essentially, my question is should I apply canonical links from the landing page versions to the core website pages (especially if I know they are only utilising them for CPC campaigns) so as to push link equity/juice across? Here is the GA data from January 1 - April 30, 2019 (Behavior > Site Content > All Pages😞
Intermediate & Advanced SEO | | Wavelength_International0 -
Few pages without SSL
Hi, A website is not fully secured with a SSL certificate.
Intermediate & Advanced SEO | | AdenaSEO
Approx 97% of the pages on the website are secured. A few pages are unfortunately not secured with a SSL certificate, because otherwise some functions on those pages do not work. It's a website where you can play online games. These games do not work with an SSL connection. Is there anything we have to consider or optimize?
Because, for example when we click on the secure lock icon in the browser, the following notice.
Your connection to this site is not fully secured Can this harm the Google ranking? Regards,
Tom1 -
Spotify XML Sitemap
All, Working on an SEO work up for a Spotify site. Looks like they are using a sitemap that links to additional pages. A problem, none of the links are actually linked within the sitemap. This feels like a strong error. https://lubricitylabs.com/sitemap.xml Thoughts?
Intermediate & Advanced SEO | | dmaher0 -
Should I use https schema markup after http-https migration?
Dear Moz community, Noticed that several groups of websites after HTTP -> HTTPS migration update their schema markup from, example : {
Intermediate & Advanced SEO | | admiral99
"@context": "http://schema.org",
"@type": "WebSite",
"name": "Your WebSite Name",
"alternateName": "An alternative name for your WebSite",
"url": "http://www.your-site.com"
} becomes {
"@context": "https://schema.org",
"@type": "WebSite",
"name": "Your WebSite Name",
"alternateName": "An alternative name for your WebSite",
"url": "https://www.example.com"
} Interesting to know, because Moz website is on https protocol but uses http version of markup. Looking forward for answers 🙂0 -
Can noindexed pages accrue page authority?
My company's site has a large set of pages (tens of thousands) that have very thin or no content. They typically target a single low-competition keyword (and typically rank very well), but the pages have a very high bounce rate and are definitely hurting our domain's overall rankings via Panda (quality ranking). I'm planning on recommending we noindexed these pages temporarily, and reindex each page as resources are able to fill in content. My question is whether an individual page will be able to accrue any page authority for that target term while noindexed. We DO want to rank for all those terms, just not until we have the content to back it up. However, we're in a pretty competitive space up against domains that have been around a lot longer and have higher domain authorities. Like I said, these pages rank well right now, even with thin content. The worry is if we noindex them while we slowly build out content, will our competitors get the edge on those terms (with their subpar but continually available content)? Do you think Google will give us any credit for having had the page all along, just not always indexed?
Intermediate & Advanced SEO | | THandorf0 -
Should my back links go to home page or internal pages
Right now we rank on page 2 for many KWs, so should i now focus my attention on getting links to my home page to build domain authority or continue to direct links to the internal pages for specific KWs? I am about to write some articles for several good ranking sites and want to know whether to link my company name (same as domain name) or KW to the home page or use individual KWs to the internal pages - I am only allowed one link per article to my site. Thanks Ash
Intermediate & Advanced SEO | | AshShep10 -
Date of page first indexed or age of a page?
Hi does anyone know any ways, tools to find when a page was first indexed/cached by Google? I remember a while back, around 2009 i had a firefox plugin which could check this, and gave you a exact date. Maybe this has changed since. I don't remember the plugin. Or any recommendations on finding the age of a page (not domain) for a website? This is for competitor research not my own website. Cheers, Paul
Intermediate & Advanced SEO | | MBASydney0 -
Canonical URLs and Sitemaps
We are using canonical link tags for product pages in a scenario where the URLs on the site contain category names, and the canonical URL points to a URL which does not contain the category names. So, the product page on the site is like www.example.com/clothes/skirts/skater-skirt-12345, and also like www.example.com/sale/clearance/skater-skirt-12345 in another category. And on both of these pages, the canonical link tag references a 3rd URL like www.example.com/skater-skirt-12345. This 3rd URL, used in the canonical link tag is a valid page, and displays the same content as the other two versions, but there are no actual links to this generic version anywhere on the site (nor external). Questions: 1. Does the generic URL referenced in the canonical link also need to be included as on-page links somewhere in the crawled navigation of the site, or is it okay to be just a valid URL not linked anywhere except for the canonical tags? 2. In our sitemap, is it okay to reference the non-canonical URLs, or does the sitemap have to reference only the canonical URL? In our case, the sitemap points to yet a 3rd variation of the URL, like www.example.com/product.jsp?productID=12345. This page retrieves the same content as the others, and includes a canonical link tag back to www.example.com/skater-skirt-12345. Is this a valid approach, or should we revise the sitemap to point to either the category-specific links or the canonical links?
Intermediate & Advanced SEO | | 379seo0