Should sitemap include https pages?
-
Hi guys,
Trying to figure out some onsite issues I've been having. Would appreciate any feedback on the following 2 questions:
My homepage (http://mysite.com) is a 301 redirect to https://mysite.com, which is under SSL. Only 2 pages of my site are https, the rest are http.
-
Should the location of my sitemap be https://mysite.com/sitemap.xml, or should it stay on http (even though the homepage redirects to https)?
-
Should my sitemap include the https pages (only 2 pages) as well as the http?
Thanks,
G
-
-
Hi Frederico,
On the Google Sitemaps Errors help page, they include the following information:
"You should also check that the URLs all begin with the same domain as your Sitemap location. For instance, if your Sitemap is listed under http://www.example.com/sitemap.xml, the following URLs are not valid for that Sitemap:
http://www.google.com (it's in the google.com domain rather than the example.com domain)
http://example.com/ (it's missing the initial www)
www.example.com/ (it's missing the protocol (http), and will generate an Invalid URL warning)
https://www.example.com/ (it's using a different protocol: https rather than http)
Any URLs in the Sitemap that are not denied are processed normally."
This leads me to understand that Google doesn't want you to put http URLs in an https sitemap, or vice versa. What makes you believe otherwise?
Hoping to get to the bottom of this - thanks for the ongoing feedback
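For anyone who wants to test their own URL list against the rule quoted above, here is a minimal Python sketch. It only mirrors the wording of that help page (same host, same protocol as the sitemap location); it is not Google's actual validation, and the function name check_sitemap_url is made up for illustration.

```python
from urllib.parse import urlsplit


def check_sitemap_url(sitemap_location: str, url: str) -> str:
    """Compare a URL with the sitemap location; return "ok" or a short reason.

    Hypothetical helper that follows the quoted help text only.
    """
    sitemap = urlsplit(sitemap_location)
    candidate = urlsplit(url)
    # A URL such as "www.example.com/" parses with no scheme and no host.
    if not candidate.scheme or not candidate.netloc:
        return "missing protocol"
    # "example.com" and "www.example.com" count as different hosts here.
    if candidate.netloc.lower() != sitemap.netloc.lower():
        return "different host"
    if candidate.scheme.lower() != sitemap.scheme.lower():
        return "different protocol"
    return "ok"


SITEMAP = "http://www.example.com/sitemap.xml"
for url in ("http://www.google.com", "http://example.com/",
            "www.example.com/", "https://www.example.com/"):
    print(url, "->", check_sitemap_url(SITEMAP, url))
```

Whether Google still enforces this rule is exactly what the thread is debating, so treat the output as a checklist against the quoted text rather than a verdict.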
-
Those suggesting not to add the SSL pages to the HTTP sitemap are relying on information from 2007, when Google did indeed show an error for sitemaps listing both HTTP and HTTPS pages, because they were treated as different domains. Those days are long gone. Google has evolved and can now handle sitemaps with both HTTP and HTTPS pages just fine.
-
Thanks for the input Frederico. I've been receiving various different answers to this question.
Most responses have said that we should submit 2 sitemaps: 1 sitemap listed under http that only includes the http pages of the site (which means we wouldn't include our homepage since it's under https!!!).
And 1 sitemap listed on the https version which only includes the https pages (which is only 2 pages!).
To be honest, I still don't know what to do here. It's really frustrating that there is no clear-cut answer to our situation, which I can't believe is even that unique.
-
G,
It wouldn't make any difference whether you serve the sitemap over HTTP or HTTPS. As for having http and https pages within the same sitemap, that isn't a problem either.
The only reason I can find for creating multiple sitemaps is when you have HTML pages, images, or videos, which do require separate sitemaps.
Does your site use PHP? If yes, I suggest you test xml-sitemaps.com and it will create the full sitemap for you. If you have a dynamic site, then I suggest getting their commercial version. I've been using it for over 7 years, I think, and I always get a copy for each site I create. They also offer lots of extras in case you need them (news sitemaps, etc.).
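If you'd rather see what a mixed sitemap looks like before reaching for a generator, here is a rough Python sketch using only the standard library. The page paths (/about, /contact) are made up for illustration, and build_sitemap is a hypothetical helper, not part of any tool mentioned in this thread.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as a string) with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical page list: HTTPS homepage plus HTTP pages, as in G's setup.
pages = [
    "https://mysite.com/",
    "http://mysite.com/about",
    "http://mysite.com/contact",
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

The string has no XML declaration, so add one (or write the tree to a file with xml_declaration=True) before uploading it as sitemap.xml.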
-
Hey Federico,
Thanks again for the insight - much appreciated.
So there's no problem for us to create a sitemap that has the https homepage and then the rest of the pages in http? From reading previous Q&As on this topic it seems as though people felt you shouldn't have https and http pages under the same sitemap - I am a novice here so that's why I'm just looking for advice.
Is there any reason why we would need to have the two sitemaps available - as in, why wouldn't we just remove the old http sitemap (that didn't include the https homepage) and just go with the https homepage sitemap?
I just wanted to make sure I understood your response before we take action.
Cheers,
-G
-
Hey G!
You can serve your sitemap in both versions; that won't be a problem and won't trigger a duplicate content issue. So you are safe either way.
As for the second question: Yes, you should, unless you don't want your pages indexed (whether HTTP or HTTPS). I think I saw your site before, and if I remember correctly you had your homepage and login script under SSL, right? Then you should definitely include your homepage in the sitemap, but you can leave the login script file out, as you don't need it indexed and Google won't index it anyway.
Once you have your sitemap ready, consider including its path in the robots.txt file, like this:
User-agent: *
Sitemap: http://[your website address here]/sitemap.xml
Hope that helps!
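To double-check that the Sitemap line in a robots file is readable, here is a small Python sketch using the standard library's urllib.robotparser. The robots.txt content is a made-up example (mysite.com is the placeholder from this thread, and the empty Disallow line is my own addition to give the User-agent group a rule), and site_maps() needs Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, parsed from a string so no network is needed.
ROBOTS_TXT = """\
User-agent: *
Disallow:

Sitemap: http://mysite.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns the listed Sitemap URLs, or None if there are none.
sitemaps = parser.site_maps()
print(sitemaps)
```

To check a live site instead, set_url() followed by read() fetches the real file, at the cost of a network request.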