How to implement multilingual sitemaps when not all pages have translations
-
We are trying to implement sitemaps for a site that has localized content for a few countries. We’ve concluded that we should utilize
sitemapindex
and then create one sitemap per country. Now to the problems we’re facing.Not all urls on the site have translations, how should these urls be presented in the sitemap? Should they be stated simply like so?
<url><loc>https://example.com/sdfsdf</loc></url>
So urls with the hreflang attribute and without are mixed in the same sitemap, or is that a problem? (I have added empty rows to make it easier to read)
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" <br="">xmlns:xhtml="http://www.w3.org/1999/xhtml"></urlset>
<url><loc>http://www.example.com/english/page.html</loc>
<xhtml:link rel="alternate" hreflang="de" href="http://www.example.com/deutsch/page.html"><xhtml:link rel="alternate" hreflang="de-ch" href="http://www.example.com/schweiz-deutsch/page.html"><xhtml:link rel="alternate" hreflang="en" href="http: www.example.com="" english="" page.html"=""></xhtml:link rel="alternate" hreflang="en" href="http:></xhtml:link></xhtml:link></url><url><loc>http://www.example.com/page-with-no-translations</loc></url>
<url><loc>http://www.example.com/page-with-no-translations2</loc></url>
<url><loc>http://www.example.com/page-with-no-translations3</loc></url>
<url><loc>http://www.example.com/deutsch/page.html</loc>
<xhtml:link rel="alternate" hreflang="de" href="http://www.example.com/deutsch/page.html"><xhtml:link rel="alternate" hreflang="de-ch" href="http://www.example.com/schweiz-deutsch/page.html"><xhtml:link rel="alternate" hreflang="en" href="http://www.example.com/english/page.html"></xhtml:link rel="alternate"></xhtml:link></xhtml:link></url> -
I continued to think about the matter and thought of this approach for adding the pages with no localization to the multilingual sitemap (just putting it here):
<url><loc>https://example.com/page-with-no-localization</loc></url>
Or is it better to put a simple link to the page in the sitemap?
<url><loc>https://example.com/page-with-no-localization</loc></url>
-
Additional question about which parts should go into which sitemap. The url https://example.com/us/driver-guides/railroad-crossings has 3 "translations". Where should all these <url>elements go? We've used a Google sheet provided by Ahrefs to create these. </url>
This one is for
us
so it should be put in theus
sitemap? (sorry about the code block being single line, I don't understand why they're not multi line)<loc>https://example.com/us/driver-guides/railroad-crossings</loc>
This one is for
uk
so it should be put in theuk
sitemap?<loc>https://example.com/uk/car/level-crossings</loc>
This one is for
se
so it should be put in these
sitemap?This one is the same as the first one. I assume it's for the "x-default" attribute. Where should it be put?
<loc>https://example.com/us/driver-guides/railroad-crossings</loc>
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
I want to resubmit sitemap
I am doing major changes in my website some of my old url pages i don't want them to be indexed or submitted in site map some of other old pages i want to keep them and there is new pages any one can give me hints what should i do also I have thousands of pages on my website and I don't want to submit all my pages i want to submit best pages to google in sitemap that why i want to resubmit new site maps
Technical SEO | | Jamalon0 -
"One Page With Two Links To Same Page; We Counted The First Link" Is this true?
I read this to day http://searchengineland.com/googles-matt-cutts-one-page-two-links-page-counted-first-link-192718 I thought to myself, yep, thats what I been reading in Moz for years ( pitty Matt could not confirm that still the case for 2014) But reading though the comments Michael Martinez of http://www.seo-theory.com/ pointed out that Mat says "...the last time I checked, was 2009, and back then -- uh, we might, for example, only have selected one of the links from a given page."
Technical SEO | | PaddyDisplays
Which would imply that is does not not mean it always the first link. Michael goes on to say "Back in 2008 when Rand WRONGLY claimed that Google was only counting the first link (I shared results of a test where it passed anchor text from TWO links on the same page)" then goes on to say " In practice the search engine sometimes skipped over links and took anchor text from a second or third link down the page." For me this is significant. I know people that have had "SEO experts" recommend that they should have a blog attached to there e-commence site and post blog posts (with no real interest for readers) with anchor text links to you landing pages. I thought that posting blog post just for anchor text link was a waste of time if you are already linking to the landing page with in a main navigation as google would see that link first. But if Michael is correct then these type of blog posts anchor text link blog posts would have value But who is' right Rand or Michael?0 -
Switchboard Tags - Multiple desktop pages pointing to one mobile page
I have recently started to implement switchboard tags to connect our mobile and desktop pages, and to ensure that our mobile pages show up in rankings for mobile users. Because our desktop site is much deeper in content than our mobile site, there are a number of desktop pages we would like to have point to one mobile page. However, with the switchboard tags, this poses a problem because it requires multiple rel=canonical tags to be placed on the one mobile page. I'm assuming this will either confuse the search engines, or they will choose to ignore the rel=canonical tag altogether. Any ideas on how to approach this situation other than creating an equivalent mobile version of every desktop page or implementing a user agent detection redirect?
Technical SEO | | JBlank0 -
Sitemap and crawl impact
If I have two links in the sitemap (for example: page1.html and page2.html) but the web-site contains more pages (page1.html, page2.html and page3.html) is this a sign for Google to not to crawl other pages? I.e. Will Google index page3.html? Consider that any page can be accessed.
Technical SEO | | ditoroin0 -
Duplicate page content
Hello, The pro dashboard crawler bot thing that you get here reports the mydomain.com and mydomain.com/index.htm as duplicate pages. Is this a problem? If so how do I fix it? Thanks Ian
Technical SEO | | jwdl0 -
Index page
To the SEO experts, this may well seem a silly question, so I apologies in advance as I try not to ask questions that I probably know the answer for already, but clarity is my goal I have numerous sites ,as standard practice, through the .htaccess I will always set up non www to www, and redirect the index page to www.mysite.com. All straight forward, have never questioned this practice, always been advised its the ebst practice to avoid duplicate content. Now, today, I was looking at a CMS service for a customer for their website, the website is already built and its a static website, so the CMS integration was going to mean a full rewrite of the website. Speaking to a friend on another forum, he told me about a service called simple CMS, had a look, looks perfect for the customer ... Went to set it up on the clients site and here is the problem. For the CMS software to work, it MUST access the index page, because my index page is redirected to www.mysite.com , it wont work as it cant find the index page (obviously) I questioned this with the software company, they inform me that it must access the index page, I have explained that it wont be able to and why (cause I have my index page redirected to avoid duplicate content) To my astonishment, the person there told me that duplicate content is a huge no no with Google (that's not the astonishing part) but its not relevant to the index and non index page of a website. This goes against everything I thought I knew ... The person also reassured me that they have worked within the SEO area for 10 years. As I am a subscriber to SEO MOZ and no one here has anything to gain but offering advice, is this true ? Will it not be an issue for duplicate content to show both a index page and non index page ?, will search engines not view this as duplicate content ? Or is this SEO expert talking bull, which I suspect, but cannot be sure. Any advice would be greatly appreciated, it would make my life a lot easier for the customer to use this CMS software, but I would do it at the risk of tarnishing the work they and I have done on their ranking status Many thanks in advance John
Technical SEO | | Johnny4B0 -
Discrepency between # of pages and # of pages indexed
Here is some background: The site in question has approximately 10,000 pages and Google Webmaster shows that 10,000 urls(pages were submitted) 2) Only 5,500 pages appear in the Google index 3) Webmaster shows that approximately 200 pages could not be crawled for various reasons 4) SEOMOZ shows about 1,000 pages that have long URL's or Page Titles (which we are correcting) 5) No other errors are being reported in either Webmaster or SEO MOZ 6) This is a new site launched six weeks ago. Within two weeks of launching, Google had indexed all 10,000 pages and showed 9,800 in the index but over the last few weeks, the number of pages in the index kept dropping until it reached 5,500 where it has been stable for two weeks. Any ideas of what the issue might be? Also, is there a way to download all of the pages that are being included in that index as this might help troubleshoot?
Technical SEO | | Mont0 -
Hundreds of 404 Pages, What Should I do?
Hi, My client just had there website redeveloped within wordpress. I just ran a crawl errors test for their website using Google Webmasters. I discovered that the client has about six hundred, 404 pages. Most of the error pages originated from their previous image gallery. I already have a custom 404 page set-up, but is there something else I should be doing? Is it worth while to 301 redirect every single page within the .htaccess file, or will Google filter these pages out of its index naturally? Thanks Mozers!
Technical SEO | | calindaniel0