Hreflang in <head> vs. sitemap?
-
Hi all,
I decided to identify alternate language pages of my site via sitemap to save our development team some time. I also like the idea of having leaner markup.
However, my site has many alternate language and country page variations, so after creating a sitemap that includes mostly tier 1 and tier 2 level URLs, I now have a sitemap file that's 17 MB. I did a couple of Google searches to see if sitemap file size can ever be an issue and found a discussion or two that suggested keeping the size small, and a really old article that recommended keeping it under 10 MB.
Does the sitemap file size matter? GWT has verified the sitemap and appears to be indexing the URLs fine.
Are there any particular benefits to specifying alternate versions of a URL in the <head> vs. the sitemap?
Thanks,
-Eugene
-
I have always preferred hreflang in the sitemap because it keeps extra lines of code off your page. Everything helps when it comes to page speed.
However, if it's easier for you to put the tags on the page, that's completely valid too. Do whatever is easiest to maintain and update.
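For anyone weighing the two, here is a rough sketch of what each method looks like for a hypothetical example.com with an English and a German version (URLs and language codes are illustrative, not Eugene's actual site):

On the page, in the <head> of every version:

<link rel="alternate" hreflang="en" href="https://example.com/en/page/" />
<link rel="alternate" hreflang="de" href="https://example.com/de/page/" />

In the XML sitemap, the urlset needs the xhtml namespace, and every <url> entry repeats the full set of alternates:

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>https://example.com/en/page/</loc>
    <xhtml:link rel="alternate" hreflang="en" href="https://example.com/en/page/" />
    <xhtml:link rel="alternate" hreflang="de" href="https://example.com/de/page/" />
  </url>
  <url>
    <loc>https://example.com/de/page/</loc>
    <xhtml:link rel="alternate" hreflang="en" href="https://example.com/en/page/" />
    <xhtml:link rel="alternate" hreflang="de" href="https://example.com/de/page/" />
  </url>
</urlset>

That repetition for every URL is exactly why hreflang sitemaps balloon for sites with many language/country variations.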
-
First off, if you want to keep the sitemap method, consider breaking it down into multiple files, one for each language/country, etc.
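A minimal sketch of that split, assuming one child sitemap per locale and illustrative file names. For context on the size question: each sitemap file is also capped at 50,000 URLs, and the byte limit that old article referenced was 10 MB; it has since been raised to 50 MB uncompressed, so 17 MB is valid but harder to manage.

<!-- sitemap-index.xml, submitted in place of the single big file -->
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://example.com/sitemap-en-us.xml</loc></sitemap>
  <sitemap><loc>https://example.com/sitemap-en-gb.xml</loc></sitemap>
  <sitemap><loc>https://example.com/sitemap-de-de.xml</loc></sitemap>
</sitemapindex>

Each child file then carries only that locale's <url> entries, though every entry still has to list all of its alternates.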
Also, FYI, there are THREE methods: you can also add hreflang to the HTTP header. This might be a good option to consider as well.
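The header variant is a Link header returned with each URL's HTTP response, and it's mainly useful for non-HTML files like PDFs where there is no <head> to put tags in. A rough sketch (URLs illustrative):

Link: <https://example.com/en/doc.pdf>; rel="alternate"; hreflang="en",
      <https://example.com/de/doc.pdf>; rel="alternate"; hreflang="de"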
Related Questions
-
Question on Indexing, Hreflang tag, Canonical
Dear All, I have a question. We have a client (pharma) who has a prescription medicine approved only in the US, and only one global site at .com, which is accessed by their target audience all over the world.
Intermediate & Advanced SEO | jrohwer
For the rest of the world, we can create a replica of the home page (which actually features that drug), minus any mention of the medicine, and set up an IP filter so that non-US traffic sees the duplicate of the home page. The question is how best to tackle this semi-duplicate page. Noindex probably won't do, because that would block the page for the non-US geography. Hreflang possibly won't work here either, because we are not dealing with different languages; we are dealing with the same language (English) but different geographies. Canonical might be the best way to go? Wanted to get some insight from the experts. Thanks,
Suparno (for Jeff)
XML Sitemap & Bad Code
I've been creating sitemaps with XML Sitemap Generator and have been downloading them to edit on my PC. The sitemaps work fine when viewed in a browser, but when I download and open them in Dreamweaver, the URLs don't work when I cut and paste them into the Firefox URL bar. I notice the character codes are different. For example, an "&" is produced like this: "&amp;". Extra characters are inserted, producing the error. I was wondering if this is normal, because as I said, the map works fine when viewed online.
Intermediate & Advanced SEO | alrockn
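On the encoding point in the question above: the ampersand is a reserved character in XML, so a sitemap generator has to escape it, and the escaped form only breaks if it's pasted into a browser without being decoded. Roughly (URL is illustrative):

The URL as the browser uses it:
https://example.com/page?cat=shoes&color=red

The same URL inside the sitemap's <loc> element:
<loc>https://example.com/page?cat=shoes&amp;color=red</loc>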
XML Sitemaps - Multi-lingual website
Hi Mozzers, I am working with a large website that has some of its content translated across multiple languages. I am planning on using The Media Flow to create an HREFLANG sitemap for content in various languages. Please see the attached image for the questions below. Thanks! Section highlighted yellow: When there is a URL that does not have a translated version, should it not be included on the same HREFLANG sitemap? Alternately, could I just remove the languages that are not being targeted, so this would just reflect English language targeting?
Intermediate & Advanced SEO | J-Banz
Total Indexed 1.5M vs 83k submitted by sitemap. What?
We recently took a good look at one of our content sites' sitemap and tried to cut out a lot of the crap that had gotten in there, such as .php, .xml, and .htm versions of each page. We also cut out images to put in a separate image sitemap. The sitemap generated 83,000+ URLs for Google to crawl (this was partially generated with the Yoast WordPress plugin). In Webmaster Tools, the index status section shows that this site has a total index of 1.5 million. With our sitemap coming back with 83k and Google indexing 1.5 million pages, is this a sign of a CMS gone rogue? Is it an indication that we could be pumping out error pages, empty templates, or junk pages that we're cramming into Google's bot? I would love to hear what you guys think. Is this normal? Is this something to be concerned about? Should our total index more closely match our sitemap page count?
Intermediate & Advanced SEO | seoninjaz
International Image SEO - one host vs multiple hosts
I've got 3 sites (same name) located in Australia, the US, and the UK. Currently these sites are all pulling images (which I own) from one location. I'd like to create image XML sitemaps for each of these sites. As I see it, my options are: 1. Keep the images hosted in the one place and create image XML sitemaps for each of the 3 sites (which seems to be technically OK, because https://support.google.com/webmasters/answer/178636?hl=en&ref_topic=20986 states that if the image URL isn't on the same domain, both domains need to be verified in Webmaster Tools). However, is there a risk here that the sitemaps will conflict because they are pulling images from the same host? 2. Host the images locally (i.e. the same images will be hosted in 3 locations) and apply hreflang in the sitemap. Does anyone know which of these options is best (obviously #1 would be more convenient), or whether there are any other options for attacking this issue? Thanks!
Intermediate & Advanced SEO | oline123
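A rough sketch of option 1 in the question above — a page on the Australian site whose image sits on a single shared host (domains are illustrative; per the Google help page linked in the question, the image domain would also need to be verified):

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>https://example.com.au/products/widget/</loc>
    <image:image>
      <image:loc>https://images.example.com/widget.jpg</image:loc>
    </image:image>
  </url>
</urlset>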
Is there a way to keep sitemap.xml files from getting indexed?
Wow, I should know the answer to this question. Sitemap.xml files have to be accessible to the bots for indexing, so they can't be disallowed in robots.txt and you can't block the folder at the server level. So how can you allow the bots to crawl these XML files but keep them from showing up in Google's index in a site: command search, or is that even possible? Hmmm
Intermediate & Advanced SEO | irvingw
301 vs Changing Link href
We have changed our company and want to 301 the old domain to the new domain in order to transfer the benefit of its backlinks (DA: 50, 115 linking root domains). I have the ability to modify around 50% of the backlinks. So my question is: instead of redirecting all the links, should I update that 50% to link to the new domain directly instead of relying on redirects? Would this possibly trip an algorithmic filter and devalue these links? Or should I just do a 301 and not worry about modifying the links?
Intermediate & Advanced SEO | Choice
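For context on the question above, a domain-level 301 just means the old host answers every request with a permanent redirect to the equivalent path on the new host — a rough sketch of the exchange (domains and path are illustrative):

GET /old-page/ HTTP/1.1
Host: old-domain.example

HTTP/1.1 301 Moved Permanently
Location: https://new-domain.example/old-page/

Updating a backlink to point directly at the new domain simply removes that extra hop; the two approaches are not mutually exclusive.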
Controlling PageRank vs flat site architecture
Hey all. Here's the scenario: I have this pretty trusted site with a relatively high PR. The navigation menu has around 300 links, because it is a CSS menu that drills down into subcategories. Now, would restricting the number of links in this menu be beneficial? I am not worried about subcategory pages not being crawled or indexed, but I am concerned that subcategory pages will not receive as much PageRank if they are not linked to directly from the home page, thereby lowering their ranking potential. Even new pages receive a PR of 5 if they're linked to from the home page. But I'm also thinking that toning down the menu size would be beneficial by funneling more PageRank to category pages and increasing the likelihood of ranking for some core head/middle terms. I have seen sites that externalize the menu in JavaScript files and disallow it in robots.txt to prevent too much PageRank from linking out, but SEO isn't really a one-solution-fits-all in my experience. I may try a test. Externalizing the menu may also increase the relevance of pages, because I won't have a bunch of other content on the page that isn't relevant to that page's specific keywords. Anyone with experience in this arena? I would love to hear your input. Thanks
Intermediate & Advanced SEO | JeremyNelson58