Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Adding multi-language sitemaps to robots.txt
-
I am working on a revamped multi-language site that has moved to Magento. Each language runs off the core coding so there are no sub-directories per language.
The developer has created sitemaps which have been uploaded to their respective GWT accounts. They have placed the sitemaps in new directories such as:
- /sitemap/uk/sitemap.xml
- /sitemap/de/sitemap.xml
I want to add the sitemaps to the robots.txt but can't figure out how to do it. Also should they have placed the sitemaps in a single location with the file identifying each language:
- /sitemap/uk-sitemap.xml
- /sitemap/de-sitemap.xml
What is the cleanest way of handling these sitemaps and can/should I get them on robots.txt?
-
Adding the following lines to the bottom of your robots.txt should do it:
Sitemap: http://www.example.com/sitemap/uk/sitemap.xml
Sitemap: http://www.example.com/sitemap/de/sitemap.xml
If you wanted to update the file names to be different it wouldn't hurt, but I don't think you would have any problems with how they are currently set up. If you have submitted them to WMT and they are being picked up ok I think you are fine.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Crawl solutions for landing pages that don't contain a robots.txt file?
My site (www.nomader.com) is currently built on Instapage, which does not offer the ability to add a robots.txt file. I plan to migrate to a Shopify site in the coming months, but for now the Instapage site is my primary website. In the interim, would you suggest that I manually request a Google crawl through the search console tool? If so, how often? Any other suggestions for countering this Meta Noindex issue?
Technical SEO | | Nomader1 -
How does changing sitemaps affect SEO
Hi all, I have a question regarding changing the size of my sitemaps. Currently I generate sitemaps in batches of 50k. A situation has come up where I need to change that size to 15k in order to be crawled by one of our licensed services. I haven't been able to find any documentation on whether or not changing the size of my sitemaps(but not the pages included in them) will affect my rankings negatively or my SEO efforts in general. If anyone has any insights or has experienced this with their site please let me know!
Technical SEO | | Jason-Reid0 -
Indexing product attributes in sitemap
Hey Mozzers! I'm battling a few questions about the sitemap for my ecommerce store. Could you help me out? Is it necessary to include your product attributes in the sitemap? I'm not sure why it would matter to have a sitemap that lists everything in the color cherry. Also, if the attributes were included in the sitemap, would that count as duplicate content for the same products to show up in multiple attributes? Is there any benefit to submitting the sitemaps individually? For example, submitting /product-sitemap.xml, /product_brand-sitemap.xml versus just /sitemap.xml? Any other best practices for managing my ecommerce sitemap, or great resources, would be very helpful. Thank you! a1vUz
Technical SEO | | localwork0 -
Resubmit sitemaps on every change?
Hello Mozers, Our sitemaps were submitted to Google and Bing, and are successfully indexed. Every time pages are added to our store (ecommerce), we re-generate the xml sitemap. My question is: should we be resubmitting the sitemaps every time their content change, or since they were submitted once can we assume that the crawlers will re-download the sitemaps by themselves (I don't like to assume). What are best practices here? Thanks!
Technical SEO | | yacpro131 -
Blocking Affiliate Links via robots.txt
Hi, I work with a client who has a large affiliate network pointing to their domain which is a large part of their inbound marketing strategy. All of these links point to a subdomain of affiliates.example.com, which then redirects the links through a 301 redirect to the relevant target page for the link. These links have been showing up in Webmaster Tools as top linking domains and also in the latest downloaded links reports. To follow guidelines and ensure that these links aren't counted by Google for either positive or negative impact on the site, we have added a block on the robots.txt of the affiliates.example.com subdomain, blocking search engines from crawling the full subddomain. The robots.txt file is the following code: User-agent: * Disallow: / We have authenticated the subdomain with Google Webmaster Tools and made certain that Google can reach and read the robots.txt file. We know they are being blocked from reading the affiliates subdomain. However, we added this affiliates subdomain block a few weeks ago to the robots.txt, but links are still showing up in the latest downloads report as first being discovered after we added the block. It's been a few weeks already, and we want to make sure that the block was implemented properly and that these links aren't being used to negatively impact the site. Any suggestions or clarification would be helpful - if the subdomain is being blocked for the search engines, why are the search engines following the links and reporting them in the www.example.com subdomain GWMT account as latest links. And if the block is implemented properly, will the total number of links pointing to our site as reported in the links to your site section be reduced, or does this not have an impact on that figure?From a development standpoint, it's a much easier fix for us to adjust the robots.txt file than to change the affiliate linking connection from a 301 to a 302, which is why we decided to go with this option.Any help you can offer will be greatly appreciated.Thanks,Mark
Technical SEO | | Mark_Ginsberg0 -
Is there any value in having a blank robots.txt file?
I've read an audit where the writer recommended creating and uploading a blank robots.txt file, there was no current file in place. Is there any merit in having a blank robots.txt file? What is the minimum you would include in a basic robots.txt file?
Technical SEO | | NicDale0 -
Is it worth adding schema markup to articles?
I know things like location, pagination, breadcrumbs, video, products etc have value in using schema markup. What about things like articles though? Is it worth all the work involved in having the pages mark up automatically? How does this effect SEO, and is it worthwhile? Thanks, Spencer
Technical SEO | | MarloSchneider0 -
Best Practices for adding Dynamic URL's to XML Sitemap
Hi Guys, I'm working on an ecommerce website with all the product pages using dynamic URL's (we also have a few static pages but there is no issue with them). The products are updated on the site every couple of hours (because we sell out or the special offer expires) and as a result I keep seeing heaps of 404 errors in Google Webmaster tools and am trying to avoid this (if possible). I have already created an XML sitemap for the static pages and am now looking at incorporating the dynamic product pages but am not sure what is the best approach. The URL structure for the products are as follows: http://www.xyz.com/products/product1-is-really-cool
Technical SEO | | seekjobs
http://www.xyz.com/products/product2-is-even-cooler
http://www.xyz.com/products/product3-is-the-coolest Here are 2 approaches I was considering: 1. To just include the dynamic product URLS within the same sitemap as the static URLs using just the following http://www.xyz.com/products/ - This is so spiders have access to the folder the products are in and I don't have to create an automated sitemap for all product OR 2. Create a separate automated sitemap that updates when ever a product is updated and include the change frequency to be hourly - This is so spiders always have as close to be up to date sitemap when they crawl the sitemap I look forward to hearing your thoughts, opinions, suggestions and/or previous experiences with this. Thanks heaps, LW0