Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Adding multi-language sitemaps to robots.txt
I am working on a revamped multi-language site that has moved to Magento. Every language runs off the same core codebase, so there are no sub-directories per language.
The developer has created sitemaps, which have been uploaded to their respective Google Webmaster Tools (GWT) accounts. The sitemaps have been placed in new directories such as:
- /sitemap/uk/sitemap.xml
- /sitemap/de/sitemap.xml
I want to add the sitemaps to robots.txt but can't figure out how to do it. Also, should the sitemaps instead sit in a single location, with the file name identifying each language? For example:
- /sitemap/uk-sitemap.xml
- /sitemap/de-sitemap.xml
What is the cleanest way of handling these sitemaps, and can/should I list them in robots.txt?
-
Adding the following lines to the bottom of your robots.txt should do it. Each Sitemap line takes the full URL of one sitemap file:
Sitemap: http://www.example.com/sitemap/uk/sitemap.xml
Sitemap: http://www.example.com/sitemap/de/sitemap.xml
Renaming the files so each name identifies its language wouldn't hurt, but I don't think you will have any problems with how they are currently set up. If you have submitted them to Webmaster Tools and they are being picked up OK, I think you are fine.
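If you want to double-check that the Sitemap lines are readable once they are in place, something like the following might help. It is only a rough sketch: the robots.txt body is a made-up stand-in using the example.com URLs from this thread, and `listed_sitemaps` is a hypothetical helper name. It relies on Python's standard `urllib.robotparser` (the `site_maps()` method needs Python 3.8 or later).

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body. The host and sitemap paths mirror the
# placeholder URLs used in this thread, not a real site.
ROBOTS_TXT = """\
User-agent: *
Disallow:

Sitemap: http://www.example.com/sitemap/uk/sitemap.xml
Sitemap: http://www.example.com/sitemap/de/sitemap.xml
"""


def listed_sitemaps(robots_body: str) -> list[str]:
    """Return every sitemap URL declared in a robots.txt body."""
    parser = RobotFileParser()
    parser.parse(robots_body.splitlines())
    # site_maps() returns None when no Sitemap lines are present.
    return parser.site_maps() or []


if __name__ == "__main__":
    for url in listed_sitemaps(ROBOTS_TXT):
        print(url)
```

To check a live file instead, you could download your own robots.txt and pass its text to the same helper.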