Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content (many posts will still be viewable), we have locked both new posts and new replies.
Robots.txt and Multiple Sitemaps
-
Hello,
I have a hopefully simple question, but I wanted a "second opinion" on what to do in this situation. I am working on a client's robots.txt and we have multiple sitemaps. Using Yoast, I have my sitemap_index.xml and I also have a sitemap-image.xml. I do submit them to Google and Bing by hand, but I wanted to add them to the robots.txt for insurance. So my question is: when multiple sitemaps are called out in a robots.txt file, does it matter if one comes before the other? From my reading, it looks like you can have multiple sitemaps called out, but I wasn't sure of the best practice when writing it up in the file.
Example:
User-agent: *
Disallow:
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-content/plugins/
Sitemap: http://sitename.com/sitemap_index.xml
Sitemap: http://sitename.com/sitemap-image.xml

Thanks a ton for the feedback, I really appreciate it! :)
J

-
Awesome! Yeah, I submitted them to Bing and Google by hand; I just figured it couldn't hurt to have them in my robots.txt too.
Appreciate the feedback.

-
Yes, what you have is the proper format. The best way to submit sitemaps, of course, is via Google and Bing Webmaster Tools.
Sitemaps won't have much impact unless you have a really large site, so I wouldn't focus on them too much. The best way to get content crawled and indexed by Google is a good internal link structure and authoritative external links.
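
On the ordering part of the question: the sitemaps.org protocol describes the Sitemap directive as independent of the User-agent line, so it should not matter which Sitemap line comes first or where in the file they sit. As a rough way to check a file locally, here is a minimal sketch using Python's standard-library robots.txt parser. It assumes Python 3.8 or later (where `site_maps()` is available); `list_sitemaps` and the `REORDERED` variant are hypothetical names made up for this illustration, and the file contents are the example from the question.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt from the question, one directive per line.
ROBOTS_TXT = """\
User-agent: *
Disallow:
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-content/plugins/
Sitemap: http://sitename.com/sitemap_index.xml
Sitemap: http://sitename.com/sitemap-image.xml
"""

# Hypothetical variant: same rules, Sitemap lines swapped and moved to the top.
REORDERED = """\
Sitemap: http://sitename.com/sitemap-image.xml
Sitemap: http://sitename.com/sitemap_index.xml
User-agent: *
Disallow:
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-content/plugins/
"""


def list_sitemaps(robots_txt):
    """Return every Sitemap URL declared in a robots.txt string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when the file declares no Sitemap line.
    return parser.site_maps() or []


if __name__ == "__main__":
    for url in list_sitemaps(ROBOTS_TXT):
        print(url)
    # Both orderings yield the same set of sitemap URLs.
    print(set(list_sitemaps(ROBOTS_TXT)) == set(list_sitemaps(REORDERED)))
```

This only shows how one parser reads the file. Each search engine uses its own robots.txt parser, so treat it as a sanity check rather than proof of how Google or Bing will behave.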