Adding your sitemap to robots.txt
-
Hi everyone,
Best practice question:
When adding your sitemap to your robots.txt file, do you add the whole sitemap at once, or do you add the different subcategories (products, posts, categories, ...) separately?
I'm very curious to hear your thoughts!
-
Just add the sitemap index file to your robots.txt and let the search engines figure it out from there. All you need to do is point them to your sitemaps, and the sitemap index alone is enough for that, so there's no real need to list every sitemap in robots.txt.
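For illustration, here is a minimal sketch of what that looks like from a crawler's side, using Python's standard urllib.robotparser (site_maps() needs Python 3.8+). The robots.txt content and the example.com URL are assumptions made up for this example, not taken from anyone's real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one Sitemap line that points at the index file only.
ROBOTS_TXT = """\
User-agent: *
Disallow:

Sitemap: https://www.example.com/sitemap_index.xml
"""


def sitemaps_in_robots(robots_txt):
    """Return the sitemap URLs declared in a robots.txt string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


print(sitemaps_in_robots(ROBOTS_TXT))
```

The single URL it finds is the index, and the child sitemaps are discovered from there.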
-
From a crawlability point of view, it does not matter. Search engines have no more problems crawling multiple sitemap files than they do crawling one very large XML sitemap file.
An advantage of splitting out your XML sitemaps is that if your site is very large, you are less likely to run into the per-file limit of 50 MB (uncompressed) or 50,000 URLs. If the site is quite small, you obviously won't benefit from this.
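As a rough sketch of the URL-count side of that limit, this is one way to split a large URL list into sitemap-sized chunks. The function name and the example.com URLs are hypothetical, and it only checks the 50,000 URL count, not the 50 MB file size.

```python
MAX_URLS_PER_SITEMAP = 50_000  # per-file URL limit from the sitemap protocol


def split_into_sitemaps(urls, limit=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into consecutive chunks of at most `limit` URLs."""
    return [urls[i:i + limit] for i in range(0, len(urls), limit)]


# Hypothetical large site with 120,000 product URLs.
urls = [f"https://www.example.com/product/{n}" for n in range(120_000)]
chunks = split_into_sitemaps(urls)
print([len(chunk) for chunk in chunks])  # [50000, 50000, 20000]
```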
If you use multiple sitemaps, you may already know that you don't have to list them all in robots.txt. You can use a sitemap index file to point to your subcategory sitemaps (e.g. posts.xml). Changes to the 'child' XML sitemaps do not need to be reflected in robots.txt - you only need to remember to add/remove them in the XML index file and in Google Search Console / Bing Webmaster Tools.
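To make the index file concrete, here is a minimal sketch that builds one with Python's standard library. The child file names (posts.xml, products.xml) and the example.com domain are assumptions for illustration; most CMS plugins generate this file for you.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(child_urls):
    """Return sitemap index XML that lists each child sitemap URL."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in child_urls:
        sitemap = ET.SubElement(root, "sitemap")
        ET.SubElement(sitemap, "loc").text = url
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical child sitemaps grouped by content type.
index_xml = build_sitemap_index([
    "https://www.example.com/posts.xml",
    "https://www.example.com/products.xml",
])
print(index_xml)
```

Adding or removing a child sitemap then means changing this one file, while the Sitemap line in robots.txt stays the same.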
Since many site applications automatically generate XML sitemaps grouped by posts, categories and products etc., we find it's easier to use this default configuration - and simply add the sitemap index URL to robots.txt.