How to write a robots.txt file to point to your sitemap
-
Good afternoon from still wet & humid Wetherby, UK...
I want to write a robots.txt file that instructs the bots to index everything and gives a specific location for the sitemap. The sitemap URL is: http://business.leedscityregion.gov.uk/CMSPages/GoogleSiteMap.aspx
Is this correct:
User-agent: *
Disallow:
SITEMAP: http://business.leedscityregion.gov.uk/CMSPages/GoogleSiteMap.aspx
Any insight welcome
-
Thank you so much for all your replies
[CASE CLOSED] -
Ryan's answer is correct. I just wanted to jump in to say that I know from firsthand experience that Google and Bing are both able to read the sitemap file even if it has a different extension and even if you can't name it sitemap.xml.
-
Yes, your example is correct.
A great page for learning about robots.txt is: http://en.wikipedia.org/wiki/Robots_exclusion_standard#Sitemap
I will add that the official method of declaring your sitemap location capitalizes only the first letter (i.e. Sitemap, not SITEMAP), but I am almost certain it does not make a difference.
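One way to sanity-check the file before uploading it is to run it through a parser. The sketch below uses Python's standard-library urllib.robotparser (site_maps() needs Python 3.8 or later). It only shows how that one parser reads the file, not how Google or Bing behave, and the page URL used for the crawl check is a made-up example.

```python
from urllib.robotparser import RobotFileParser

# The file from the question, with the directive in the documented "Sitemap" casing.
ROBOTS_TXT = """\
User-agent: *
Disallow:
Sitemap: http://business.leedscityregion.gov.uk/CMSPages/GoogleSiteMap.aspx
"""


def inspect_robots(text, test_url):
    """Return (sitemap URLs, whether any bot may fetch test_url)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    # site_maps() returns None when no Sitemap line was found.
    return parser.site_maps() or [], parser.can_fetch("*", test_url)


# Hypothetical page URL, used only to exercise the empty Disallow rule.
PAGE = "http://business.leedscityregion.gov.uk/some/page"

sitemaps, allowed = inspect_robots(ROBOTS_TXT, PAGE)
upper_sitemaps, _ = inspect_robots(ROBOTS_TXT.replace("Sitemap:", "SITEMAP:"), PAGE)

print(sitemaps)                    # sitemap URLs the parser found
print(allowed)                     # whether the empty Disallow lets bots in
print(upper_sitemaps == sitemaps)  # whether the casing changed the result
```

With this parser an empty Disallow line allows everything, and the directive name is matched case-insensitively, so both spellings yield the same sitemap URL.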
A few other suggestions which are best practices but do not have to be followed:
-
use all lowercase letters in URLs
-
name the sitemap file "sitemap" not "GoogleSiteMap"
-
submit XML sitemaps when possible. Again, I am almost certain Google can read other formats, so if all you care about is Google then it's fine, but otherwise I would suggest just using XML files.
example: business.leedscityregion.gov.uk/cmspages/sitemap.xml
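If the CMS cannot export an XML file itself, a minimal one can be generated by hand. This is a rough sketch using Python's standard library; the namespace is the one defined by the sitemaps.org protocol, while the page URLs in PAGES are hypothetical placeholders rather than real pages on the site.

```python
import xml.etree.ElementTree as ET

# Namespace required by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal XML sitemap (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical, all-lowercase page URLs for illustration only.
PAGES = [
    "http://business.leedscityregion.gov.uk/",
    "http://business.leedscityregion.gov.uk/about",
]

sitemap_xml = build_sitemap(PAGES)
print(sitemap_xml)
```

The output would be saved as sitemap.xml and referenced from the Sitemap line in robots.txt.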
Some other helpful links:
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=183668