Protecting sitemaps - Good idea or humbug?
-
Is there a way to protect your sitemap.xml so that only Google can read it and would it make sense to do this?
-
From a hacker's perspective, the first order of business is going to be gathering information on the target. does a hacker or someone with malicious intent gain something in obtaining access to your sitemap?
Yes, they do, and that is more information on the layout of your site. How common would there actually be something on the sitemap that could critically expose you to compromise on your VPS/Shared hosting? Um, probably super ultra rare.
But yes there was one time that I was doing an audit for a company and the sitemap did point to a directory that was vulnerable to directory browsing. Fishing around in the directory, I was able to obtain a picture of a PayPal MasterCard front and back because some idiot snapped pictures of it and uploaded it onto the site.
So there are benefits to hiding it, it's relatively easy to do, but if your lazy and don't want to, chances are your good.
-
Hi Herb,
Thank you for your feedback. I think you are right. We are dealing with very short lived up-to-date information so it is vital that as few sites as possible have the information we have. For this reason I was considering to "hide" our sitemaps. Some of our competitors do that but probably we need to find some other measures to achieve our goal.
Cheers
Thomas -
Hi Thomas;
You have not specified your web server platform, but assuming it is Apache it would be easy to do with a regular expression in your .htaccess
However, I do not see any valid reason for doing so. Your sitemap should be a refection of your public menu and internal public links. So other than making it easier for search and other spiders to crawl your site, it does not expose any information that is not available by other methods. So, best practices say that you should have an accurate site map, and unless you have a reson for hiding it that you did not mention I would not hide it.
I will tell you those that you should not bother putting areas you do not want crawled in your robots.txt file and any of the bad folks will not respect the request.
Take care,
Herb
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
"Url blocked by robots.txt." on my Video Sitemap
I'm getting a warning about "Url blocked by robots.txt." on my video sitemap - but just for youtube videos? Has anyone else encountered this issue, and how did you fix it if so?! Thanks, J
Technical SEO | | Critical_Mass0 -
#1 on Bing, nowhere on Google. Should be at least top 3\. Any ideas?
I have a page that "should" be top 3 on Google - it's optimised (A on the Moz Pro page grader), it's the most relevant result (it's for an e-book, and the page is the publisher's page for the e-book). Other pages on the site for other books are top of the Google SERPs, and this page itself is top in Bing for the search phrase. The page is https://camphorpress.com/books/formosan-odyssey/ and the keyphrase I want to rank for is "formosan odyssey" (with or without the quotes). Does anyone have any insight as to why it's not ranking in Google? Over-optimised? Duplicate content? Many thanks.
Technical SEO | | C-Tech0 -
XML Sitemap Generators
I am looking to use a different sitemap generator that can do 5 thousand or more pages at once. Any recommendations? Thanks guys.
Technical SEO | | Chenzo0 -
Website credits for designers - good or bad
Hi My core service is web design and development. I often place a credit on my clients websites pointing them back to my web design or web development pages. Is this a wise practice with penguin and panda updates? Would this also pull my ranking down?
Technical SEO | | Cocoonfxmedia0 -
Removing Redirected URLs from XML Sitemap
If I'm updating a URL and 301 redirecting the old URL to the new URL, Google recommends I remove the old URL from our XML sitemap and add the new URL. That makes sense. However, can anyone speak to how Google transfers the ranking value (link value) from the old URL to the new URL? My suspicion is this happens outside the sitemap. If Google already has the old URL indexed, the next time it crawls that URL, Googlebot discovers the 301 redirect and that starts the process of URL value transfer. I guess my question revolves around whether removing the old URL (or the timing of the removal) from the sitemap can impact Googlebot's transfer of the old URL value to the new URL.
Technical SEO | | RyanOD0 -
Exclude Child URLs from XML Sitemap Generator (Wordpress)
Hi all, I was recommended the XML Sitemap Generator for Wordpress by the very helpful Keith Bloemendaal and John Pring - however I can't seem to exclude child URLs. There is a section Exclude items and a subsection Exclude posts. I have tried inputting the URLs for the pages I don't want in the sitemap, however that didn't work. So I read that you have to include a list of "IDs" - not sure where on earth to find that info, tried the page name and the post= number from the URL, however neither worked. I hope somebody can point me in the right direction - and apologies, I am a Wordpress novice, and I got no answers from the Wordpress forums so turned right back to SEOmoz! Cheers.
Technical SEO | | markadoi840 -
Sitemap coming up in Google's index?
I apologize if this question's answer is glaringly obvious, but I was using Google to view all the pages it has indexed of our site--by searching for our company and then clicking the link that says to display more results for the site. On page three, it has the sitemap indexed as if it wee just another page of our site. <cite>www.stadriemblems.com/sitemap.xml</cite> Is this supposed to happen?
Technical SEO | | UnderRugSwept0 -
We are still seeing duplicate content on SEOmoz even though we have marked those pages as "noindex, follow." Any ideas why?
We have many pages on our website that have been set to "no index, follow." However, SEOmoz is indexing them as duplicate content. Why is that?
Technical SEO | | cmaseattle0