Robots.txt and robots meta

Highland

I have an odd situation. I have a CMS with a global robots.txt that contains the generic allow-all rule:

User-Agent: *
Allow: /

I also have one CMS site that must never be indexed. I've read on various pages (like http://www.jesterwebster.com/robots-txt-vs-meta-tag-which-has-precedence/22 ) that robots.txt always wins over meta, but I have also read that robots.txt controls whether a page can be spidered, whereas meta controls whether it is indexed. I just want the site to not be indexed. Can I leave the robots.txt as is and still put NOINDEX in the robots meta?

TheEspresseo

I see. Have you considered putting it behind an htpasswd?

Highland

I can control it (it's a custom piece of software) but it's not as easy a fix as adding a meta to the template.

The main problem is we have a junk TLD we use to test some new ideas off the live server (lets clients give us feedback) but it gets spidered and indexed and starts ranking for client sites before they're ready to live in their own TLD. This means we have to compete against ourselves (even with a 301). There's nothing sensitive or it would live behind a password.

TheEspresseo

Do you need to control access to the site beyond the SERPs? I would not rely on robots.txt to shield any sensitive data.

For a breakdown of robots.txt and robots meta tags, check out: http://www.robotstxt.org/robotstxt.html and http://www.searchtools.com/robots/robots-meta.html/, and for a great post on using these standards in SEO, check out: http://www.seomoz.org/blog/serious-robotstxt-misuse-high-impact-solutions

I am also concerned that you are unable to control your robots.txt! If your CMS doesn't let you do that and overwrites it when you change it manually, you have some major control problems on your hands that you should remedy.

fabioricotta-84038

Blocking the site in robots.txt will not guarantee that it stays out of Google's index. I think you can use the meta robots NOINDEX to make sure Google will not show your pages when someone tries to Google them.

It is important to note that Googlebot and other spiders will continue to visit your pages. They have to crawl a page to see its NOINDEX tag, so robots.txt needs to keep allowing them in.
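
If you want to sanity-check the combination before relying on it, something like the rough Python sketch below might help. It only uses the standard library and tests two things: that the robots.txt rules allow the URL to be crawled, and that the page's HTML carries a robots meta with NOINDEX. The names `RobotsMetaParser` and `noindex_is_visible` and the example.test URL are made up for illustration, and Python's robots.txt parser is not guaranteed to interpret every rule the way Googlebot does, so treat the result as a hint rather than proof.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma separated, e.g. "noindex, nofollow".
            for directive in content.split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def noindex_is_visible(robots_txt, html, url, user_agent="*"):
    """Hypothetical helper: True only if the URL may be crawled under the
    given robots.txt text AND the HTML contains a robots meta noindex."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    meta = RobotsMetaParser()
    meta.feed(html)
    return bool(rules.can_fetch(user_agent, url)) and "noindex" in meta.directives


# The allow-all robots.txt from the question, plus a page with NOINDEX.
robots_txt = "User-Agent: *\nAllow: /\n"
page = '<html><head><meta name="robots" content="NOINDEX"></head></html>'
print(noindex_is_visible(robots_txt, page, "http://example.test/page"))
```

With the allow-all robots.txt and the NOINDEX meta in place, the check passes. If robots.txt is switched to `Disallow: /`, it fails, because a spider that is not allowed to fetch the page never gets to read the meta tag.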
