Robots.txt and robots meta

Highland

I have an odd situation. I have a CMS that has a global robots.txt which has the generic

User-Agent: *
Allow: /

I also have one CMS site that must never be indexed. I've read on various pages (like http://www.jesterwebster.com/robots-txt-vs-meta-tag-which-has-precedence/22 ) that robots.txt always wins over meta, but I have also read that robots.txt governs whether a page can be spidered, whereas the meta tag controls indexing. I just want the site to not be indexed. Can I leave the robots.txt as is and still put NOINDEX in the robots meta?

TheEspresseo

I see. Have you considered putting it behind an htpasswd?

Highland

I can control it (it's a custom piece of software) but it's not as easy a fix as adding a meta to the template.

The main problem is that we have a junk TLD we use to test some new ideas off the live server (it lets clients give us feedback), but it gets spidered and indexed and starts ranking for client sites before they're ready to live in their own TLD. This means we have to compete against ourselves (even with a 301). There's nothing sensitive there, or it would live behind a password.

TheEspresseo

Do you need to control access to the site beyond the SERPs? I would not rely on robots.txt to shield any sensitive data.

For a breakdown of robots.txt and robots meta tags, check out http://www.robotstxt.org/robotstxt.html and http://www.searchtools.com/robots/robots-meta.html/, and for a great post on using these standards in SEO, check out http://www.seomoz.org/blog/serious-robotstxt-misuse-high-impact-solutions

I am also concerned that you are unable to control your robots.txt! If your CMS doesn't let you do that and overwrites it when you change it manually, you have some major control problems on your hands that you should remedy.

fabioricotta-84038

Blocking the site in robots.txt will not guarantee that it stays out of Google's index. I think you can use the meta robots NOINDEX tag to make sure Google will not show your pages when someone searches for them.
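To illustrate the crawl side of this, here is a minimal sketch using Python's standard urllib.robotparser. It only shows that robots.txt answers "may this URL be fetched?" and nothing about indexing. The helper name can_crawl and the staging URL are made up for the example, and real search engines may interpret rules differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The global robots.txt from the question, and a blocking variant for contrast.
ALLOW_ALL = "User-Agent: *\nAllow: /\n"
BLOCK_ALL = "User-Agent: *\nDisallow: /\n"

# Hypothetical URL on the test site.
URL = "http://staging.example/page.html"

print(can_crawl(ALLOW_ALL, URL))  # crawling permitted
print(can_crawl(BLOCK_ALL, URL))  # crawling refused
```

With the Allow: / file the page can be fetched, so a crawler is able to read whatever robots meta tag is in the template. With Disallow: / it never fetches the page, so it never sees the tag at all.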

It's worth noting that Googlebot and other spiders will continue to visit your pages.
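Those visits are how the spider finds the NOINDEX in the first place. As a rough sketch of what a crawler might look for once it has fetched a page, the snippet below uses Python's standard html.parser to pull the directives out of a robots meta tag. The class and function names are invented for the example, and this is not how any particular search engine actually implements it.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma-separated and case-insensitive.
        for directive in content.split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def has_noindex(html: str) -> bool:
    """Return True if the page carries a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


page = '<html><head><meta name="robots" content="NOINDEX, FOLLOW"></head></html>'
print(has_noindex(page))
```

You could run something like this against the test site's templates to confirm the tag is really being emitted on every page.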
