Robots.txt and robots meta
-
I have an odd situation. I have a CMS that has a global robots.txt which has the generic
User-Agent: *
Allow: /
I also have one CMS site that needs to not be indexed ever. I've read in various pages (like http://www.jesterwebster.com/robots-txt-vs-meta-tag-which-has-precedence/22 ) that robots.txt always wins over meta, but I have also read that robots.txt indicates spiderability whereas meta can control indexation. I just want the site to not be indexed. Can I leave the robots.txt as is and still put NOINDEX in the robots meta?
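For anyone wanting to check what that global robots.txt actually permits, here is a minimal sketch using Python's standard urllib.robotparser. The helper name is_crawlable and the example.com URL are made up for illustration, and the standard-library parser only approximates how any given search engine reads the file.

```python
from urllib.robotparser import RobotFileParser

# The global robots.txt quoted in the question.
ROBOTS_TXT = """\
User-Agent: *
Allow: /
"""


def is_crawlable(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url.

    Hypothetical helper for illustration; it parses the text directly
    instead of downloading /robots.txt from a live site.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com stands in for the test domain (an assumption).
    page = "http://example.com/some/page.html"
    print(is_crawlable(ROBOTS_TXT, "Googlebot", page))
    # A blanket Disallow, by contrast, stops the crawler from fetching the page.
    print(is_crawlable("User-agent: *\nDisallow: /\n", "Googlebot", page))
```

If the first check comes back true, crawlers are free to fetch the page, which is what lets them read any robots meta tag in its head.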
-
I see. Have you considered putting it behind an htpasswd?
-
I can control it (it's a custom piece of software) but it's not as easy a fix as adding a meta to the template.
The main problem is we have a junk TLD we use to test some new ideas off the live server (lets clients give us feedback) but it gets spidered and indexed and starts ranking for client sites before they're ready to live in their own TLD. This means we have to compete against ourselves (even with a 301). There's nothing sensitive or it would live behind a password.
-
Do you need to control access to the site beyond the SERPs? I would not rely on robots.txt to shield any sensitive data.
For a breakdown of robots.txt and robots meta tags, check out: http://www.robotstxt.org/robotstxt.html and http://www.searchtools.com/robots/robots-meta.html/, and for a great post on using these standards in SEO check out: http://www.seomoz.org/blog/serious-robotstxt-misuse-high-impact-solutions
I am also concerned that you are unable to control your robots.txt! If your CMS doesn't let you do that and overwrites it when you change it manually, you have some major control problems on your hands that you should remedy.
-
Blocking the site in robots.txt will not guarantee that it stays out of Google's index. I think you can use the meta robots NOINDEX tag to keep Google from showing your pages when someone searches for them.
It is important to say that Googlebot and other spiders will continue to visit your pages. They have to fetch a page to see its NOINDEX tag, so leave robots.txt as it is and do not block them there.
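If you want to confirm the template change actually shipped, a rough sketch like the one below scans a page's HTML for a robots meta tag containing noindex. It uses only Python's standard html.parser; the names RobotsMetaParser and has_noindex are invented for this example, and it checks only the generic "robots" meta name, not crawler-specific ones.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # HTMLParser lowercases attribute names; values keep their case.
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() != "robots":
            return
        for token in attr_map.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def has_noindex(html: str) -> bool:
    """Return True if the HTML carries a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


if __name__ == "__main__":
    sample = '<html><head><meta name="robots" content="NOINDEX, follow"></head></html>'
    print(has_noindex(sample))
```

Run it against the rendered HTML of a few pages on the test domain; any page where it reports no noindex directive is still eligible for indexing.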