Robots.txt and robots meta
-
I have an odd situation. I have a CMS that has a global robots.txt which has the generic
User-Agent: *
Allow: /
I also have one CMS site that needs to not be indexed ever. I've read in various pages (like http://www.jesterwebster.com/robots-txt-vs-meta-tag-which-has-precedence/22 ) that robots.txt always wins over meta, but I have also read that robots.txt indicates spiderability whereas meta can control indexation. I just want the site to not be indexed. Can I leave the robots.txt as is and still put NOINDEX in the robots meta?
-
I see. Have you considered putting it behind an htpasswd?
-
I can control it (it's a custom piece of software) but it's not as easy a fix as adding a meta to the template.
The main problem is that we have a junk TLD we use to test new ideas off the live server (it lets clients give us feedback), but it gets spidered and indexed and starts ranking for client sites before they're ready to live on their own TLD. This means we have to compete against ourselves (even with a 301). There's nothing sensitive there, or it would live behind a password.
-
Do you need to control access to the site beyond the SERPs? I would not rely on robots.txt to shield any sensitive data.
For a breakdown of robots.txt and robots meta tags, check out http://www.robotstxt.org/robotstxt.html and http://www.searchtools.com/robots/robots-meta.html/, and for a great post on using these standards in SEO, check out http://www.seomoz.org/blog/serious-robotstxt-misuse-high-impact-solutions
I am also concerned that you are unable to control your robots.txt! If your CMS doesn't let you do that and overwrites it when you change it manually, you have some major control problems on your hands that you should remedy.
-
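Blocking the site in robots.txt will not guarantee that it stays out of Google's index: a disallowed URL can still be indexed from links pointing to it, even though its content is never crawled. I think you can use meta robots NOINDEX to make sure Google will not show your pages when someone searches for them. For that to work, the crawler has to be able to fetch the page and read the tag, so leaving your robots.txt at Allow: / is what you want here.
Note that Googlebot and other spiders will continue to visit your pages; NOINDEX only keeps them out of the results.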
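To check a setup like this before relying on it, a small script can confirm two things: that robots.txt still lets a crawler fetch the page, and that the page carries a robots meta tag with noindex. The following is only a rough sketch using Python's standard library. The URL, the sample HTML, and the helper names (crawl_allowed, has_noindex, RobotsMetaParser) are made up for illustration, and it says nothing about how any particular search engine behaves.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# The global robots.txt from the question.
ROBOTS_TXT = """\
User-Agent: *
Allow: /
"""

# Hypothetical template output for the site that must stay out of the index.
PAGE_HTML = """\
<html><head>
<meta name="robots" content="noindex">
<title>Staging page</title>
</head><body>Hello</body></html>
"""


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" content="...">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for directive in attr.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def crawl_allowed(robots_txt, url, agent="Googlebot"):
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


def has_noindex(html):
    """Return True if the page has a robots meta tag containing noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


if __name__ == "__main__":
    url = "http://staging.example/page.html"  # placeholder URL
    print("crawlable:", crawl_allowed(ROBOTS_TXT, url))
    print("noindex present:", has_noindex(PAGE_HTML))
```

Both checks should come back true for the combination discussed in this thread: the page is crawlable, so the spider can actually see the NOINDEX.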