Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Invisible robots.txt?
- 
 So here's a weird one... A client came to me for some simple changes, and it turns out the site has some major issues. One of them is that none of the correct content pages are showing up in Google, just ancillary (outdated) ones. Even the main homepage isn't showing up for a "site:domain.com" search. So I added the site to Webmaster Tools and, after an hour or so, got the red bar of doom: "robots.txt is blocking important pages." I checked it in Webmaster Tools and, sure enough, it's a "User-agent: * Disallow: /". ACK! But wait... there's no robots.txt to be found on the server. I can go to domain.com/robots.txt and see it, but there's nothing via FTP. I uploaded a new one and, thankfully, that is now showing, but I've never seen that before. Question is: can a robots.txt file be stored in a way that can't be seen? Thanks! 
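 For anyone who wants to confirm what a blanket rule like this does, here is a minimal sketch using Python's standard `urllib.robotparser`. The rule text is reconstructed from the post (the real file was never shown), `domain.com` is the placeholder from the question, and `is_blocked` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Rules as described in the post (reconstructed, not copied from the real site).
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""


def is_blocked(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` is not allowed to fetch `url` under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in ("http://domain.com/", "http://domain.com/some-page"):
        print(url, "blocked:", is_blocked(BLOCK_ALL, url))
```

 Parsing the text locally keeps the check independent of whatever the server happens to be returning at the moment.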
- 
 Hi Josh, did you ever find out how this was happening? 
 I've got the same issue with a WordPress site: no robots.txt is visible in FTP, but it is accessible to view in a browser.
- 
 I'm seeing the meta tag that's added for the first option: <meta name="robots" content="index, follow" /> ... but I could actually access a file at domain.com/robots.txt that had the content mentioned above. When I logged in via FTP, it wasn't there. I added an actual file there with the correct information and reloaded it to make sure it was showing the correct information. I tested it on my local install and I'm not seeing a robots file being generated. Very odd! 
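 If it helps to check the meta tag separately from the robots.txt response, a rough sketch with Python's standard `html.parser` is below. The sample HTML is just the tag quoted above wrapped in a `<head>`; `RobotsMetaParser` and `robots_directives` are made-up names for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> are routed here as well.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        self.directives.extend(
            part.strip().lower() for part in content.split(",") if part.strip()
        )


def robots_directives(html: str) -> list:
    """Return the robots meta directives found in `html`, lowercased."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    sample = '<head><meta name="robots" content="index, follow" /></head>'
    print(robots_directives(sample))
```

 A page can carry "index, follow" in its meta tag and still be blocked by robots.txt, so the two need to be checked independently.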
- 
 Yes, you probably answered your own question. In WordPress, there are two different settings under Settings > Privacy:
- I would like my site visible to everyone, including search engines and archivers.
- I would like to block search engines, but allow normal visitors.
 If option #2 was selected, WordPress doesn't create a physical robots.txt file on disk for you. Instead, it serves a virtual one with "Disallow: /" whenever domain.com/robots.txt is requested, and it also automatically generates a robots meta tag on every single page. That is why the file shows up in a browser but not via FTP. I hope that helps! 
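 To illustrate how a robots.txt can exist as a response without existing as a file, here is a small sketch. It is not WordPress's code: `virtual_robots_txt`, `serve`, and the `site_public` flag are hypothetical names, and it assumes the "block search engines" option maps to a blanket Disallow, which matches what was seen in this thread. It also assumes a physical file takes precedence over the generated one, which would explain why uploading a real robots.txt fixed the problem.

```python
from typing import Dict, Optional


def virtual_robots_txt(site_public: bool) -> str:
    """Build a robots.txt body from a privacy setting (assumed behavior)."""
    lines = ["User-agent: *"]
    # An empty Disallow allows everything; "Disallow: /" blocks the whole site.
    lines.append("Disallow:" if site_public else "Disallow: /")
    return "\n".join(lines) + "\n"


def serve(path: str, files_on_disk: Dict[str, str], site_public: bool) -> Optional[str]:
    """Return the body for `path`, or None if nothing answers the request."""
    # A real file on disk is served as-is and never reaches the application.
    if path in files_on_disk:
        return files_on_disk[path]
    # No file on disk: the application generates robots.txt on the fly.
    if path == "/robots.txt":
        return virtual_robots_txt(site_public)
    return None


if __name__ == "__main__":
    # Nothing uploaded: the response is generated from the setting.
    print(serve("/robots.txt", {}, site_public=False))
    # After uploading a real file, that file is what visitors see.
    uploaded = {"/robots.txt": "User-agent: *\nDisallow:\n"}
    print(serve("/robots.txt", uploaded, site_public=False))
```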
- 
 Just make sure you don't enable that Privacy setting on a live site. It can take weeks or months to fully recover. 
- 
 This is interesting. I'm currently working on a robots.txt file and testing it for different purposes. I was also planning to run some tests on WordPress websites, so thanks for the update. I'll keep that in mind before I actually start testing. Thanks! 
- 
 I should mention that this is a WordPress site and, with that, I may have answered my own question. Perhaps WordPress generates a robots.txt dynamically when the setting is active at Settings > Privacy? 
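 One way to test that hypothesis is to compare what is on disk with what the site actually serves. The sketch below assumes you have a local copy of the document root (for example, pulled down over FTP) and have already fetched the body of domain.com/robots.txt by some other means; `classify_robots` and its return labels are invented for this example.

```python
from pathlib import Path
from typing import Optional


def classify_robots(docroot: Path, served_body: Optional[str]) -> str:
    """Classify robots.txt by comparing the disk with the HTTP response.

    `served_body` is the text returned for /robots.txt, or None if the
    request did not return a file (for example, a 404).
    """
    on_disk = (docroot / "robots.txt").is_file()
    if served_body is None:
        return "unserved" if on_disk else "absent"
    # A response with no file behind it must be generated by the application.
    return "physical" if on_disk else "virtual"


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # The situation from the question: visible in a browser, not in FTP.
        print(classify_robots(root, "User-agent: *\nDisallow: /\n"))
        # After uploading a real file.
        (root / "robots.txt").write_text("User-agent: *\nDisallow:\n")
        print(classify_robots(root, "User-agent: *\nDisallow:\n"))
```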
Related Questions
- 
Disallow wildcard match in Robots.txt
 This is in my robots.txt file. Does anyone know what this is supposed to accomplish? It doesn't appear to be blocking URLs with question marks:
 Disallow: /?crawler=1
 Disallow: /?mobile=1
 Thank you
 Technical SEO | AmandaBridge
- 
Multiple robots.txt files on server
 Hi! I previously hired a developer to put up my site and noticed afterwards that he did not know much about SEO. This led me to start learning myself and applying some changes step by step. One of the things I am currently doing is inserting a sitemap reference in the robots.txt file (which was not there before). But just now, when I wanted to upload the file via FTP to my server, I found multiple ones, in different sizes, and I don't know what to do with them. Can I remove them? I have downloaded and opened them, and they seem to be 2 text files and 2 duplicates. Names:
 robots.txt (original duplicate)
 robots.txt-Original (original)
 robots.txt-NEW (other content)
 robots.txt-Working (other content duplicate)
 Would really appreciate help and expert suggestions. Thanks!
 Technical SEO | mjukhud
- 
Google indexing despite robots.txt block
 Hi This subdomain has about 4'000 URLs indexed in Google, although it's blocked via robots.txt: https://www.google.com/search?safe=off&q=site%3Awww1.swisscom.ch&oq=site%3Awww1.swisscom.ch This has been the case for almost a year now, and it does not look like Google tends to respect the blocking in http://www1.swisscom.ch/robots.txt Any clues why this is or what I could do to resolve it? Thanks!
 Technical SEO | zeepartner
- 
Guys & Gals anyone know if urllist.txt is still used?
 I'm using a tool which generates urllist.txt, and looking on the SEO forums it seems that Yahoo used to use this. What I'd like to know is whether it is still used anywhere and whether we should have it on the site.
 Technical SEO | danwebman
- 
Are robots.txt wildcards still valid? If so, what is the proper syntax for setting this up?
 I've got several URLs that I need to disallow in my robots.txt file. For example, I've got several documents that I don't want indexed and filters that are getting flagged as duplicate content. Rather than typing in thousands of URLs, I was hoping that wildcards were still valid.
 Technical SEO | mkhGT
- 
Internal search : rel=canonical vs noindex vs robots.txt
 Hi everyone, I have a website with a lot of internal search results pages indexed. I'm not asking whether they should be indexed; I know they should not, according to Google's guidelines. They also create a bunch of duplicate pages, so I want to solve this problem. The thing is, if I noindex them, the site is going to lose a non-negligible chunk of traffic: nearly 13% according to Google Analytics! I thought of blocking them in robots.txt. This solution would not keep them out of the index, but the pages appearing in Google SERPs would then look empty (no title, no description), so their CTR would plummet and I would lose a bit of traffic too... The last idea I had was to use a rel=canonical tag pointing to the original search page (which is empty, without results), but it would probably have the same effect as noindexing them, wouldn't it? (I've never tried it, so I'm not sure.) Of course I did some research on the subject, but each of my findings recommended only one of the 3 methods! One even recommended noindex plus a robots.txt block, which is stupid because the noindex would then be useless... Is there somebody who can tell me which option is the best to keep this traffic? Thanks a million
 Technical SEO | JohannCR
- 
Can I Disallow Faceted Nav URLs - Robots.txt
 I have been disallowing /*? So I know that works without affecting crawling. I am wondering if I can disallow the faceted nav urls. So disallow: /category.html/? /category2.html/? /category3.html/*? To prevent the price faceted url from being cached:
 /category.html?price=1%2C1000
 and
 /category.html?price=1%2C1000&product_material=88
 Thanks!
 Technical SEO | tylerfraser
- 
Should I set up a disallow in the robots.txt for catalog search results?
 When the crawl diagnostics came back for my site, it was showing around 3,000 pages of duplicate content. Almost all of them are catalog search results pages. I also did a site search on Google, and they have most of the results pages in their index too. I think I should just disallow the bots in the /catalogsearch/ subfolder, but I'm not sure if this will have any negative effect.
 Technical SEO | JordanJudson