Google Indexing Development Site Despite Robots.txt Block
-
Hi,
A development site that has been set up has the following robots.txt file:
User-agent: *
Disallow: /
This was meant to stop Google from indexing the site. However, it hasn't worked, and the development site has since been indexed.
Any clues why this is or what I could do to resolve it?
Thanks!
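As a quick sanity check, the two rules quoted above can be fed to Python's standard-library robots.txt parser. This is only a sketch: `dev.example.com` is a placeholder host, not the real site, and the check only shows how a well-behaved crawler reads the file. It says nothing about whether a URL can still be indexed from links elsewhere.

```python
from urllib.robotparser import RobotFileParser

# The exact rules from the question.
rules = ["User-agent: *", "Disallow: /"]

parser = RobotFileParser()
parser.parse(rules)


def may_crawl(user_agent, url):
    """Return True if the parsed rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


# dev.example.com is a placeholder host used for illustration only.
print(may_crawl("Googlebot", "http://dev.example.com/"))          # False
print(may_crawl("Googlebot", "http://dev.example.com/any/page"))  # False
```

So the file itself is doing what it says: it asks crawlers not to fetch any page. That is a crawl rule, not an indexing rule.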
-
Hi, I'm assuming you're on IIS. I'm no expert on IIS (I think you will need to configure the web.config), so I'm going to step back now and get my coat, as I only have experience with Apache.
-
Thanks for your help! Much appreciated
-
It's generally best to noindex/nofollow using the meta robots tag in the page's <head>. If it's not too much of a stretch for you, you can also password-protect the test site. The ever-so-lovely and charming Google will still display URLs blocked by robots.txt in its results, though it generally won't cache the content. If you would like, you can hook up the test site with Webmaster Tools and remove the URL(s) from the index.
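To confirm that a page really carries the noindex signal, a small check like the one below may help. It is a rough sketch in Python rather than anything .NET-specific, and `RobotsMetaParser` and `is_noindexed` are made-up names for illustration. It looks for a `<meta name="robots">` tag and for the `X-Robots-Tag` response header, and it ignores per-crawler forms such as `googlebot: noindex`. Bear in mind that a crawler has to be able to fetch the page to see either signal, so a blanket `Disallow: /` would hide them.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for part in attr.get("content", "").split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def is_noindexed(html, headers=None):
    """Return True if the meta robots tag or X-Robots-Tag header asks for noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    directives = set(parser.directives)
    for name, value in (headers or {}).items():
        if name.lower() == "x-robots-tag":
            directives.update(p.strip().lower() for p in value.split(","))
    # "none" is shorthand for "noindex, nofollow".
    return "noindex" in directives or "none" in directives


page = '<html><head><meta name="robots" content="noindex, nofollow"></head><body>dev</body></html>'
print(is_noindexed(page))                                                      # True
print(is_noindexed("<html><head></head></html>"))                              # False
print(is_noindexed("<html><head></head></html>", {"X-Robots-Tag": "noindex"}))  # True
```

The header route is handy for non-HTML files such as PDFs, where there is no <head> to put a meta tag in.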
-
It's my understanding that .htaccess is PHP-based, and as we code in .NET we don't have an .htaccess file.
Do you know of this happening before? It's not something that I've heard of.
-
You would need to block access via .htaccess rather than the robots file, as robots.txt is only advisory.
If you are using WordPress, I use this simple plugin: JF3 Maintenance Redirect.
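The difference between advisory and enforced blocking can be sketched in a few lines. The example below is only an illustration in Python (a WSGI wrapper), not an .htaccess or web.config recipe, and `require_basic_auth`, `dev_site`, and the credentials are all hypothetical. The point is that the server refuses every request without valid HTTP Basic credentials, so a crawler never receives the content, whatever it decides about robots.txt.

```python
import base64
import hmac


def require_basic_auth(app, username, password, realm="Development site"):
    """Wrap a WSGI app so every request needs valid HTTP Basic credentials."""
    expected = f"{username}:{password}".encode("utf-8")

    def wrapped(environ, start_response):
        header = environ.get("HTTP_AUTHORIZATION", "")
        scheme, _, encoded = header.partition(" ")
        supplied = b""
        if scheme.lower() == "basic":
            try:
                supplied = base64.b64decode(encoded.strip(), validate=True)
            except ValueError:
                supplied = b""  # malformed base64 is treated as no credentials
        # Constant-time comparison of "user:password" byte strings.
        if hmac.compare_digest(supplied, expected):
            return app(environ, start_response)
        start_response("401 Unauthorized", [
            ("WWW-Authenticate", f'Basic realm="{realm}"'),
            ("Content-Type", "text/plain; charset=utf-8"),
        ])
        return [b"Authentication required.\n"]

    return wrapped


def dev_site(environ, start_response):
    """Stand-in for the real development site."""
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"Development build\n"]


# Placeholder credentials, for illustration only.
protected = require_basic_auth(dev_site, "dev", "s3cret")
```

Basic credentials are only base64-encoded, not encrypted, so a setup like this is normally served over HTTPS.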