Google Indexing Development Site Despite Robots.txt Block
-
Hi,
A development site that we have set up has the following robots.txt file:
User-agent: *
Disallow: /
This was an attempt to stop Google from indexing the site. However, it hasn't worked, and the development site has since been indexed.
Any clues why this is or what I could do to resolve it?
Thanks!
-
Hi, so I'm assuming you're on IIS. I'm no expert on IIS, but I think you will need to configure the web.config. I'm just going to step back now and get my coat, as I only have experience with Apache.
-
Thanks for your help! Much appreciated
-
It's generally best to noindex/nofollow using the meta robots tag in the page's <head>. Bear in mind that Google has to be able to crawl a page to see that tag, so the robots.txt Disallow needs to be lifted for it to take effect. If it's not too much of a stretch for you, you can also password protect the test site. The ever-so-lovely and charming Google will still display URLs blocked by robots.txt, though it generally won't cache the content. If you would like, you can hook up the test site with Webmaster Tools and remove the URL(s) from the index.
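To double-check that a dev page really carries the noindex signal described above, something along these lines might help. It is only a rough sketch in Python: the function name `is_noindexed`, the class `RobotsMetaParser`, and the sample page are made up for illustration, and it only looks at the generic `robots` meta tag and the `X-Robots-Tag` response header (it also treats `none` as noindex, since that value is shorthand for noindex, nofollow).

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            for token in attr_map.get("content", "").split(","):
                self.directives.add(token.strip().lower())


def is_noindexed(html, headers=None):
    """Return True if the HTML or the response headers ask for no indexing."""
    parser = RobotsMetaParser()
    parser.feed(html)

    # Header names are case-insensitive, so compare them in lower case.
    header_value = ""
    for name, value in (headers or {}).items():
        if name.lower() == "x-robots-tag":
            header_value = value
    header_directives = {t.strip().lower() for t in header_value.split(",")}

    found = parser.directives | header_directives
    return "noindex" in found or "none" in found


# Hypothetical dev-site page used only for this example.
page = (
    '<html><head><meta name="robots" content="noindex, nofollow"></head>'
    "<body>Dev site</body></html>"
)
print(is_noindexed(page))
```

On a .NET/IIS site the same signal can be sent as an `X-Robots-Tag` response header instead of a tag in each page, which is why the sketch accepts a headers dictionary as well.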
-
It's my understanding that .htaccess is PHP-based, and as we code in .NET we don't have an .htaccess file.
Do you know of this happening before? It's not something that I've heard of.
-
You would need to block access via .htaccess rather than the robots.txt file, as robots.txt is only advisory: it asks crawlers not to fetch pages, but it doesn't stop a URL from being indexed.
If you are using WordPress, I use this simple plugin: JF3 Maintenance Redirect.
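To illustrate the "only advisory" point, here is a small Python sketch using the standard library's `urllib.robotparser` with the exact rules from the question. The host `dev.example.com` and the helper name `may_crawl` are placeholders, not anything from the original site.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt from the original question.
rules = RobotFileParser()
rules.parse([
    "User-agent: *",
    "Disallow: /",
])


def may_crawl(user_agent, url):
    """Return True if the rules above allow this user agent to fetch the URL."""
    return rules.can_fetch(user_agent, url)


# dev.example.com is a hypothetical stand-in for the development site.
print(may_crawl("Googlebot", "http://dev.example.com/"))
print(may_crawl("Googlebot", "http://dev.example.com/any/page.html"))
```

A negative answer here only means a well-behaved crawler will not fetch the page. It says nothing about the index: if the URL is linked from somewhere else, it can still show up in results without a cached copy, which matches what happened to the development site.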