Site being indexed by Google before it has launched

Sayers

We are nearing the end of a site migration and are in the final stage of testing redirects. However, to our horror, we've just discovered that Google has started indexing the new site. Any ideas on how this could have happened? I recently asked for robots.txt to exclude any URL containing a certain parameter. Is there a chance that this, wrongly implemented, could have caused it?

KeriMorgret

Duplicate question. Closing this one so all answers can be given at http://www.seomoz.org/q/site-being-indexed-by-google-before-it-has-launched-2

HiveDigitalInc

There are many ways. Google discovers URLs through a large number of methods, primarily through links, but I have seen some pretty amazing routes to discovery:

Links posted in emails that ended up on the web (like a private newsletter with a public archive)
Links showing up in clickstream data services like Alexa
Links showing up in "recently registered" domain lists

The rule of thumb is to always, ALWAYS start with a robots.txt. It is the first thing you should set up in a dev environment.
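As a rough illustration of that rule of thumb, here is a minimal sketch that checks whether a robots.txt would let Googlebot crawl a given URL, using Python's standard-library `urllib.robotparser`. The two robots.txt bodies, the `staging.example.com` host, and the `is_crawlable` helper are all assumptions made up for the example, not anything taken from the site in question. It also hints at how a rule meant to exclude only certain URLs leaves everything else open.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a dev/staging site: block every crawler everywhere.
STAGING_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

# Hypothetical robots.txt that excludes only one path prefix.
PARTIAL_ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical staging URLs, used only for illustration.
    for url in (
        "https://staging.example.com/",
        "https://staging.example.com/products",
        "https://staging.example.com/search?q=shoes",
    ):
        print(url)
        print("  block-all rules:", is_crawlable(STAGING_ROBOTS_TXT, url))
        print("  partial rules:  ", is_crawlable(PARTIAL_ROBOTS_TXT, url))
```

Note that the standard-library parser does plain prefix matching and does not implement the wildcard extensions Google supports, so a parameter-based pattern may need to be checked with Google's own robots.txt testing tools instead.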
