Development site accidentally crawled - Will this cause problems?
-
We are currently developing a new version of our website, and to make it easy for all team members to access, we set it up on a server reachable via a public domain name (e.g., devsite.com). There has been no SEO and no links created to this site, or so I thought.
Recently, I found out that Google somehow found its way to this development site and has been indexing the pages! I was a little alarmed, as there are no links to the domain and we'll soon be transitioning all the content over to our primary production domain.
I immediately created a robots.txt file to disallow access to the entire development domain. My fear is that there may be some duplicate content penalty if Google sees that the content that is on our new site (once it goes live and is pushed to our REAL domain name) was previously indexed on our test domain.
We're slated to launch in 2-3 weeks. Is there anything else I should do? Should I even be worried? I'm probably a bit paranoid, but given the amount of time and effort that has gone into this new site, I'd love any advice or thoughts.
Thank You!
-
Great Answer, thanks Phil! One follow-up question:
In my robots.txt for the development site, I have the following:
User-agent: *
Disallow: /
Is this the correct configuration for the robots.txt file to accomplish what I want, namely stopping the entire site from being crawled and removing it from the existing index? Or should I be configuring it differently?
Also, good tip on Webmaster Tools. I'll be requesting removal there as well.
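One way to sanity-check those two rules locally is Python's standard `urllib.robotparser`. This is only a sketch: devsite.com is the placeholder domain from the question, and the page paths are made up for illustration. It shows whether a rule-abiding crawler may fetch a URL; it says nothing about what is already in an index.

```python
from urllib.robotparser import RobotFileParser

# The two rules from the robots.txt quoted above.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /",
]


def is_crawl_allowed(user_agent, url, robots_lines=ROBOTS_LINES):
    """Return True if robots_lines permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse the rules directly, no network request
    return parser.can_fetch(user_agent, url)


# Hypothetical URLs on the placeholder dev domain.
print(is_crawl_allowed("Googlebot", "http://devsite.com/"))            # False
print(is_crawl_allowed("Googlebot", "http://devsite.com/about.html"))  # False
```

Both checks come back False, meaning a crawler that follows robots.txt is told to stay out of every path on the domain.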
-
I don't even worry about that anymore. I let Google see me build out a site anyway. I used to worry about that, but not anymore.
"I was a little alarmed, as there are no links to the domain and we'll soon be transitioning all the content over to our primary production domain."
They probably came to the server and hit every site on it.
-
Setting up a robots.txt file to block crawlers from the dev site was a correct response. You can also add a noindex, nofollow meta tag to the dev site's pages, though keep in mind that a crawler has to be able to fetch a page in order to see that tag.
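To confirm that a page really carries the noindex, nofollow meta tag, a small check along these lines may help. It is a sketch using Python's standard `html.parser`; the class name, function name, and sample page are made up for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        for part in attr_map.get("content", "").split(","):
            directive = part.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html):
    """Return the set of robots meta directives present in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical dev-site page used only as sample input.
page = (
    '<html><head><meta name="robots" content="noindex, nofollow"></head>'
    "<body>Dev site</body></html>"
)
print(sorted(robots_directives(page)))  # ['nofollow', 'noindex']
```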
Another step you can take is to set up a Google Webmaster Tools account for the dev site and request removal there as well.
Some dev sites are placed behind a firewall or require a sign-on to access; this can keep Google out as well.
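As a rough illustration of the sign-on approach, the sketch below checks an HTTP Basic `Authorization` header before a page would be served. The user name, password, and function names are hypothetical, and in practice this is usually configured in the web server rather than written by hand.

```python
import base64
import hmac

# Hypothetical credentials, for illustration only.
DEV_USER = "team"
DEV_PASSWORD = "change-me"


def is_authorized(auth_header):
    """Return True if a Basic Authorization header matches the dev credentials."""
    if not auth_header or not auth_header.startswith("Basic "):
        return False
    try:
        raw = base64.b64decode(auth_header[len("Basic "):], validate=True)
        decoded = raw.decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        return False  # malformed base64 or non-text payload
    user, sep, password = decoded.partition(":")
    if not sep:
        return False
    # compare_digest avoids leaking information through comparison timing.
    user_ok = hmac.compare_digest(user.encode(), DEV_USER.encode())
    pass_ok = hmac.compare_digest(password.encode(), DEV_PASSWORD.encode())
    return user_ok and pass_ok


def response_status(auth_header):
    """Return 200 for a valid sign-on, 401 for anything else (crawlers included)."""
    return 200 if is_authorized(auth_header) else 401


good_header = "Basic " + base64.b64encode(b"team:change-me").decode("ascii")
print(response_status(good_header))  # 200
print(response_status(None))         # 401
```

A crawler that arrives without credentials only ever gets the 401 response, so there is no page content for it to index.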
The risk you run is essentially creating an entire duplicate of your current website. Google will always try to crawl everything it can on the net, regardless of noindex tags. Noindex simply means "please don't place this in your index." It is important to remember that there are other search engines out there besides Google (Bing/Yahoo, Ask, Blekko, etc.), and not all of them automatically honor the noindex, nofollow tag. So any secure pages or documents should be just that - secured.
If those pages are no longer in the index and are not secure or confidential in nature, I wouldn't worry too much.
- Phil G