Development site accidentally crawled - Will this cause problems?
-
We are currently developing a new version of our website, and to make it easy for all team members to access, we set it up on a server reachable via a public domain name (e.g., devsite.com). No SEO has been done and no links have been created to this site, or so I thought.
Recently, I found out that Google somehow found its way to this development site and has been indexing the pages! I was a little alarmed, as there are no links to the domain and we'll soon be transitioning all the content over to our primary production domain.
I immediately created a robots.txt file to disallow access to the entire development domain. My fear is that there may be some duplicate content penalty if Google sees that the content that is on our new site (once it goes live and is pushed to our REAL domain name) was previously indexed on our test domain.
We're slated to launch in 2-3 weeks. Is there anything else I should do? Should I even be worried? I'm probably a bit paranoid, but given the amount of time and effort that has gone into this new site, I'd love any advice or thoughts.
Thank You!
-
Great Answer, thanks Phil! One follow-up question:
In my robots.txt for the development site, I have the following:
User-agent: *
Disallow: /
Is this the correct configuration for the robots.txt file to accomplish what I want, namely keeping the entire site from being crawled and removing it from the existing index? Or should I be configuring it differently?
Also, good tip on Webmaster Tools. I'll request removal there as well.
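One way to sanity-check those two lines is Python's standard-library robots.txt parser. This is only a sketch: `devsite.com` is the placeholder domain from the question, and `is_blocked` is a hypothetical helper name. It shows how a rule-following crawler would read the file; it says nothing about pages that are already in an index, since a Disallow rule governs crawling rather than removal.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content quoted in the post above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler that obeys robots.txt may NOT fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # devsite.com is a placeholder, not a real deployment.
    for agent in ("Googlebot", "bingbot"):
        blocked = is_blocked(ROBOTS_TXT, agent, "http://devsite.com/any/page.html")
        print(f"{agent}: blocked={blocked}")
```

If the helper reports that a path is not blocked, the usual culprits are a typo in the directive or the file not being served from the root of the dev domain.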
-
I don't even worry about that anymore. I let Google see me build out a site anyway. I used to worry about that, but not anymore.
"I was a little alarmed, as there are no links to the domain and we'll soon be transitioning all the content over to our primary production domain."
They probably came to the server and hit every site on it.
-
Setting up a robots.txt file that disallows crawling of the dev site was a correct response. You can also add a noindex, nofollow meta tag to the dev site's pages.
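As a rough illustration of the meta tag approach, the sketch below uses Python's built-in HTML parser to confirm that a page carries a robots meta tag. The sample page and the names `RobotsMetaFinder` and `robots_directives` are made up for the example. Keep in mind that a crawler has to be able to fetch a page to read its meta tag, so the tag will not be seen on pages that robots.txt blocks from crawling.

```python
from html.parser import HTMLParser

# Hypothetical dev-site page carrying the robots meta tag.
SAMPLE_PAGE = """\
<html>
  <head>
    <meta name="robots" content="noindex, nofollow">
    <title>Dev site</title>
  </head>
  <body>Work in progress</body>
</html>
"""


class RobotsMetaFinder(HTMLParser):
    """Collects the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            for part in attr_map.get("content", "").split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives found in the HTML."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.directives


if __name__ == "__main__":
    print(sorted(robots_directives(SAMPLE_PAGE)))
```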
Another step you can take is to set up a Google Webmaster Tools account for the dev site and request removal there as well.
Some dev sites are placed behind a firewall or require a sign-on to access; this can block Google as well.
The risk you run is essentially creating an entire duplicate of your current website. Google will always try to crawl everything it can on the net, regardless of noindex tags. Noindex simply means "please don't place this in your index." It is important to remember that there are other search engines out there besides Google (Bing/Yahoo, Ask, Blekko, etc.), and not all of them automatically honor the noindex, nofollow tag. So any secure pages or documents should be just that: secured.
If those pages are no longer in the index, and are not sensitive or confidential in nature, I wouldn't worry too much.
- Phil G
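The sign-on idea from the answer above can be sketched with Python's standard library. This is a minimal illustration, not a production setup: the credentials, port, and handler name are hypothetical, and in practice you would more likely configure authentication in your existing web server. It requires HTTP Basic authentication for every request and also sends an `X-Robots-Tag: noindex, nofollow` header on pages it does serve.

```python
import base64
import hmac
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical credentials for illustration only; load real ones from config.
DEV_USER = "team"
DEV_PASSWORD = "change-me"


def is_authorized(auth_header, user=DEV_USER, password=DEV_PASSWORD):
    """Return True if the Authorization header holds the expected Basic credentials."""
    if not auth_header or not auth_header.startswith("Basic "):
        return False
    try:
        raw = base64.b64decode(auth_header[len("Basic "):], validate=True)
        decoded = raw.decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        return False
    expected = f"{user}:{password}"
    # Constant-time comparison to avoid leaking how much of the value matched.
    return hmac.compare_digest(decoded.encode("utf-8"), expected.encode("utf-8"))


class DevSiteHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if not is_authorized(self.headers.get("Authorization")):
            # Crawlers without credentials never get past this response.
            self.send_response(401)
            self.send_header("WWW-Authenticate", 'Basic realm="dev site"')
            self.end_headers()
            return
        body = b"<html><body>Development site</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("X-Robots-Tag", "noindex, nofollow")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    # Port 8000 on localhost is an arbitrary choice for the sketch.
    HTTPServer(("127.0.0.1", 8000), DevSiteHandler).serve_forever()
```

Unlike robots.txt or a noindex tag, which rely on the crawler choosing to cooperate, a login requirement keeps the content away from any client that lacks the credentials.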