Development site accidentally crawled - Will this cause problems?
-
We are currently developing a new version of our website, and to make it easy for all team members to access, we set it up on a server reachable through a public domain name (e.g., devsite.com). There has been no SEO and no links created to this site, or so I thought.
Recently, I found out that Google somehow found its way to this development site and has been indexing the pages! I was a little alarmed, as there are no links to the domain and we'll soon be transitioning all the content over to our primary production domain.
I immediately created a robots.txt file to disallow access to the entire development domain. My fear is that there may be some duplicate content penalty if Google sees that the content that is on our new site (once it goes live and is pushed to our REAL domain name) was previously indexed on our test domain.
We're slated to launch in 2-3 weeks. Is there anything else I should do? Should I even be worried? I'm probably a bit paranoid, but given the amount of time and effort that has gone into this new site, I'd love any advice or thoughts.
Thank You!
-
Great Answer, thanks Phil! One follow-up question:
In my robots.txt for the development site, I have the following:
User-agent: *
Disallow: /
Is this the correct configuration for the robots.txt file to accomplish what I want, namely keeping the entire site from being crawled and removing it from the existing index? Or should I be configuring it differently?
Also, good tip on Webmaster Tools. I'll be requesting removal there as well.
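One way to sanity-check the two rules quoted above is to feed them to Python's standard-library robots.txt parser and ask whether a given crawler may fetch a page. This is only a sketch: the helper name `is_crawl_allowed` and the sample URL are made up for illustration, and the result says what a well-behaved crawler should do, not what any particular search engine will do. It also only covers crawling; getting already-indexed pages out of the index is a separate step, such as the removal request mentioned above.

```python
from urllib.robotparser import RobotFileParser

# The two rules quoted in the post above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url.

    Hypothetical helper for illustration; it wraps the standard-library parser.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The user-agent names and the URL are examples only.
    for agent in ("Googlebot", "bingbot", "SomeOtherBot"):
        allowed = is_crawl_allowed(ROBOTS_TXT, agent, "http://devsite.com/any/page.html")
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Because the rule group applies to `*` and disallows `/`, every path on the host is reported as blocked for every user agent.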
-
I don't even worry about that anymore. I let Google see me build out a site anyway. I used to worry about that, but not anymore.
"I was a little alarmed, as there are no links to the domain and we'll soon be transitioning all the content over to our primary production domain."
Google's crawler probably reached the server and hit every site hosted on it.
-
Setting up a robots.txt file to block crawling of the dev site was a correct response. You can also add a noindex, nofollow meta tag to the dev site's pages.
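To confirm that a dev page actually carries the meta tag, a small check along these lines may help. It is a minimal sketch using Python's standard-library HTML parser; the class name `RobotsMetaParser`, the helper `robots_directives`, and the sample page are all invented for illustration. Keep in mind that a crawler has to be able to fetch a page in order to read its meta tags.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None when written without a value.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for directive in attr.get("content", "").split(","):
                directive = directive.strip().lower()
                if directive:
                    self.directives.add(directive)


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical dev-site page used only as sample input.
SAMPLE_PAGE = """\
<html>
  <head>
    <meta name="robots" content="noindex, nofollow">
    <title>Dev site</title>
  </head>
  <body>Work in progress</body>
</html>
"""

if __name__ == "__main__":
    found = robots_directives(SAMPLE_PAGE)
    print("noindex present:", "noindex" in found)
    print("nofollow present:", "nofollow" in found)
```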
Another step you can take is to set up a Google Webmaster Tools account for the dev site and block it there as well.
Some dev sites are placed behind a firewall or require a sign-on to access; either approach blocks Google as well.
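As a rough illustration of the sign-on approach, the sketch below puts HTTP Basic authentication in front of a tiny server built on Python's standard library. Everything specific here is an assumption made for the example: the credentials, the realm text, the port, and the names `is_authorized`, `AuthHandler`, and `run`. A real dev site would more likely turn on authentication in its existing web server and serve it over HTTPS.

```python
import base64
import hmac
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Optional

# Hypothetical credentials for illustration only; never hard-code real ones.
USERNAME = "devteam"
PASSWORD = "change-me"


def is_authorized(header: Optional[str]) -> bool:
    """Check an Authorization header value against the expected credentials."""
    if not header or not header.startswith("Basic "):
        return False
    try:
        decoded = base64.b64decode(header[len("Basic "):], validate=True).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        return False
    user, sep, password = decoded.partition(":")
    if not sep:
        return False
    # Constant-time comparisons avoid leaking how much of a guess matched.
    user_ok = hmac.compare_digest(user.encode(), USERNAME.encode())
    pass_ok = hmac.compare_digest(password.encode(), PASSWORD.encode())
    return user_ok and pass_ok


class AuthHandler(BaseHTTPRequestHandler):
    """Serves a placeholder page only to requests with valid credentials."""

    def do_GET(self):
        if not is_authorized(self.headers.get("Authorization")):
            # Crawlers and anonymous visitors get 401 and never see the content.
            self.send_response(401)
            self.send_header("WWW-Authenticate", 'Basic realm="dev site"')
            self.send_header("Content-Length", "0")
            self.end_headers()
            return
        body = b"dev site content\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run(port: int = 8000) -> None:
    """Start the demo server on localhost (blocks until interrupted)."""
    HTTPServer(("127.0.0.1", port), AuthHandler).serve_forever()
```

Calling `run()` starts the demo server; any request without the expected credentials receives a 401 response, so the page content is never exposed to a crawler.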
The risk you run is essentially creating an entire duplicate of your current website. Google will always try to crawl everything it can on the net, regardless of noindex tags; noindex simply means "please don't place this page in your index." It is important to remember that there are other search engines besides Google (Bing/Yahoo, Ask, Blekko, etc.), and not all of them automatically honor the noindex, nofollow tag. So any secure pages or documents should be just that: secured.
If those pages are no longer in the index and are not sensitive or confidential in nature, I wouldn't worry too much.
- Phil G