Dev Site Was Indexed By Google
-
Two of our dev sites (subdomains) were indexed by Google. We made them private as soon as we found the problem. Should we take another step and remove the subdomains through robots.txt, or just let it ride out?
From what I understand, to remove a subdomain from Google we would verify the subdomain in GWT, then give the subdomain its own robots.txt and disallow everything.
Any advice is welcome. I just wanted to discuss this before making a decision.
-
We ran into this in the past. One thing that we think happened is that links to the dev site were sent by email to several Gmail accounts. We think this is how Google found and indexed the site, as there were no inbound links posted anywhere.
I think the main issue is how the client perceives it, and whether they are freaking out about it. If they are, putting the site behind an access control password will keep anyone from seeing it.
The robots.txt file should flush it out, but yes, it takes a little bit of time.
-
I've had this happen before. In the dev subdomain, I added a robots.txt that excluded everything, verified the subdomain as its own site in GWT, then asked for that site (dev subdomain) to be removed.
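A minimal sketch of the "exclude everything" robots.txt described above, with a quick local check that it really blocks a crawler. The hostname `dev.example.com` and the helper name `blocks_everything` are made up for illustration; the check uses Python's standard `urllib.robotparser`, which approximates how a crawler reads the file rather than reproducing Google's own parser.

```python
from urllib.robotparser import RobotFileParser

# Contents of the robots.txt served at the root of the dev subdomain.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""


def blocks_everything(robots_txt: str, sample_url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text forbids `agent` from fetching `sample_url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, sample_url)


if __name__ == "__main__":
    # dev.example.com is a placeholder, not a real dev site.
    print(blocks_everything(BLOCK_ALL, "http://dev.example.com/any/page"))
```

Each subdomain needs its own copy of the file at its own root, since a robots.txt on the main domain does not apply to the dev subdomain.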
I then set up a free monitoring service that checks a URL for changes once a day. I pointed it at the live site's robots.txt and at the robots.txt of every dev site, so I'd know within 24 hours if the developers had tweaked any of them.
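If you would rather not rely on a third-party service, the same daily check can be roughly sketched in a few lines of Python run from a scheduler. This is an assumption-laden sketch, not the service mentioned above: the function names, the JSON state file, and any URLs you pass in are all hypothetical.

```python
import hashlib
import json
from pathlib import Path
from urllib.request import urlopen


def fingerprint(body: bytes) -> str:
    """Hash the robots.txt body so two runs can be compared cheaply."""
    return hashlib.sha256(body).hexdigest()


def changed_urls(previous: dict, current: dict) -> list:
    """Return URLs whose fingerprint is new or differs from the last run."""
    return sorted(url for url, digest in current.items() if previous.get(url) != digest)


def fetch(url: str) -> bytes:
    """Download the current contents of one robots.txt file."""
    with urlopen(url, timeout=10) as response:
        return response.read()


def check(urls, state_file: Path) -> list:
    """Fetch every URL, compare against the saved state, then save the new state."""
    previous = json.loads(state_file.read_text()) if state_file.exists() else {}
    current = {url: fingerprint(fetch(url)) for url in urls}
    state_file.write_text(json.dumps(current, indent=2))
    return changed_urls(previous, current)
```

Calling `check()` once a day with the robots.txt URLs of the live site and each dev site returns the ones that changed since the previous run. On the very first run every URL is reported, because there is no saved state to compare against.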
-
Hi Tyler,
You definitely don't want to battle yourself for duplicate content. If the current subdomains have few inbound links (little link juice), I would simply block them from being indexed any further. If a couple of pages are of high value, it may be worth the time to 301 redirect them so you don't lose any links / juice.
Robots.txt or noindex meta tags may work, but in my personal experience the easiest and most efficient way to block any indexing is simply to use .htaccess / .htpasswd. This prevents anybody without credentials from even viewing your site, effectively blocking all spiders / bots and unwanted snoopers.
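Once the password protection is in place, it is worth confirming from the outside that an anonymous visitor (and therefore a crawler) really is turned away. The sketch below assumes the protection is standard HTTP authentication, which answers unauthenticated requests with a 401 status and a `WWW-Authenticate` header; the function names are hypothetical and any URL you probe is your own.

```python
from urllib.error import HTTPError
from urllib.request import urlopen


def is_locked(status: int, headers: dict) -> bool:
    """True when the response is a 401 that asks for HTTP authentication."""
    names = {name.lower() for name in headers}
    return status == 401 and "www-authenticate" in names


def probe(url: str) -> bool:
    """Request `url` without credentials and report whether it is locked."""
    try:
        with urlopen(url, timeout=10) as response:
            return is_locked(response.status, dict(response.headers.items()))
    except HTTPError as error:
        # urlopen raises HTTPError for 4xx/5xx answers, including the 401 we hope for.
        return is_locked(error.code, dict(error.headers.items()))
```

If `probe()` returns False for a dev URL, the page is either being served openly or is refusing access in some other way (a 403, a redirect to a login form), so check it by hand.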
-
Hey Tyler,
We would follow the same protocol if we were in your shoes. Remove any instance of the indexed dev subdomain(s), then create a new robots.txt file for each subdomain and disavow any indexed content/links as an extra step. Also, double-check and even resubmit your root domain's XML sitemap so Google can reindex your main content/links as a precautionary measure.
PS - We develop on a separate server and domain for any new work on our site or any client sites. Doing this allows us to block Google from everything.
Hope this was helpful! - Patrick