Google indexing staging / development site that is redirected...
-
Hi Moz Fans! - Please help.
We had a staging site at acme.stagingdomain.com while a site was in development. When the site went live, the staging URL was redirected (302) to acmeprofessionalservices.com (real names redacted!!).
There are no known external links to the staging site,
although the staging site URL has been emailed from Google Apps(!!!).
We have now found that the staging site is in the index even though it redirects to the proper public site,
and some (but not all) of its pages are in the index too. They all redirect to the proper public site when visited.
It is convenient for the team to have a redirect from the staging site to the new one, since Chrome etc. remember frequently visited sites. It would be a shame to lose that.
Yes, these pages can be removed using webmaster tools.
But how did they get in the index to start with? And if we're building a new site for a customer who has an existing site, is there a danger of duplicate content etc. penalties caused by the staging site?
We had a similar incident recently when a PDF that was not linked anywhere on the site appeared in the index. The link had been emailed through Google Apps, and visited in Chrome, but that was it.
So 3 questions.
1. Why is the staging site still in the index despite the redirects?
2. How did the pages get in the index in the first place?
3. Will the new staging site affect the rank of the existing site, e.g. through duplicate content penalties?
-
Hi There
1. It could still be in the index because the redirects are 302s and not 301s. A 302 is temporary, so Google may keep the original URLs indexed. It also takes time: I've seen Google take months to de-index redirecting URLs. Also, make sure you are not blocking crawling of the dev site, or Google will not see the redirects.
2. I am not sure how they got there to begin with. I can pretty much always find some sort of error: maybe someone tweeted a staging URL, maybe crawling wasn't blocked, maybe there was one link to staging from the live site, etc. Regardless, somehow Google crawled it. To prevent this in the future, always block crawling of staging servers well before you ever put anything on them.
3. Usually Google tries to sort this out. They won't give you a penalty for "technical" duplicate content (penalties are more for "malicious" duplicate content, i.e. stealing people's content). So you won't get penalized, but the more you help Google by sorting it out, the more time Google can spend crawling the correct site.
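On point 2, here is a minimal sketch of how you might sanity-check that a future staging server's robots.txt really blocks crawling before any content goes up. It uses Python's standard `urllib.robotparser`. The robots.txt text, the helper name `crawling_blocked`, and the example URL are assumptions for illustration, not anything from the actual site. Note that this only tells you what a well-behaved crawler is allowed to fetch. It says nothing about whether a URL is already in the index.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a staging server: disallow everything for every crawler.
STAGING_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def crawling_blocked(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text forbids user_agent from fetching url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Example staging URL (placeholder taken from the question above).
blocked = crawling_blocked(STAGING_ROBOTS_TXT, "http://acme.stagingdomain.com/any-page")
print("Crawling blocked:", blocked)
```

For a staging site that is already indexed and redirecting, you would want the opposite result, as point 1 says: crawling has to be allowed so Google can see the redirects.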
What I would do now: if you do want the staging URLs to redirect (which might not be the best solution if you ever want to go back and work on the staging server again), use 301 redirects and make sure you are allowing crawling of the staging site. Keep it registered in webmaster tools so you can monitor the indexation levels.
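If you want to confirm which kind of redirect the staging URLs actually return, a rough sketch like the following may help. It uses Python's standard `http.client`, which does not follow redirects, so you see the raw status code and `Location` header. The function names `classify_redirect` and `fetch_redirect` are made up for this example, and it assumes the server answers `HEAD` requests the same way it answers `GET`.

```python
import http.client
from typing import Optional, Tuple
from urllib.parse import urlsplit

PERMANENT = {301, 308}       # permanent redirects
TEMPORARY = {302, 303, 307}  # temporary redirects


def classify_redirect(status: int) -> str:
    """Label an HTTP status code as a permanent redirect, temporary redirect, or neither."""
    if status in PERMANENT:
        return "permanent"
    if status in TEMPORARY:
        return "temporary"
    return "not a redirect"


def fetch_redirect(url: str, timeout: float = 10.0) -> Tuple[int, Optional[str]]:
    """Request url without following redirects; return (status, Location header or None)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.hostname, parts.port, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

Calling `fetch_redirect("http://acme.stagingdomain.com/")` (a placeholder host from the question) and passing the status to `classify_redirect` should report "temporary" while the 302s are in place and "permanent" once they are switched to 301s.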