Google has indexed a lot of test pages/junk from the development days.
-
With hind site I understand that this could have been avoided if robots.txt was configured properly.
My website is www.clearvisas.com, and is indexed with both the www subdomain and with out.
When I run site:clearvisas.com in Google I get 1,330 - All junk from the development days.
But when I run site:www.clearvisas.com in Google I get 66 - these results all post development and more in line with what I wanted to be indexed.
Will 1,330 junk pages hurt my seo?
Is it possible to de-index them and should I?
If the answer is yes to any of the questions how should I proceed?
Kind regards,
Fuad
-
Thanks Ryan.
-
It's impossible to say conclusively without examining your site and the content; however, since you refer to them as "junk" pages, it is likely they should best be removed to protect your other pages.
-
Thanks Ryan.
Are the un-wanted/irrelevant pages likely to affect my organic seo?
-
Thanks for your view David, its much appreciated. Thanks, Fuad
-
I would suggest following option 3 from David's recommendations.
Simply add the "noindex" tag to the pages you want removed from Google. The pages will then be removed the next time they are crawled.
You are correct the issue could have been avoided by blocking the site during development, which is a recommended practice; however, it is recommended to minimize entries in the robots.txt file of a live site. You can add the pages in robots.txt and Google can still index them.
The above applies if you feel the need to keep the pages around. If you no longer need those pages, removing them and providing a 410 error (GONE) would be the best approach.
-
Go to Google Webmaster Tools => Optimization => Remove URLS
In order for Google to remove the URL, you will need to do 1 of the following:
1. Block it with robots.txt, but it sounds like it's too late for that.
2. If you removed the old development content, make sure that the old content's URL produces a 404 or 410 status code.
3. Block the content with a Meta noncontent tag.
In my opinion, option 2 is the easiest since you should have a 404 page anyway.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Many meta descriptions ignored by Google
Hi all, We have recently added the meta descriptions for more than 50 pages of our website. It's been more than a week and all the pages have been indexed. But still I can see most of the pages in Google results didn't show up with recently added meta description, but the content from page like how it used to be. I wonder what's wrong with this scenario. Please guide of someone aware of this. Thanks
Algorithm Updates | | vtmoz0 -
Google indexing https sites by default now, where's the Moz blog about it!
Hello and good morning / happy Friday! Last night an article from of all places " Venture Beat " titled " Google Search starts indexing and letting users stream Android apps without matching web content " was sent to me, as I read this I got a bit giddy. Since we had just implemented a full sitewide https cert rather than a cart only ssl. I then quickly searched for other sources to see if this was indeed true, and the writing on the walls seems to indicate so. Google - Google Webmaster Blog! - http://googlewebmastercentral.blogspot.in/2015/12/indexing-https-pages-by-default.html http://www.searchenginejournal.com/google-to-prioritize-the-indexing-of-https-pages/147179/ http://www.tomshardware.com/news/google-indexing-https-by-default,30781.html https://hacked.com/google-will-begin-indexing-httpsencrypted-pages-default/ https://www.seroundtable.com/google-app-indexing-documentation-updated-21345.html I found it a bit ironic to read about this on mostly unsecured sites. I wanted to hear about the 8 keypoint rules that google will factor in when ranking / indexing https pages from now on, and see what you all felt about this. Google will now begin to index HTTPS equivalents of HTTP web pages, even when the former don’t have any links to them. However, Google will only index an HTTPS URL if it follows these conditions: It doesn’t contain insecure dependencies. It isn’t blocked from crawling by robots.txt. It doesn’t redirect users to or through an insecure HTTP page. It doesn’t have a rel="canonical" link to the HTTP page. It doesn’t contain a noindex robots meta tag. It doesn’t have on-host outlinks to HTTP URLs. The sitemaps lists the HTTPS URL, or doesn’t list the HTTP version of the URL. The server has a valid TLS certificate. One rule that confuses me a bit is : **It doesn’t redirect users to or through an insecure HTTP page. ** Does this mean if you just moved over to https from http your site won't pick up the https boost? Since most sites in general have http redirects to https? Thank you!
Algorithm Updates | | Deacyde0 -
How to follow up Google search in a Joomla website
Hello ! I'm facing a stupid issue but cannot solve it. Google Analytics can track the seach into a website. When I try to activate it I need to add some more parameters that I do not know. http://awesomescreenshot.com/03f1bnau8d Has anyone ever tried to configure Google analytics search for a Joomla website ? Tks a lot !
Algorithm Updates | | AymanH0 -
Does a KML file have to be indexed by Google?
I'm currently using the Yoast Local SEO plugin for WordPress to generate my KML file which is linked to from the GeoSitemap. Check it out http://www.holycitycatering.com/sitemap_index.xml. A competitor of mine just told me that this isn't correct and that the link to the KML should be a downloadable file that's indexed in Google. This is the opposite of what Yoast is saying... "He's wrong. 🙂 And the KML isn't a file, it's being rendered. You wouldn't want it to be indexed anyway, you just want Google to find the information in there. What is the best way to create a KML? Should it be indexed?
Algorithm Updates | | projectassistant1 -
Google penalty for one keyword?
Is it possible to get penalized by Google for a specific keyword and essentially disappear from the SERPs for that keyword but keep position for the brand (#1) and some other keywords (#4 and #7)? And how would you find out that this is what happened if there is no GWT message?
Algorithm Updates | | gfiedel0 -
Trying to figure out why one of my popular pages was de-indexed from Google.
I wanted to share this with everyone for two reasons. 1. To try to figure out why this happened, and 2 Let everyone be aware of this so you can check some of your pages if needed. Someone on Facebook asked me a question that I knew I had answered in this post. I couldn't remember what the url was, so I googled some of the terms I knew was in the page, and the page didn't show up. I did some more searches and found out that the entire page was missing from Google. This page has a good number of shares, comments, Facebook likes, etc (ie: social signals) and there is certainly no black / gray hat techniques being used on my site. This page received a decent amount of organic traffic as well. I'm not sure when the page was de-indexed, and wouldn't have even known if I had't tried to search for it via google; which makes me concerned that perhaps other pages are being de-indexed. It also concerns me that I have done something wrong (without knowing) and perhaps other pages on my site are going to be penalized as well. Does anyone have any idea why this page would be de-indexed? It sure seems like all the signals are there to show Google this page is unique and valuable. Interested to hear some of your thoughts on this. Thanks
Algorithm Updates | | NoahsDad0 -
How do I get the expanded results in a Google search?
I notice for certain site (ex: mint.com) that when I search, the top result has a very detailed view with options to click to different subsections of the site. However for my site, even though we're consistently the top result for our branded terms, the result is still only a single line item. How do I adjust this?
Algorithm Updates | | syount1 -
Are you getting any action from Google +1 ?
If you have added google plus one to your website you can check on the impact by visiting your webmaster tools account. In your GWT account you will see a left menu item for "+1 Metrics". If you click on "Search Impact" you can see the CTR change attributed to +1. Anybody seeing anything there yet?
Algorithm Updates | | EGOL0