Google has indexed a lot of test pages/junk from the development days.
-
With hindsight I understand that this could have been avoided if robots.txt had been configured properly.
My website is www.clearvisas.com, and it is indexed both with the www subdomain and without.
When I run site:clearvisas.com in Google I get 1,330 results, all junk from the development days.
But when I run site:www.clearvisas.com in Google I get 66 results. These are all post-development and more in line with what I wanted to be indexed.
Will the 1,330 junk pages hurt my SEO?
Is it possible to de-index them, and should I?
If the answer to either question is yes, how should I proceed?
Kind regards,
Fuad
-
Thanks Ryan.
-
It's impossible to say conclusively without examining your site and the content; however, since you refer to them as "junk" pages, it is likely they should best be removed to protect your other pages.
-
Thanks Ryan.
Are the unwanted/irrelevant pages likely to affect my organic SEO?
-
Thanks for your view, David, it's much appreciated. Thanks, Fuad
-
I would suggest following option 3 from David's recommendations.
Simply add the "noindex" tag to the pages you want removed from Google. The pages will then be removed the next time they are crawled.
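Before waiting for the next crawl, it may help to confirm that the tag is really present in the served HTML. The following is a minimal sketch, not a definitive tool: it assumes the tag takes the common form <meta name="robots" content="noindex">, and the names RobotsMetaParser and has_noindex are made up for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        if tag != "meta":
            return
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            # The content attribute is a comma-separated list of directives.
            for directive in attr_map.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the page carries a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives
```

Feeding it the HTML of one of the junk pages shows whether the tag made it into the page. The sketch only looks at the meta tag and ignores any HTTP response headers.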
You are correct that the issue could have been avoided by blocking the site during development, which is a recommended practice. On a live site, however, it is best to keep robots.txt entries to a minimum. Even if you add the pages to robots.txt, Google can still index them, because robots.txt blocks crawling rather than indexing.
The above applies if you feel the need to keep the pages around. If you no longer need those pages, removing them and providing a 410 error (GONE) would be the best approach.
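As a rough illustration of the 410 approach, here is a small sketch using Python's standard library. The path prefixes /dev/ and /test/ are assumptions made up for the example, not the site's real URLs, and on a real site this would normally be done in the web server or CMS configuration rather than in a hand-written handler.

```python
from http import HTTPStatus
from http.server import BaseHTTPRequestHandler

# Hypothetical prefixes for the removed development content; replace with real ones.
REMOVED_PREFIXES = ("/dev/", "/test/")


def status_for(path: str, removed_prefixes=REMOVED_PREFIXES) -> HTTPStatus:
    """Return 410 Gone for removed development paths, 200 OK otherwise."""
    if path.startswith(tuple(removed_prefixes)):
        return HTTPStatus.GONE
    # Simplification: every other path is treated as an existing page.
    return HTTPStatus.OK


class GoneHandler(BaseHTTPRequestHandler):
    """Toy handler that answers each GET with the status chosen above."""

    def do_GET(self):
        status = status_for(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.end_headers()
        self.wfile.write(status.phrase.encode("utf-8"))
```

To try it locally, the handler can be passed to http.server.HTTPServer and a removed path requested in a browser or with curl.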
-
Go to Google Webmaster Tools => Optimization => Remove URLs
In order for Google to remove the URL, you will need to do one of the following:
1. Block it with robots.txt, but it sounds like it's too late for that.
2. If you removed the old development content, make sure that the old content's URL produces a 404 or 410 status code.
3. Block the content with a meta noindex tag.
In my opinion, option 2 is the easiest since you should have a 404 page anyway.
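To check which of the three options a given URL already satisfies, something like the sketch below could be used. It relies on Python's urllib.robotparser for the robots.txt test; the function names, the sample robots.txt rule and the /dev/ path are assumptions for illustration only, and the status code and noindex flag are expected to be gathered separately.

```python
from urllib.robotparser import RobotFileParser


def blocked_by_robots(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text disallows crawling the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


def removal_option(status_code: int, robots_blocked: bool, noindex: bool):
    """Return the number (1-3) of the first option the URL meets, or None."""
    if robots_blocked:
        return 1  # option 1: blocked with robots.txt
    if status_code in (404, 410):
        return 2  # option 2: the URL returns 404 or 410
    if noindex:
        return 3  # option 3: the page carries a meta noindex tag
    return None


# Hypothetical robots.txt content and path, for illustration only.
sample_robots = "User-agent: *\nDisallow: /dev/\n"
is_blocked = blocked_by_robots(sample_robots, "http://clearvisas.com/dev/page.html")
```

A result of None means the URL meets none of the three conditions yet, so a removal request for it would probably not go through.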