Quickest way to deindex a large number of pages
-
Our site was recently hacked: spammers were posting fake content, bringing down our servers in the process. After a few months we finally figured out what was going on and fixed the issue. However, it turns out that Google has indexed 26K+ spammy pages, and we've lost PageRank and search engine rankings as a result.
What is the best and fastest way to get these pages out of Google's index?
-
Given that I'm sure you've removed these pages from your site, there's no longer any page on which to place a noindex meta tag.
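For reference, the tag in question is a single line in a page's <head>; it would only be an option here if any of the spammy URLs still resolved. A minimal sketch:

```html
<!-- Asks search engines to drop this page from their index -->
<meta name="robots" content="noindex">
```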
Disallowing these pages in robots.txt in no way signals to the search engines that they should be removed from the index; it only tells them the pages should no longer be crawled. Since the pages are already indexed, blocking them in robots.txt might save some "crawl budget," but it would do nothing to remove them from the index.
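To illustrate the distinction, a Disallow rule like the one below only stops compliant crawlers from fetching the URLs; anything already indexed stays indexed. The /spam/ directory is a hypothetical stand-in for wherever the hacked pages lived:

```
# robots.txt: a crawl directive, not a removal request
User-agent: *
Disallow: /spam/
```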
So submitting them to the URL Removal Tool, along with an explanation, would be by far the most effective approach.
You'll also want to keep a very close watch on your penalty warnings within Webmaster Tools. If you do get flagged, you'll want a complete history of the issue and of the steps you've taken to address it, so you can prepare a reinclusion request.
Lastly, don't forget to submit these same URLs to the Block URLs tool in Bing Webmaster Tools. You may not get a massive amount of traffic from Bing, but there's no sense throwing it away, since you've already prepared the URL removal list anyway.
Hope that helps!
Paul
-
Yup. Just wanted to add that if these pages all live under a particular directory, you can deindex the entire directory with a single request in the URL removal tool.
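As a sketch, instead of submitting 26K+ individual URLs you'd submit the directory prefix and choose the directory-removal option. The path below is hypothetical:

```
http://www.example.com/spam/
```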
-
1. Add a noindex meta tag to any of these pages that still exist
2. Request that Google remove the URLs from its index via the WMT URL removal tool
3. Once the URLs have been dropped, disallow them in robots.txt so they aren't recrawled

Note the order: if you disallow the pages in robots.txt first, Google can't recrawl them to see the noindex tag.