Quickest way to deindex a large number of pages
-
Our site was recently hacked by spammers, who posted fake content and brought down our servers. After a few months, we finally figured out what was going on and fixed the issue. However, it turns out that Google has indexed 26K+ spammy pages, and we've lost PageRank and search engine rankings as a result.
What is the best and fastest way to get these pages out of Google's index?
-
Assuming you've already removed these pages from your site, there's no page left to which you could add a meta noindex tag.
Disallowing these pages in robots.txt in no way signals to the search engines that they should be removed from the index, just that they should no longer be crawled. Given that they're already indexed, blocking in robots.txt would potentially save some "crawl budget" but wouldn't do anything to remove them from the index.
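To make the crawl-versus-index distinction concrete, here is a minimal Python sketch using the standard library's urllib.robotparser. The /spam/ directory and the example.com URLs are hypothetical, not taken from the question. All it demonstrates is that a Disallow rule answers "may this URL be fetched?"; robots.txt has no way to say "drop this URL from the index".

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a site whose spam pages lived under /spam/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /spam/
"""

def crawl_allowed(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

# The rule only decides whether a crawler may request the page.
blocked = crawl_allowed(ROBOTS_TXT, "https://example.com/spam/cheap-pills.html")
open_page = crawl_allowed(ROBOTS_TXT, "https://example.com/products/widget.html")
print(blocked, open_page)
```

A URL that was indexed before the rule was added can stay in the index; the crawler simply stops revisiting it.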
So submitting them to the URL Removal Tool, along with an explanation, would be by far the most effective option.
You'll also want to keep a very close watch on your penalty warnings within Webmaster Tools. If you get flagged, you'll want a complete history of the issue and the steps you've taken to address it in order to prepare a reinclusion request.
Lastly, don't forget to submit these same URLs to the Bing Webmaster Tools Block URLs tool. You may not get a massive amount of traffic from Bing, but there's no sense throwing it away, since you've already prepared the URL removal list anyway.
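Before submitting the list to either tool, it may be worth confirming that every URL on it really is gone. The following is a rough Python sketch of that check, not part of either tool: it assumes a removed page should answer with HTTP 404 or 410, and the function names are made up for illustration.

```python
import urllib.error
import urllib.request

GONE_STATUSES = {404, 410}  # assumption: removed pages answer with one of these

def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for a HEAD request to `url`."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # urllib reports 4xx/5xx responses as exceptions

def split_by_status(statuses: dict[str, int]) -> tuple[list[str], list[str]]:
    """Split URLs into (ready to submit, needs a second look)."""
    ready = [url for url, code in statuses.items() if code in GONE_STATUSES]
    check = [url for url, code in statuses.items() if code not in GONE_STATUSES]
    return ready, check
```

With a list of URLs in hand, `split_by_status({url: fetch_status(url) for url in urls})` would give one list to submit and one list of pages that are apparently still being served.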
Hope that helps.
Paul
-
Yup. Just wanted to add that if these pages all sit in a particular directory, you can deindex the entire directory with a single request in the URL removal tool.
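To see whether the spam URLs cluster in a few directories, something like the Python sketch below could help. It is only an illustration: the function name and the example.com URLs are hypothetical, and it groups by the first path segment only.

```python
from collections import Counter
from urllib.parse import urlsplit

def top_level_directories(urls: list[str]) -> Counter:
    """Count URLs per first path segment, e.g. '/spam/' for '/spam/a.html'."""
    counts: Counter = Counter()
    for url in urls:
        segments = [s for s in urlsplit(url).path.split("/") if s]
        # A URL with a single segment sits at the site root, not in a directory.
        prefix = f"/{segments[0]}/" if len(segments) > 1 else "/"
        counts[prefix] += 1
    return counts

spam_urls = [
    "https://example.com/spam/cheap-pills.html",
    "https://example.com/spam/casino/bonus.html",
    "https://example.com/free-money.html",
]
directory_counts = top_level_directories(spam_urls)
print(directory_counts.most_common())
```

A directory-level request only makes sense if the directory holds nothing but spam, since it would cover any legitimate pages under the same path too.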
-
Disallow in robots.txt
Add a noindex meta tag to these pages
Ask Google to remove the URLs from its index via a WMT URL removal request