Quickest way to deindex a large number of pages
-
Our site was recently hacked by spammers posting fake content and bringing down our servers, etc. After a few months, we finally figured out what was going on and fixed the issue. However, it turns out that Google has indexed 26K+ spammy pages and we've lost page rank and search engine rankings as a result.
What is the best and fastest way to get these pages out of Google's index?
-
Given that I'm sure you've removed these pages from your site, there will be no page to which to add a meta-noindex tag.
Disallowing these pages in robots.txt in no way signals to the search engines that they should be removed from the index, just that they should no longer be crawled. Given that they're already indexed, blocking in robots.txt would potentially save some "crawl budget" but wouldn't do anything to remove them from the index.
So submitting them to the URL Removal Tool would be by far the most effective, along with an explanation.
You'll also want to keep a very close watch on your penalty warnings within Webmaster Tools. If you get flagged, you'll want a complete history of the issue and the steps you've taken to address it in order to prepare a reinclusion request.
Lastly, don't forget to submit these same URLs to the Bing Webmaster Tools Block URLs tool. You may not get a massive amount of traffic from Bing, but there's no sense throwing it away, since you've already prepared the URL removal list anyway.
Hope that helps?
Paul
-
Yup. Just wanted to add as well that if these pages are in a particular directory, then you can deindex the entire directory in one command using the URL removal tool.
-
Disallow in robots.txt
Add a noindex meta tag to these pages
Request Google to remove the URLs from their index via WMT URL removal request
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Is it Okay to have "No Response" pages?
Hi all, I can see some "No Response" pages which gives a error message "Site cannot be reached" or keeps on loading but don't. I have got this list from Screaming from spider tool. Do we need to fix these or ignore? Thanks
Algorithm Updates | | vtmoz0 -
The evolution of Google's 'Quality' filters - Do thin product pages still need noindex?
I'm hoping that Mozzers can weigh in with any recent experiences with eCommerce SEO..... I like to assume (perhaps incorrectly) that Google's 'Quality' filters (formerly known as Panda) have evolved with some intelligence since Panda first launched and started penalising eCommerce sites for having thin product pages. On this basis i'd expect that the filters are now less heavy handed and know that product pages with no or little product description on them are still a quality user experience for people who want to buy that product. Therefore my question is this...
Algorithm Updates | | QubaSEO
Do thin product pages still need noindex given that more often that not they are a quality search result for those using a product specific search query? Has anyone experienced penalty recently (last 12 months) on an ecommerce site because of a high number of thin product pages?0 -
How long for google to de-index old pages on my site?
I launched my redesigned website 4 days ago. I submitted a new site map, as well as submitted it to index in search console (google webmasters). I see that when I google my site, My new open graph settings are coming up correct. Still, a lot of my old site pages are definitely still indexed within google. How long will it take for google to drop off or "de-index" my old pages? Due to the way I restructured my website, a lot of the items are no longer available on my site. This is on purpose. I'm a graphic designer, and with the new change, I removed many old portfolio items, as well as any references to web design since I will no longer offering that service. My site is the following:
Algorithm Updates | | rubennunez
http://studio35design.com0 -
US domain pages showing up in Google UK SERP
Hi, Our website which was predominantly for UK market was setup with a .com extension and only two years ago other domains were added - US (.us) , IE (.ie), EU (.eu) & AU (.com.au) Last year in July, we noticed that few .us domain urls were showing up in UK SERPs and we realized the sitemap for .us site was incorrectly referring to UK (.com) so we corrected that and the .us domain urls stopped appearing in the SERP. Not sure if this actually fixed the issue or was such coincidental. However in last couple of weeks more than 3 .us domain urls are showing for each brand search made on Google UK and sometimes it replaces the .com results all together. I have double checked the PA for US pages, they are far below the UK ones. Has anyone noticed similar behaviour &/or could anyone please help me troubleshoot this issue? Thanks in advance, R
Algorithm Updates | | RaksG0 -
Bing not indexing pages
We have taken all recommended steps to index our site sitegeek.com pages to Bing Bot but failed to index them. Bing bot crawled more than 5,000 pages every day but strange why pages are not getting index ? if we query site:sitegeek.com in Bing Bing Search Engine shows only 1,200 pages got indexed. but we query site:sitegeek.com in Google Google Search Engine show more 546,000 pages got indexed. For example : https://www.sitegeek.com/000webhost Above page crawled by Google but Bing. Can anyone suggest what we are missing on this page? what need to change to index such pages? Thanks! Rajiv
Algorithm Updates | | gamesecure0 -
Lots of dublicate titles and pages on search page
I own a paiting website with a lot of searchable paintings. The "search paintings" feature creates tons of dublicate pages and titles. See here:
Algorithm Updates | | KasperGJ
http://www.maleribasen.dk/soegmaleri.asp I guess the problem is, that the URL can actually be different and still return the same content. First time you click the "Search paintings" the URL will shown as above. But as soon as users
begin to definere they search to the left and use the "Search button" the top URL changes. So, depending on how the top URL looks different results are shown. This is pretty standard in searches. But it returns tons of dublicate pages and titles. How, do you guys cope with that? Is there a clever way to use ref="cannonical" or some other smart way to avoid this? /Kasper0 -
Numbers vs #'s For Blog Titles
For your blog post titles, is it "better" to use numbers or write them out? For example, 3 Things I love About People Answering My Constant Questions or Three Things I Love About People Answering My Constant Questions? I could see this being like the attorney/lawyer, ecommerce/e-commerce and therefore not a big deal. But, I also thought you should avoid using #'s in your url's. Any thoughts, Ruben
Algorithm Updates | | KempRugeLawGroup0 -
Sudden Page Rank Drop for Weight Loss Camp
My site BalanceME.com has been performing really well and continually climbs more and more but in the last 7 days we have had a drop this last week. We went from #2 for weight loss camp to #6, and even further on other terms. It is mid May and I cannot find any other explanation for the sudden drop. I am hoping someone could give me some sort if idea what could possibly be the answer. Recently we have had two press releases go out, target for keywords, and also installed images that navigate to new pages on the site. There have been few additions to the site, and nothing can explain the drop. Other companies in our group have not had issues with recent drops, and we use the same practices with the other organizations. Thank you for your time
Algorithm Updates | | FVdBeuken0