Skip to content
    Moz logo Menu open Menu close
    • Products
      • Moz Pro
      • Moz Pro Home
      • Moz Local
      • Moz Local Home
      • STAT
      • Moz API
      • Moz API Home
      • Compare SEO Products
      • Moz Data
    • Free SEO Tools
      • Domain Analysis
      • Keyword Explorer
      • Link Explorer
      • Competitive Research
      • MozBar
      • More Free SEO Tools
    • Learn SEO
      • Beginner's Guide to SEO
      • SEO Learning Center
      • Moz Academy
      • SEO Q&A
      • Webinars, Whitepapers, & Guides
    • Blog
    • Why Moz
      • Agency Solutions
      • Enterprise Solutions
      • Small Business Solutions
      • Case Studies
      • The Moz Story
      • New Releases
    • Log in
    • Log out
    • Products
      • Moz Pro

        Your all-in-one suite of SEO essentials.

      • Moz Local

        Raise your local SEO visibility with complete local SEO management.

      • STAT

        SERP tracking and analytics for enterprise SEO experts.

      • Moz API

        Power your SEO with our index of over 44 trillion links.

      • Compare SEO Products

        See which Moz SEO solution best meets your business needs.

      • Moz Data

        Power your SEO strategy & AI models with custom data solutions.

      NEW Keyword Suggestions by Topic
      Moz Pro

      NEW Keyword Suggestions by Topic

      Learn more
    • Free SEO Tools
      • Domain Analysis

        Get top competitive SEO metrics like DA, top pages and more.

      • Keyword Explorer

        Find traffic-driving keywords with our 1.25 billion+ keyword index.

      • Link Explorer

        Explore over 40 trillion links for powerful backlink data.

      • Competitive Research

        Uncover valuable insights on your organic search competitors.

      • MozBar

        See top SEO metrics for free as you browse the web.

      • More Free SEO Tools

        Explore all the free SEO tools Moz has to offer.

      What is your Brand Authority?
      Moz

      What is your Brand Authority?

      Check yours now
    • Learn SEO
      • Beginner's Guide to SEO

        The #1 most popular introduction to SEO, trusted by millions.

      • SEO Learning Center

        Broaden your knowledge with SEO resources for all skill levels.

      • On-Demand Webinars

        Learn modern SEO best practices from industry experts.

      • How-To Guides

        Step-by-step guides to search success from the authority on SEO.

      • Moz Academy

        Upskill and get certified with on-demand courses & certifications.

      • SEO Q&A

        Insights & discussions from an SEO community of 500,000+.

      Unlock flexible pricing & new endpoints
      Moz API

      Unlock flexible pricing & new endpoints

      Find your plan
    • Blog
    • Why Moz
      • Small Business Solutions

        Uncover insights to make smarter marketing decisions in less time.

      • Agency Solutions

        Earn & keep valuable clients with unparalleled data & insights.

      • Enterprise Solutions

        Gain a competitive edge in the ever-changing world of search.

      • The Moz Story

        Moz was the first & remains the most trusted SEO company.

      • Case Studies

        Explore how Moz drives ROI with a proven track record of success.

      • New Releases

        Get the scoop on the latest and greatest from Moz.

      Surface actionable competitive intel
      New Feature

      Surface actionable competitive intel

      Learn More
    • Log in
      • Moz Pro
      • Moz Local
      • Moz Local Dashboard
      • Moz API
      • Moz API Dashboard
      • Moz Academy
    • Avatar
      • Moz Home
      • Notifications
      • Account & Billing
      • Manage Users
      • Community Profile
      • My Q&A
      • My Videos
      • Log Out

    The Moz Q&A Forum

    • Forum
    • Questions
    • Users
    • Ask the Community

    Welcome to the Q&A Forum

    Browse the forum for helpful insights and fresh discussions about all things SEO.

    1. Home
    2. SEO Tactics
    3. Intermediate & Advanced SEO
    4. Mass Removal Request from Google Index

    Moz Q&A is closed.

    After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.

    Mass Removal Request from Google Index

    Intermediate & Advanced SEO
    4
    8
    1846
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as question
    Log in to reply
    This topic has been deleted. Only users with question management privileges can see it.
    • ioannisa
      ioannisa last edited by

      Hi,

      I am trying to cleanse a news website.  When this website was first made, the people that set it up copied all kinds of articles they had as a newspaper, including tests, internal communication, and drafts.  This site has lots of junk, but this kind of junk was on the initial backup, aka before 1st-June-2012.  So, removing all mixed content prior to that date, we can have pure articles starting June 1st, 2012!

      Therefore

      1. My dynamic sitemap now contains only articles with release date between 1st-June-2012 and now
      2. Any article that has release date prior to 1st-June-2012 returns a custom 404 page with "noindex" metatag, instead of the actual content of the article.

      The question is how I can remove from the google index all this junk as fast as possible that is not on the site anymore, but still appears in google results?

      I know that for individual URLs I need to request removal from this link
      https://www.google.com/webmasters/tools/removals

      The problem is doing this in bulk, as there are tens of thousands of URLs I want to remove.  Should I put the articles back to the sitemap so the search engines crawl the sitemap and see all the 404?  I believe this is very wrong.  As far as I know this will cause problems because search engines will try to access non existent content that is declared as existent by the sitemap, and return errors on the webmasters tools.

      Should I submit a DELETED ITEMS SITEMAP using the <expires>tag? I think this is for custom search engines only, and not for the generic google search engine.
      https://developers.google.com/custom-search/docs/indexing#on-demand-indexing</expires>

      The site unfortunatelly doesn't use any kind of "folder" hierarchy in its URLs, but instead the ugly GET params, and a kind of folder based pattern is impossible since all articles (removed junk and actual articles) are of the form:
      http://www.example.com/docid=123456

      So, how can I bulk remove from the google index all the junk... relatively fast?

      1 Reply Last reply Reply Quote 0
      • KristinaKledzik
        KristinaKledzik @ioannisa last edited by

        Hi Ioannis,

        What about the first suggestion? Can you create a page linking to all of the pages that you'd like to remove, then have Google crawl that page?

        Best,

        Kristina

        1 Reply Last reply Reply Quote 0
        • ioannisa
          ioannisa last edited by

          Thank you Kristina,

          I know about the URL structure, I have been trying the past few months to cleanse this site that I was not involved in its creation.  It has several more SEO problems that have either been fixed or not yet, but we are talking about more than 50 SEO problems I've found so far - most of these critical.

          On the sitemap that I built, the junk pages do not exist, and because this is sitemap I have written myself, I can easily make another containing the articles that I have removed (just reverse a part of my select query for the sitemap to get the ones I have removed).

          http://www.neakriti.gr/webservices/sitemap-index.aspx

          So far I implemented the last of your suggestions and here is an example:

          This is a valid article page
          http://www.neakriti.gr/?page=newsdetail&DocID=1314221 - (Status Code: 200)

          This is a non existent article page (never existed at the first place) - (Status Code: 404)
          http://www.neakriti.gr/?page=newsdetail&DocID=12345678

          This is one of the articles that I removed from sitemap and site - (Status Code: 410)
          http://www.neakriti.gr/?page=newsdetail&DocID=894052

          Also I would like you to take a look at another question about the same site and see that it can relate to this question with garbage articles too...
          https://moz.com/community/q/multiple-instances-of-the-same-article

          Thank you so much!

          KristinaKledzik 1 Reply Last reply Reply Quote 0
          • KristinaKledzik
            KristinaKledzik last edited by

            Hi Ioannis,

            You're in quite a bind here, without a good URL structure! I don't think there's any one perfect option, but I think all of these will work:

            • Create a page on your site that links to every article you would like to delete, keeping those articles 404/410ed. Then, use the Fetch as Googlebot tool, and ask Google to crawl the page plus all of its links. This will get Google to quickly crawl all of those pages, see that they're gone, and remove them from their index. Keep in mind that if you just use a 404, Google may keep the page around for a bit to make sure you didn't just mess up. As Eric said, a 410 is more of a sure thing.
            • Create an XML sitemap of those deleted articles, and have Google crawl it. Yes, this will create errors in GSC, but errors in GSC mean that they're concerned you've made a mistake, not that they're necessarily penalizing you. Just mark those guys as fixed and take the sitemap down once Google's crawled it.
            • 410 these pages, remove all internal links to them (use a tool like Screaming Frog to make sure you didn't miss any links!), and remove them from your sitemap. That'll distance you from that old, crappy content, and Google will slowly realize that it's been removed as it checks in on its old pages. This is probably the least satisfying option, but it's an option that'll get the job done eventually.

            Hope this helps! Let us know what you decide to do.

            Best,

            Kristina

            1 Reply Last reply Reply Quote 1
            • ioannisa
              ioannisa last edited by

              Thank you,

              so you suggest that based on my date based query, instead of blocking everything before that date blindly, keep blocking it with 410, while anything that doesn't exist anyway return 404.

              Also another question, about the blocked articles that return 410, should I put their URLs back on the xml sitemap or not?

              1 Reply Last reply Reply Quote 0
              • GlobeRunner
                GlobeRunner last edited by

                Any article that has release date prior to 1st-June-2012 should return a custom 410 page with "noindex" metatag, instead of the actual content of the article.

                The error returned should be a "410 gone" and not just a 404. That way Google will treat it differently, and may remove it from the index faster than just returning a 404. Also, you can use the Google removal tool, as well. Don't forget the robots.txt file, as well, there may be directories with the content that you need to disallow.

                But overall, using a 410 is going to be better and most likely faster.

                1 Reply Last reply Reply Quote 2
                • ioannisa
                  ioannisa last edited by

                  Thank you for your response.

                  I defenintelly cannot use noindex because as I explained I changed all articles prior to the minimum given date to return 404.  So this content is not visibly available on the web in order to contain a noindex directive.  Unless you mean to have it at my custom 404 page, where yes its there.

                  Also there is no folder to associate in robots, since they are in ugly form of GET params like DOCID=12345.  So given that, there are thousands of DocIDs that are junk and removed, and thousands that are the actuall articles.

                  So I assumed that creating a "deleted articles" sitemap where each <url>will contain an <expires>2016-06-01</expires> tag seemed the most logical thing, but I am afraid its for "custom search engines", rather than for normal de-index requests as its provided bellow</url>

                  https://developers.google.com/custom-search/docs/indexing#on-demand-indexing

                  1 Reply Last reply Reply Quote 0
                  • Martijn_Scheijbeler
                    Martijn_Scheijbeler last edited by

                    Sitemaps is definitely not the way to go for this as you can't just have an expires tag in there and it would make pages go away. The best option to go with is the meta robots and then put them either on nonindex, nofollow, or noindex, follow. With this approach and hopefully with a relative high crawl rate you can make sure that the data from these pages will be removed from the Google Index as soon as possible.

                    If you still want these pages to be indexed but maybe just not have them crawled anymore, which I don't think you'd like to do based on your explanation then go with robots.txt and excluding the pages in there that you'd like to.

                    1 Reply Last reply Reply Quote 2
                    • 1 / 1
                    • First post
                      Last post

                    Got a burning SEO question?

                    Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.


                    Start my free trial


                    Browse Questions

                    Explore more categories

                    • Moz Tools

                      Chat with the community about the Moz tools.

                    • SEO Tactics

                      Discuss the SEO process with fellow marketers

                    • Community

                      Discuss industry events, jobs, and news!

                    • Digital Marketing

                      Chat about tactics outside of SEO

                    • Research & Trends

                      Dive into research and trends in the search industry.

                    • Support

                      Connect on product support and feature requests.

                    • See all categories

                    Related Questions

                    • moon-boots

                      Google Pagination Changes

                      What with Google recently coming out and saying they're basically ignoring paginated pages, I'm considering the link structure of our new, sooner to launch ecommerce site (moving from an old site to a new one with identical URL structure less a few 404s). Currently our new site shows 20 products per page but with this change by Google it means that any products on pages 2, 3 and so on will suffer because google treats it like an entirely separate page as opposed to an extension of the first. The way I see it I have one option: Show every product in each category on page 1. I have Lazy Load installed on our new website so it will only load the screen a user can see and as they scroll down it loads more products, but how will google interpret this? Will Google simply see all 50-300 products per category and give the site a bad page load score because it doesn't know the Lazy Load is in place? Or will it know and account for it? Is there anything I'm missing?

                      Intermediate & Advanced SEO | | moon-boots
                      0
                    • Tylerj

                      Should I use noindex or robots to remove pages from the Google index?

                      I have a Magento site and just realized we have about 800 review pages indexed. The /review directory is disallowed in robots.txt but the pages are still indexed. From my understanding robots means it will not crawl the pages BUT if the pages are still indexed if they are linked from somewhere else. I can add the noindex tag to the review pages but they wont be crawled. https://www.seroundtable.com/google-do-not-use-noindex-in-robots-txt-20873.html Should I remove the robots.txt and add the noindex? Or just add the noindex to what I already have?

                      Intermediate & Advanced SEO | | Tylerj
                      0
                    • NeatIT

                      6 .htaccess Rewrites: Remove index.html, Remove .html, Force non-www, Force Trailing Slash

                      i've to give some information about my website Environment 1. i have static webpage in the root. 2. Wordpress installed in sub-dictionary www.domain.com/blog/ 3. I have two .htaccess , one in the root and one in the wordpress
                      folder. i want to www to non on all URLs Remove index.html from url Remove all .html extension / Re-direct 301 to url
                      without .html extension Add trailing slash to the static webpages / Re-direct 301 from non-trailing slash Force trailing slash to the Wordpress Webpages / Re-direct 301 from non-trailing slash Some examples domain.tld/index.html >> domain.tld/ domain.tld/file.html >> domain.tld/file/ domain.tld/file.html/ >> domain.tld/file/ domain.tld/wordpress/post-name >> domain.tld/wordpress/post-name/ My code in ROOT htaccess is <ifmodule mod_rewrite.c="">Options +FollowSymLinks -MultiViews RewriteEngine On
                      RewriteBase / #removing trailing slash
                      RewriteCond %{REQUEST_FILENAME} !-d
                      RewriteRule ^(.*)/$ $1 [R=301,L] #www to non
                      RewriteCond %{HTTP_HOST} ^www.(([a-z0-9_]+.)?domain.com)$ [NC]
                      RewriteRule .? http://%1%{REQUEST_URI} [R=301,L] #html
                      RewriteCond %{REQUEST_FILENAME} !-f
                      RewriteCond %{REQUEST_FILENAME} !-d
                      RewriteRule ^([^.]+)$ $1.html [NC,L] #index redirect
                      RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index.html\ HTTP/
                      RewriteRule ^index.html$ http://domain.com/ [R=301,L]
                      RewriteCond %{THE_REQUEST} .html
                      RewriteRule ^(.*).html$ /$1 [R=301,L]</ifmodule> The above code do 1. redirect www to non-www
                      2. Remove trailing slash at the end (if exists)
                      3. Remove index.html
                      4. Remove all .html
                      5. Redirect 301 to filename but doesn't add trailing slash at the end

                      Intermediate & Advanced SEO | | NeatIT
                      0
                    • vivekrathore

                      Google indexing pages from chrome history ?

                      We have pages that are not linked from site yet they are indexed in Google. It could be possible if Google got these pages from browser. Does Google takes data from chrome?

                      Intermediate & Advanced SEO | | vivekrathore
                      0
                    • abhihan

                      Google indexing only 1 page out of 2 similar pages made for different cities

                      We have created two category pages, in which we are showing products which could be delivered in separate cities. Both pages are related to cake delivery in that city. But out of these two category pages only 1 got indexed in google and other has not. Its been around 1 month but still only Bangalore category page got indexed. We have submitted sitemap and google is not giving any crawl error. We have also submitted for indexing from "Fetch as google" option in webmasters. www.winni.in/c/4/cakes (Indexed - Bangalore page - http://www.winni.in/sitemap/sitemap_blr_cakes.xml) 2. http://www.winni.in/hyderabad/cakes/c/4 (Not indexed - Hyderabad page - http://www.winni.in/sitemap/sitemap_hyd_cakes.xml) I tried searching for "hyderabad site:www.winni.in" in google but there also http://www.winni.in/hyderabad/cakes/c/4 this link is not coming, instead of this only www.winni.in/c/4/cakes is coming. Can anyone please let me know what could be the possible issue with this?

                      Intermediate & Advanced SEO | | abhihan
                      0
                    • BeTheBoss

                      Removing Dynamic "noindex" URL's from Index

                      6 months ago my clients site was overhauled and the user generated searches had an index tag on them. I switched that to noindex but didn't get it fast enough to avoid being 100's of pages indexed in Google. It's been months since switching to the noindex tag and the pages are still indexed. What would you recommend? Google crawls my site daily - but never the pages that I want removed from the index. I am trying to avoid submitting hundreds of these dynamic URL's to the removal tool in webmaster tools. Suggestions?

                      Intermediate & Advanced SEO | | BeTheBoss
                      0
                    • seoppc2012

                      Does Google index url with hashtags?

                      We are setting up some Jquery tabs in a page that will produce the same url with hashtags. For example: index.php#aboutus, index.php#ourguarantee, etc. We don't want that content to be crawled as we'd like to prevent duplicate content. Does Google normally crawl such urls or does it just ignore them? Thanks in advance.

                      Intermediate & Advanced SEO | | seoppc2012
                      0
                    • nicole.healthline

                      Best way to de-index content from Google and not Bing?

                      We have a large quantity of URLs that we would like to de-index from Google (we are affected b Panda), but not Bing. What is the best way to go about doing this?

                      Intermediate & Advanced SEO | | nicole.healthline
                      0

                    Get started with Moz Pro!

                    Unlock the power of advanced SEO tools and data-driven insights.

                    Start my free trial
                    Products
                    • Moz Pro
                    • Moz Local
                    • Moz API
                    • Moz Data
                    • STAT
                    • Product Updates
                    Moz Solutions
                    • SMB Solutions
                    • Agency Solutions
                    • Enterprise Solutions
                    Free SEO Tools
                    • Domain Authority Checker
                    • Link Explorer
                    • Keyword Explorer
                    • Competitive Research
                    • Brand Authority Checker
                    • MozBar Extension
                    • MozCast
                    Resources
                    • Blog
                    • SEO Learning Center
                    • Help Hub
                    • Beginner's Guide to SEO
                    • How-to Guides
                    • Moz Academy
                    • API Docs
                    About Moz
                    • About
                    • Team
                    • Careers
                    • Contact
                    Why Moz
                    • Case Studies
                    • Testimonials
                    Get Involved
                    • Become an Affiliate
                    • MozCon
                    • Webinars
                    • Practical Marketer Series
                    • MozPod
                    Connect with us

                    Contact the Help team

                    Join our newsletter
                    Moz logo
                    © 2021 - 2025 SEOMoz, Inc., a Ziff Davis company. All rights reserved. Moz is a registered trademark of SEOMoz, Inc.
                    • Accessibility
                    • Terms of Use
                    • Privacy

                    Looks like your connection to Moz was lost, please wait while we try to reconnect.