Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Best practice for deindexing large quantities of pages
-
We are trying to deindex a large quantity of pages on our site and want to know what the best practice for doing that is. For reference, the reason we are looking for methods that could help us speed it up is we have about 500,000 URLs that we want deindexed because of mis-formatted HTML code and google indexed them much faster than it is taking to unindex them unfortunately.
We don't want to risk clogging up our limited crawl log/budget by submitting a sitemap of URLs that have "noindex" on them as a hack for deindexing. Although theoretically that should work, we are looking for white hat methods that are faster than "being patient and waiting it out", since that would likely take months if not years with Google's current crawl rate of our site.
-
Unfortunately, I don't think there's any easy/fast way to do this. I just ran a test to see how long it take Google to actually obey a noindex tag, and it's taken a little over 2 months for them all to be removed. I had 2 WP blogs that I added the noindex tag to all category, tag, and author pages and monitored the index count 4 or 5 times per week by running site:example.com inurl:/category/ queries. There was a lot of fluctuation at the beginnning, but eventually took hold after about 2 months. On one of the sites, I did add an XML sitemap with only the noindexed URLs on it, submitted it via Search Console, but that didn't seem to have an impact on how quickly they were dropped out.
See the screenshot below of my plotting of indexed pages per subfolder:
-
Hey,
you might be interested in this thread for getting your question answered.
https://moz.com/community/q/quickest-way-to-deindex-a-large-number-of-pages
Hope it helps. Cheers, Martin
-
Hi,
I have never tested the method that I'm sharing here. Please check once it might be helpful in your case.
https://www.searchcommander.com/how-to-bulk-remove-urls-google/
Thanks
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Is single H1 tag still best practice?
Hi Guys, Is having a single h1 tag still best practice for SEO? Guessing multiple h1 tags dilute the value of the tag and keywords within the tag. Thoughts? Cheers.
Intermediate & Advanced SEO | | kayl870 -
Best way to "Prune" bad content from large sites?
I am in process of pruning my sites for low quality/thin content. The issue is that I have multiple sites with 40k + pages and need a more efficient way of finding the low quality content than looking at each page individually. Is there an ideal way to find the pages that are worth no indexing that will speed up the process but not potentially harm any valuable pages? Current plan of action is to pull data from analytics and if the url hasn't brought any traffic in the last 12 months then it is safe to assume it is a page that is not beneficial to the site. My concern is that some of these pages might have links pointing to them and I want to make sure we don't lose that link juice. But, assuming we just no index the pages we should still have the authority pass along...and in theory, the pages that haven't brought any traffic to the site in a year probably don't have much authority to begin with. Recommendations on best way to prune content on sites with hundreds of thousands of pages efficiently? Also, is there a benefit to no indexing the pages vs deleting them? What is the preferred method, and why?
Intermediate & Advanced SEO | | atomiconline0 -
What are the best practices for geo-targeting by sub-folders?
My domain is currently targeting the US, but I'm building out sub-folders that will need to geo-target France, England, and Spain. Each country will have it's own sub-folder, and professionally translated (domain.com/france). Other than the hreflang tags, what are other best practices I can implement? Can Google Webmaster tools geo-target by subfolder? Any suggestions would be appreciated. Thanks Justin
Intermediate & Advanced SEO | | Rhythm_Agency0 -
What's the best way to redirect categories & paginated pages on a blog?
I'm currently re-doing my blog and have a few categories that I'm getting rid of for housecleaning purposes and crawl efficiency. Each of these categories has many pages (some have hundreds). The new blog will also not have new relevant categories to redirect them to (1 or 2 may work). So what is the best place to properly redirect these pages to? And how do I handle the paginated URLs? The only logical place I can think of would be to redirect them to the homepage of the blog, but since there are so many pages, I don't know if that's the best idea. Does anybody have any thoughts?
Intermediate & Advanced SEO | | kking41200 -
E-commerce site, one product multiple categories best practice
Hi there, We have an e-commerce shopping site with over 8000 products and over 100 categories. Some sub categories belong to multiple categories - for example, A Christmas trees can be under "Gardening > Plants > Trees" and under "Gifts > Holidays > Christmas > Trees" The product itself (example: Scandinavian Xmas Tree) can naturally belong to both these categories as well. Naturally these two (or more) categories have different breadcrumbs, different navigation bars, etc. From an SEO point of view, to avoid duplicate content issues, I see the following options: Use the same URL and change the content of the page (breadcrumbs and menus) based on the referral path. Kind of cloaking. Use the same URL and display only one "main" version of breadcrumbs and menus. Possibly add the other "not main" categories as links to the category / product page. Use a different URL based on where we came from and do nothing (will create essentially the same content on different urls except breadcrumbs and menus - there's a possibiliy to change the category text and page title as well) Use a different URL based on where we came from with different menus and breadcrumbs and use rel=canonical that points to the "main" category / product pages This is a very interesting issue and I would love to hear what you guys think as we are finalizing plans for a new website and would like to get the most out of it. Thank you all!
Intermediate & Advanced SEO | | arikbar0 -
Blocking Pages Via Robots, Can Images On Those Pages Be Included In Image Search
Hi! I have pages within my forum where visitors can upload photos. When they upload photos they provide a simple statement about the photo but no real information about the image,definitely not enough for the page to be deemed worthy of being indexed. The industry however is one that really leans on images and having the images in Google Image search is important to us. The url structure is like such: domain.com/community/photos/~username~/picture111111.aspx I wish to block the whole folder from Googlebot to prevent these low quality pages from being added to Google's main SERP results. This would be something like this: User-agent: googlebot Disallow: /community/photos/ Can I disallow Googlebot specifically rather than just using User-agent: * which would then allow googlebot-image to pick up the photos? I plan on configuring a way to add meaningful alt attributes and image names to assist in visibility, but the actual act of blocking the pages and getting the images picked up... Is this possible? Thanks! Leona
Intermediate & Advanced SEO | | HD_Leona0 -
Are there any negative effects to using a 301 redirect from a page to another internal page?
For example, from http://www.dog.com/toys to http://www.dog.com/chew-toys. In my situation, the main purpose of the 301 redirect is to replace the page with a new internal page that has a better optimized URL. This will be executed across multiple pages (about 20). None of these pages hold any search rankings but do carry a decent amount of page authority.
Intermediate & Advanced SEO | | Visually0 -
301 - should I redirect entire domain or page for page?
Hi, We recently enabled a 301 on our domain from our old website to our new website. On the advice of fellow mozzer's we copied the old site exactly to the new domain, then did the 301 so that the sites are identical. Question is, should we be doing the 301 as a whole domain redirect, i.e. www.oldsite.com is now > www.newsite.com, or individually setting each page, i.e. www.oldsite.com/page1 is now www.newsite.com/page1 etc for each page in our site? Remembering that both old and new sites (for now) are identical copies. Also we set the 301 about 5 days ago and have verified its working but haven't seen a single change in rank either from the old site or new - is this because Google hasn't likely re-indexed yet? Thanks, Anthony
Intermediate & Advanced SEO | | Grenadi0