Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
How do I find which pages are being deindexed on a large site?
-
Is there an easy way, or any way at all, to get a list of all deindexed pages?
Thanks for reading!
-
Hi Daniel
Yep - as Mat says, there's no official solution to this. Do you mean deindexed by Google (without you wanting them to be) or deindexed by you on purpose?
I suppose you could also:
- crawl your whole site
- depending on how big the site is, do a site: search in Google
- use the SERPs Redux bookmarklet to get all indexed URLs into a column in a spreadsheet
- compare your crawl against the indexed list; any crawled URL that was not present in the SERPs could have been deindexed
- this method is flawed, as it assumes all crawled URLs were indexed in the first place, but it could get you part of the way there.
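The comparison step could be sketched roughly like this in Python. This is only an illustration under assumptions: the two URL lists stand in for your crawler export and the spreadsheet column of indexed URLs, the example.com URLs are placeholders, and the `normalize` and `possibly_deindexed` helpers are hypothetical names, not part of any tool mentioned here.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url: str) -> str:
    """Reduce a URL to a comparable form: lowercase scheme and host,
    no fragment, no trailing slash on the path."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def possibly_deindexed(crawled, indexed):
    """Return the crawled URLs that do not appear in the indexed list, sorted."""
    indexed_set = {normalize(u) for u in indexed if u.strip()}
    crawled_set = {normalize(u) for u in crawled if u.strip()}
    return sorted(crawled_set - indexed_set)


# Placeholder data: in practice these would be read from your crawl export
# and from the spreadsheet column of URLs collected from the SERPs.
crawled_urls = [
    "https://www.example.com/",
    "https://www.example.com/about/",
    "https://www.example.com/blog/post-1",
]
indexed_urls = [
    "https://www.example.com",
    "https://WWW.example.com/about",
]

missing = possibly_deindexed(crawled_urls, indexed_urls)
print(missing)
```

Normalizing first avoids false positives from trivial differences such as a trailing slash or host capitalization. The result is only a list of candidates: a URL missing from the indexed list may never have been indexed at all.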
-Dan
-
If you have a full list of URLs, you could check the cache date for each one in Google. Unless you did that manually, it would technically be against Google's TOS, but so is SERP checking. More to the point, I don't think it would be foolproof, as indexed pages will sometimes return no cache date.
It's a bit of a convoluted method, but I think that might be your only option.