Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
How do I find which pages are being deindexed on a large site?
-
Is there an easy way or any way to get a list of all deindexed pages?
Thanks for reading!
-
Hi Daniel
Yep - as Mat says there's no official solution to this. Do you mean deindexed by Google (without you wanting them to be) or deindexed by you on purpose?
I suppose you could also;
- crawl your whole site
- depending how big the site is, do a site: search in Google.
- use the SERPs redux bookmarklet - get all indexed URLs in a column in a spreadsheet
- compare your crawl vs. the list indexed and whichever was not present in the SERPs could have been deindexed
- this method is faulty as it assumes all crawled URLs were indexed in the first place - but could get you part of the way there.
-Dan
-
If you have a full list of URLs you could check for cache date on each at Google. Unless you were doing that manually it would be technically against google TOS, but so is SERP checking. More to the point I don't think it would be foolproof as indexed pages will sometimes return no cache date.
It's a bit of a convoluted method, but I think that might be your only option.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Why does a site that is worse than mine by every objective measure I can find, keep outranking me in search?
I’ve been working on educating myself about SEO all day, again. All-Star Telescope up in Canada. We have a competitor that consistently ranks #1 and I don't get it. Their site is full of duplicate content (straight copy and paste from the manufacturer site). They don't have any meaningful blog or video content to add relevance or value to their site. We have higher page authority, higher domain authority, and they keyword analyzer in moz says that our page is higher quality than the the competitors page. Our site is slow, but theirs is slower. I can’t find a single metric on any tool (ubbersuggest, Moz, ahrefs, semrush) that says Telescopes Canada is a better site, or has a better NexStar 8SE product page (a popular telescope). Here’s the link to Telescope Canada’s page for their Celestron 8SE: https://telescopescanada.ca/products/celestron-nexstar-8se-computerized-telescope-11069?_pos=1&_sid=f0aa91cc2&_ss=r Here’s a link to the Celestron 8SE page from the manufacturer website: https://www.celestron.com/products/nexstar-8se-computerized-telescope?_pos=1&_sid=56abdabd4&_ss=r#description Telescopes Canada has just copied and pasted. There is no original content aside from adding the shipping and return policy to the tab, and having some options for selecting accessories on the page. Here is our page: https://all-startelescope.com/products/celestron-nexstar-8se Our titles are good, our metadata is good (but I don’t think that’s been a serious ranking factor for about ten years). The text is original, it’s relevant, we have healthy internal links to the page. We have invensted in some excellent blog content, we’re adding new products to the website so that we rank for more keywords. All of those things are helping, but I fundamentally don’t understand why Telescopes Canada is #1 almost across the board on every key product in our market. There is something that I’m not seeing here, something that isn't being captured by the tools that I have. Is it simple the fact that they get more traffic? Is that why some people go and buy traffic? Can you see any metric, any tool in your toolbox that indicates why they rank at the top, or even higher than we do for in these search terms specific to that product: Celestron NexStar 8SE
Technical SEO | | nkennett
NexStar 8SE
Celestron NexStar 8SE Canada
NexStar 8SE Canada We've worked with two highly ranked SEO's to try and figure this out, one in Canada, and one in the USA. I haven't seen a confidence inspiring answer from either of them. Posting on a forum is a bit of an act of desperation, I'll continue to work the problem, but it's discouraging to see the leader in my industry look like he's just phoning it in with his website.1 -
My WP website got attack by malware & now my website site:www.example.ca shows about 43000 indexed page in google.
Hi All My wordpress website got attack by malware last week. It affected my index page in google badly. my typical site:example.ca shows about 130 indexed pages on google. Now it shows about 43000 indexed pages. I had my server company tech support scan my site and clean the malware yesterday. But it still shows the same number of indexed page on google.
Technical SEO | | ChophelDoes anybody had ever experience such situation and how did you fixed it. Looking for help. Thanks FILE HIT LIST:
{YARA}Spam_PHP_WPVCD_ContentInjection : /home/example/public_html/wp-includes/wp-tmp.php
{YARA}Backdoor_PHP_WPVCD_Deployer : /home/example/public_html/wp-includes/wp-vcd.php
{YARA}Backdoor_PHP_WPVCD_Deployer : /home/example/public_html/wp-content/themes/oceanwp.zip
{YARA}webshell_webshell_cnseay02_1 : /home/example2/public_html/content.php
{YARA}eval_post : /home/example2/public_html/wp-includes/63292236.php
{YARA}webshell_webshell_cnseay02_1 : /home/example3/public_html/content.php
{YARA}eval_post : /home/example4/public_html/wp-admin/28855846.php
{HEX}php.generic.malware.442 : /home/example5/public_html/wp-22.php
{HEX}php.generic.cav7.421 : /home/example5/public_html/SEUN.php
{HEX}php.generic.malware.442 : /home/example5/public_html/Webhook.php0 -
Bingbot appears to be crawling a large site extremely frequently?
Hi All! What constitutes a normal crawl rate for daily bingbot server requests for large sites? Are any of you noticing spikes in Bingbot crawl activity? I did find a "mildly" useful thread at Black Hat World containing this quote: "The reason BingBot seems to be terrorizing your site is because of your site's architecture; it has to be misaligned. If you are like most people, you paid no attention to setting up your website to avoid this glitch. In the article referenced by Oxonbeef, the author's issue was that he was engaging in dynamic linking, which pretty much put the BingBot in a constant loop. You may have the same type or similar issue particularly if you set up a WP blog without setting the parameters for noindex from the get go." However, my gut instinct says this isn't it and that it's more likely that someone or something is spoofing bingbot. I'd love to hear what you guys think! Dana
Technical SEO | | danatanseo1 -
Does my "spam" site affect my other sites on the same IP?
I have a link directory called Liberty Resource Directory. It's the main site on my dedicated IP, all my other sites are Addon domains on top of it. While exploring the new MOZ spam ranking I saw that LRD (Liberty Resource Directory) has a spam score of 9/17 and that Google penalizes 71% of sites with a similar score. Fair enough, thin content, bunch of follow links (there's over 2,000 links by now), no problem. That site isn't for Google, it's for me. Question, does that site (and linking to my own sites on it) negatively affect my other sites on the same IP? If so, by how much? Does a simple noindex fix that potential issues? Bonus: How does one go about going through hundreds of pages with thousands of links, built with raw, plain text HTML to change things to nofollow? =/
Technical SEO | | eglove0 -
How to find all crawlable links on a particular page?
Hi! This might sound like a newbie question, but I'm trying to find all crawlable links (that google bot sees), on a particular page of my website. I'm trying to use screaming frog, but that gives me all the links on that particular page, AND all subsequent pages in the given sub-directory. What I want is ONLY the crawlable links pointing away from a particular page. What is the best way to go about this? Thanks in advance.
Technical SEO | | AB_Newbie0 -
How to create site map for large site (ecommerce type) that has 1000's if not 100,000 of pages.
I know this is kind of a newbie question but I am having an amazing amount of trouble creating a sitemap for our site Bestride.com. We just did a complete redesign (look and feel, functionality, the works) and now I am trying to create a site map. Most of the generators I have used "break" after reaching some number of pages. I am at a loss as to how to create the sitemap. Any help would be greatly appreciated! Thanks
Technical SEO | | BestRide0 -
Find all links in the site and anchor text
Hi, Find all links in the site and anchor text and i need this done on my own website so i know if we dont have links that are anchored to numbers and punctuations that are not seen at all. Thanks
Technical SEO | | mtthompsons0 -
Redirecting Entire Microsite Content to Main Site Internal Pages?
I am currently working on improving site authority for a client site. The main site has significant authority, but I have learned that the company owns several other resource-focused microsites which are stagnant, but which have accrued significant page authority of their own (thought still less than the main site). Realizing the fault in housing good content on a microsite rather than the main site, my thought is that I can redirect the content of the microsites to internal pages on the main site as a "Resources" section. I am wondering a: if this is a good idea and b: the best way to transfer site authority from these microsites. I am also wondering how to organize the content and if, for example, an entire microsite domain (e.g. microsite.com) should in fact be redirected to internal resource pages (e.g. mainsite.com/resources). Any input would be greatly appreciated!
Technical SEO | | RightlookCreative1