How do I find which pages are being deindexed on a large site?
-
Is there an easy way or any way to get a list of all deindexed pages?
Thanks for reading!
-
Hi Daniel
Yep - as Mat says there's no official solution to this. Do you mean deindexed by Google (without you wanting them to be) or deindexed by you on purpose?
I suppose you could also;
- crawl your whole site
- depending how big the site is, do a site: search in Google.
- use the SERPs redux bookmarklet - get all indexed URLs in a column in a spreadsheet
- compare your crawl vs. the list indexed and whichever was not present in the SERPs could have been deindexed
- this method is faulty as it assumes all crawled URLs were indexed in the first place - but could get you part of the way there.
-Dan
-
If you have a full list of URLs you could check for cache date on each at Google. Unless you were doing that manually it would be technically against google TOS, but so is SERP checking. More to the point I don't think it would be foolproof as indexed pages will sometimes return no cache date.
It's a bit of a convoluted method, but I think that might be your only option.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Site:www.domainname.com - does not find homepage in Google (only inner pages - why?)
When I do a Google search on site:www.domainname.com, my clients homepage does not appear. Other inner pages do. The same thing happend a while ago and I did 'fetch by google' in Search Console. After that the homepage was indexed again when I did a site:www.domainname.com search. But now (2 weeks later), it's gone again. When I search on the brand name of the website in Google it does find the homepage. I don't know why it doesn't find the homepage when I do a site: search. Any ideas? [see images where you can see the problem] XTrDn 2doHF
Technical SEO | | robk1230 -
Google has deindexed 40% of my site because it's having problems crawling it
Hi Last week i got my fifth email saying 'Google can't access your site'. The first one i got in early November. Since then my site has gone from almost 80k pages indexed to less than 45k pages and the number is lowering even though we post daily about 100 new articles (it's a online newspaper). The site i'm talking about is http://www.gazetaexpress.com/ We have to deal with DDoS attacks most of the time, so our server guy has implemented a firewall to protect the site from these attacks. We suspect that it's the firewall that is blocking google bots to crawl and index our site. But then things get more interesting, some parts of the site are being crawled regularly and some others not at all. If the firewall was to stop google bots from crawling the site, why some parts of the site are being crawled with no problems and others aren't? In the screenshot attached to this post you will see how Google Webmasters is reporting these errors. In this link, it says that if 'Error' status happens again you should contact Google Webmaster support because something is preventing Google to fetch the site. I used the Feedback form in Google Webmasters to report this error about two months ago but haven't heard from them. Did i use the wrong form to contact them, if yes how can i reach them and tell about my problem? If you need more details feel free to ask. I will appreciate any help. Thank you in advance C43svbv.png?1
Technical SEO | | Bajram.Kurtishaj1 -
What is the best way to find missing alt tags on my site (site wide - not page by page)?
I am looking to find all the missing alt tags on my site at once. I have a FF extension that use to do it page by page, but my site is huge and that will take forever. Thanks!!
Technical SEO | | franchisesolutions1 -
If a permanent redirect is supposed to transfer SEO from the old page to the new page, why has my domain authority been impacted?
For example, we redirected our old domain to a new one (leaving no duplicate content on the old domain) and saw a 40% decrease in domain authority. Isn't a permanent redirect supposed to transfer link authority to the place it is redirecting to? Did I do something wrong?
Technical SEO | | BlueLinkERP0 -
ECommerce site - Duplicate pages problem.
We have an eCommerce site with multiple products being displayed on a number of pages. We use rel="next" and rel="prev" and have a display ALL which I understand Google should automatically be able to find. Should we also being using a Canonical tag as well to tell google to give authority to the first page or the All Pages. Or was the use of the next and prev rel tags that we currently do adequate. We currently display 20 products per page, we were thinking of increasing this to make fewer pages but they would be better as this which would make some later product pages redundant . If we add 301 redirects on the redundant pages, does anyone know of the sort of impact this might cause to traffic and seo ?. General thoughts if anyone has similar problems welcome
Technical SEO | | SarahCollins0 -
How to find out if I have been penalized?
I have launched a new website beginning January this year and have seen slowly more and more traffic coming from google to the website until the 20th of March where suddenly there are no more visitors from the google search engine. The only traffic left is from google images, social networks or other search engines. Without visitors from google search this reduces our overall traffic by ~66%. I can't easily find anymore our website in the search results of google by using terms which we usually ranked quite well. Nevertheless, the website is still indexed as I can find it using the "site:" search query. In google webmaster tools there are no messages and we have only been doing a bit of link building on website and blog directories (nothing excessive and nothing paid neither). Is there any way to find out if google penalized my website? I guess it has... and what would be the best thing to do right now? The website is hellasholiday (dot) com Thanks in advance for your idea and suggestions
Technical SEO | | socialtowards0 -
I have a site that has both http:// and https:// versions indexed, e.g. https://www.homepage.com/ and http://www.homepage.com/. How do I de-index the https// versions without losing the link juice that is going to the https://homepage.com/ pages?
I can't 301 https// to http:// since there are some form pages that need to be https:// The site has 20,000 + pages so individually 301ing each page would be a nightmare. Any suggestions would be greatly appreciated.
Technical SEO | | fthead90 -
If I redirect my WordPress blog to my main site, will it help my main site's SEO?
I have separate sites for my blog and main website. I'd like to link them in a way that enables the blog to boost my main site's SEO. Is there an easy way to do this? Thanks in advance for any advice...
Technical SEO | | matt-145670