How do I find which pages are being deindexed on a large site?
-
Is there an easy way or any way to get a list of all deindexed pages?
Thanks for reading!
-
Hi Daniel
Yep - as Mat says there's no official solution to this. Do you mean deindexed by Google (without you wanting them to be) or deindexed by you on purpose?
I suppose you could also;
- crawl your whole site
- depending how big the site is, do a site: search in Google.
- use the SERPs redux bookmarklet - get all indexed URLs in a column in a spreadsheet
- compare your crawl vs. the list indexed and whichever was not present in the SERPs could have been deindexed
- this method is faulty as it assumes all crawled URLs were indexed in the first place - but could get you part of the way there.
-Dan
-
If you have a full list of URLs you could check for cache date on each at Google. Unless you were doing that manually it would be technically against google TOS, but so is SERP checking. More to the point I don't think it would be foolproof as indexed pages will sometimes return no cache date.
It's a bit of a convoluted method, but I think that might be your only option.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Will putting a one page site up for all other countries stop Googlebot from crawling my UK website?
I have a client that only wants UK users to be able to purchase from the UK site. Currently, there are customers from the US and other countries purchasing from the UK site. They want to have a single webpage that is displayed to users trying to access the UK site that are outside the UK. This is fine but what impact would this have on Google bots trying to crawl the UK website? I have scoured the web for an answer but can't find one. Any help will be greatly appreciated. Thanks 🙂
Technical SEO | | lbagley0 -
Site splitting value of our pages with multiple variations. How can I fix this with the least impact?
Just started at a company recently, and there is a preexisting problem that I could use some help with. Somebody please tell me there is a low impact fix for this: My company's website is structured so all of the main links used on the nav are listed as .asp pages. All the canonical stuff. However, for "SEO Purposes," we have a number of similar (not exact) pages in .html on the same topic on our site. So, for example, let's say we're a bakery. The main URL, as linked in the nav, for our Chocolate Cakes, would be http://www.oursite.com/chocolate-cakes.asp. This differentiates the page from our other cake varieties, such as http://www.oursite.com/pound-cakes.asp and http://www.oursite.com/carrot-cakes.asp. Alas, fully indexed in Google with links existing only in our sitemap, we also have: http://www.oursite.com/chocolate-cakes.html http://www.oursite.com/chocolatecakes.html http://www.oursite.com/cakes-chocolate.html This seems CRAZY to me, because wouldn't this split our search results 4 ways? Am I right in assuming this is destroying the rankings of our canonical pages? I want to change this, but problem is, none of the content is the same on any of the variants, and some of these pages rank really well - albeit mostly for long tail keywords instead of the good, solid keywords we're after. So, what I'm asking you guys is: How do I burn these .html pages to the ground without completely destroying our rankings for the other keywords? I want to 301 those pages to our canonical nav URLs but, because of the wildly different content, I'm afraid that we could see a heavy drop in search traffic. Am I just being overly cautious? Thanks in advance!
Technical SEO | | jdsnyc20 -
How do I influence what page on my site google shows for specific search phrases?
Hi People, My client has a site www.activeadventures.com. They provide adventure tours of New Zealand, South America and the Himalayas. These destinations are split into 3 folders in the site (eg: activeadventures.com/new-zealand, activeadventures.com/south-america etc....). The actual root folder of the site is generic information for all of the destinations whilst the destination specific folders are specific in their information for the destination in question. The Problem: If you search for say "Active New Zealand" or "Adventure Tours South America" our result that comes up is the activeadventures.com homepage rather than the destination folder homepage (eg: We would want activeadventures.com/new-zealand to be the landing page for people searching for "active new zealand"). Are there any ways in influence google as to what page on our site it chooses to serve up? Many thanks in advance. Conrad
Technical SEO | | activenz0 -
Noindex search result pages Add Classifieds site
Dear All, Is it a good idea to noindex the search result pages of a classified site?
Technical SEO | | te_c
Taking into account that category pages are also search result pages, I would say it is not a good idea, but the whole information is in the sitemap, google can index individual listings (which are index, follow) anyway. What would you do? What kind of effects has in the indexing of the site, marking the search result pages as "search results" with schema.org microdata? Many thanks for your help, Best Regards, Daniel0 -
Onsite SEO Strategy for a large accommodation site
Hi All I have been thinking about the best strategy for keyword optimisation on a forthcoming accommodation website I am involved with. This may be a bit of a newbie type question, but most of my work has been on considerably smaller sites to date.... Lets say the site will have 1 primary landing page for "Hotels in Bristol" and then 50 pages that are each for a hotel in Bristol. The aim would be for the primary page which will be a browse/search result type page to rank well for the term 'Hotels in Bristol' and other similar terms. If each of the hotel listing pages that have a hotel in Bristol on, have the phrase 'Hotel in Bristol' contained within the title, url, page content, maybe headings/alt tags etc. will the result be that the rank for the site is 'spread too thin' across the domain? Whats the best way to drive all the relevancy and keyword usage on the 50 listing pages, to the primary page such that that is the one that ranks well? And the other pages rank more for the hotel name etc? I guess one way would be to avoid using the words hotels and Bristol in the title/URL etc.. but the natural approach for usability (not SEO) would be to use these words i.e. http://www.newtravelsite.com/hotels/bristol/stgeorgeshotel/ Or would each of the 50 listing pages simply need a followed, anchored link pointing the main landing page? I'm sure there may be a fundamental technique to do this that has alluded me so far, but any help, thoughts or guidance much appreciated! Regards Simon
Technical SEO | | SCL-SEO0 -
I am trying to correct error report of duplicate page content. However I am unable to find in over 100 blogs the page which contains similar content to the page SEOmoz reported as having similar content is my only option to just dlete the blog page?
I am trying to correct duplicate content. However SEOmoz only reports and shows the page of duplicate content. I have 5 years worth of blogs and cannot find the duplicate page. Is my only option to just delete the page to improve my rankings. Brooke
Technical SEO | | wianno1680 -
How many pages should my site have?
Right now I think I only have 36. What is a good amount of pages to have? Any ideas on ways to add relevant pages to my site? I was thinking about starting a message board. Also, I have a free tech support chat room, and was thinking about posting the logs somewhere on the site. Does that sound like a good idea? Thanks.
Technical SEO | | eugenecomputergeeks0 -
If googlebot fetch doesnt find our site will it be indexed?
We have a problem with one of our sites (wordpress) not getting fetched by googlebot. Some folders on the url get found others not. So we have isolated it as a wordpress issue. Will this affect our page in google serps anytime soon? Does any whizz kid out there know how to begin fixing this as we have spent two days solid on this. url is www.holden-jones.co.uk Thanks in advance guys Rob
Technical SEO | | wonderwall0