How do I find which pages are being deindexed on a large site?
-
Is there an easy way, or any way at all, to get a list of all deindexed pages?
Thanks for reading!
-
Hi Daniel
Yep, as Mat says, there's no official solution to this. Do you mean deindexed by Google (without you wanting them to be) or deindexed by you on purpose?
I suppose you could also:
- crawl your whole site
- depending on how big the site is, do a site: search in Google
- use the SERPs Redux bookmarklet to get all the indexed URLs into a column in a spreadsheet
- compare your crawl against the indexed list; any crawled URL that was not present in the SERPs could have been deindexed
- this method is flawed, as it assumes all crawled URLs were indexed in the first place, but it could get you part of the way there
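The comparison step could be sketched roughly like this in Python, if you'd rather not do it by hand in a spreadsheet. It is only a minimal sketch: the function names `normalize` and `possibly_deindexed` and the example.com URLs are made up for illustration, and it assumes you already have both URL lists exported (one from your crawler, one from the SERPs). The same caveat applies: a URL missing from the indexed list may simply never have been indexed.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url: str) -> str:
    """Lowercase the host and drop the fragment and trailing slash,
    so trivially different forms of the same URL compare equal."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def possibly_deindexed(crawled, indexed):
    """Return crawled URLs that do not appear in the indexed list."""
    indexed_set = {normalize(u) for u in indexed if u.strip()}
    crawled_set = {normalize(u) for u in crawled if u.strip()}
    return sorted(crawled_set - indexed_set)


# Hypothetical data: in practice, load these from your crawl export
# and from the spreadsheet column of indexed URLs.
crawled_urls = [
    "https://www.example.com/",
    "https://www.example.com/about/",
    "https://www.example.com/products?id=7",
]
indexed_urls = [
    "https://WWW.example.com",
    "https://www.example.com/products?id=7#reviews",
]

missing = possibly_deindexed(crawled_urls, indexed_urls)
print(missing)
```

Normalizing before comparing matters because the crawler and the SERPs often report the same page with a different trailing slash or host casing, which would otherwise show up as false "missing" entries.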
-Dan
-
If you have a full list of URLs, you could check the cache date for each one at Google. Unless you were doing that manually, it would technically be against Google's TOS, but so is SERP checking. More to the point, I don't think it would be foolproof, as indexed pages will sometimes return no cache date.
It's a bit of a convoluted method, but I think that might be your only option.