
Institute of Information Science, Academia Sinica

Seminar

Toward a Diagnostic Science of Deep Learning: A Case Study on Generalization Failures

  • Lecturer: Dr. Chi-Ning Chou (Flatiron Institute)
    Host: Kai-Min Chung
  • Time: 2026-01-29 (Thu.) 10:15 ~ 12:15
  • Location: Auditorium 107 at IIS New Building
Abstract

Generalization—the ability to perform well beyond the training context—is a hallmark of both biological and artificial intelligence, yet anticipating when models will fail remains a major challenge. Much of today's interpretability research takes a bottom-up approach, reverse-engineering internal features or circuits to build mechanistic explanations. However, as deep learning systems grow in scale and complexity, purely bottom-up, mathematics- and physics-style analyses struggle to capture their emergent behavior. Methodologies from biology and medicine—where diagnostic and biomarker-based reasoning often precede full mechanistic understanding—offer a complementary perspective.

In this talk, I will present a top-down framework for diagnosing failures in generalization. Rather than reconstructing detailed mechanisms, we use task-relevant geometric measures—what we call "biomarkers"—to link representational structure to function and identify indicators of robustness. In image classification, reductions in manifold dimensionality and utility consistently signal brittle generalization across architectures, optimizers, and datasets. Applying these geometric diagnostics to ImageNet-pretrained models, we show that they predict out-of-distribution transfer performance more reliably than conventional accuracy metrics. Together, our results point toward a diagnostic science of deep learning—one that uses system-level indicators, rather than low-level mechanisms alone, to expose hidden vulnerabilities and guide more reliable AI development.
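
As a rough illustration only (not drawn from the talk), one common proxy for the manifold dimensionality of a layer's representations is the participation ratio of its activation covariance spectrum. The minimal NumPy sketch below computes it; the data, names, and the choice of participation ratio itself are all illustrative assumptions, and the actual geometric measures used in this work may differ.

    import numpy as np

    def participation_ratio(reps):
        """Effective dimensionality of an (n_samples, n_features) activation matrix.

        PR = (sum_i lambda_i)^2 / sum_i lambda_i^2, where lambda_i are the
        eigenvalues of the feature covariance; 1 means all variance lies on
        one axis, n_features means fully isotropic variance.
        """
        centered = reps - reps.mean(axis=0, keepdims=True)
        # Squared singular values of the centered data are (n - 1) times the
        # covariance eigenvalues; the constant factor cancels in the ratio.
        s = np.linalg.svd(centered, compute_uv=False)
        eig = s ** 2
        return float(eig.sum() ** 2 / np.square(eig).sum())

    # Illustrative use on random stand-in "activations" (512 samples, 128 features);
    # in practice, reps would hold a layer's activations over a probe dataset.
    rng = np.random.default_rng(0)
    acts = rng.normal(size=(512, 128))
    print(f"effective dimensionality: {participation_ratio(acts):.1f}")

A drop in such a measure at late layers, tracked across architectures or training runs, is one example of the kind of system-level indicator the abstract calls a biomarker.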