Toward a Diagnostic Science of Deep Learning: A Case Study on Generalization Failures
- Speaker: Dr. Chi-Ning Chou (Flatiron Institute, USA)
- Host: Kai-Min Chung
- Time: 2026-01-29 (Thu.) 10:15 ~ 12:15
- Venue: Auditorium 101, New IIS Building

Abstract
Generalization—the ability to perform well beyond the training context—is a hallmark of both biological and artificial intelligence, yet anticipating when models will fail remains a major challenge. Much of today's interpretability research takes a bottom-up approach, reverse-engineering internal features or circuits to build mechanistic explanations. However, as deep learning systems grow in scale and complexity, purely bottom-up, mathematics- and physics-style analyses struggle to capture their emergent behavior. Methodologies from biology and medicine—where diagnostic and biomarker-based reasoning often precede full mechanistic understanding—offer a complementary perspective.
In this talk, I will present a top-down framework for diagnosing failures in generalization. Rather than reconstructing detailed mechanisms, we use task-relevant geometric measures—what we call "biomarkers"—to link representational structure to function and identify indicators of robustness. In image classification, reductions in manifold dimensionality and utility consistently signal brittle generalization across architectures, optimizers, and datasets. Applying these geometric diagnostics to ImageNet-pretrained models, we show that they predict out-of-distribution transfer performance more reliably than conventional accuracy metrics. Together, our results point toward a diagnostic science of deep learning—one that uses system-level indicators, rather than low-level mechanisms alone, to expose hidden vulnerabilities and guide more reliable AI development.
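As a rough illustration of the kind of geometric "biomarker" described above, one common measure of the effective dimensionality of a layer's representations is the participation ratio of the activation covariance spectrum. The sketch below is illustrative only and is not the speaker's actual methodology; the function name and toy data are assumptions.

```python
import numpy as np

def participation_ratio(X):
    """Effective dimensionality of a set of representations.

    X: (n_samples, n_features) array of hidden-layer activations.
    Returns PR = (sum_i l_i)^2 / sum_i (l_i^2), where l_i are eigenvalues
    of the activation covariance; PR ranges from 1 (activity confined to
    a single direction) up to n_features (fully isotropic activity).
    """
    X = X - X.mean(axis=0, keepdims=True)          # center the activations
    cov = X.T @ X / (X.shape[0] - 1)               # sample covariance
    eig = np.linalg.eigvalsh(cov)                  # eigenvalues (symmetric matrix)
    eig = np.clip(eig, 0.0, None)                  # guard against tiny negatives
    return eig.sum() ** 2 / (eig ** 2).sum()

rng = np.random.default_rng(0)

# Isotropic activity spread over many directions -> high effective dimension.
high = participation_ratio(rng.normal(size=(1000, 50)))

# Activity collapsed onto 3 latent directions -> low effective dimension.
low = participation_ratio(rng.normal(size=(1000, 3)) @ rng.normal(size=(3, 50)))

print(high > low)  # the collapsed representation has lower effective dimension
```

In a diagnostic setting, such a measure would be computed on a model's internal activations over a probe dataset, with sharp reductions in effective dimensionality serving as a candidate indicator of brittle generalization.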