Self-supervised learning (SSL) has shown to be vital for advancing research in natural language processing (NLP), computer vision (CV), and speech processing. The paradigm pre-trains a shared model on large volumes of unlabeled data and achieves state-of-the-art for various tasks with minimal adaptation. This talk first introduces the Speech processing Universal PERformance Benchmark (SUPERB), which is a leaderboard to benchmark the performance of SSL model across a wide range of speech processing tasks. The results on SUPERB demonstrate that SSL representations show competitive generalizability across speech processing tasks. This talk will also share some surprising new findings that SSL models pretrained from text are helpful for non-text token sequence classification data, including amino acid, DNA, and music.
Hung-yi Lee is an associate professor of the Department of Electrical Engineering of National Taiwan University (NTU), with a joint appointment at the Department of Computer Science & Information Engineering of the university. His recent research focuses on developing technology that can reduce the requirement of annotated data for speech processing (including voice conversion and speech recognition) and natural language processing (including abstractive summarization and question answering). He won Salesforce Research Deep Learning Grant in 2019, AWS ML Research Award in 2020, Outstanding Young Engineer Award from The Chinese Institute of Electrical Engineering in 2018, Young Scholar Innovation Award from Foundation for the Advancement of Outstanding Scholarship in 2019, Ta-You Wu Memorial Award from Ministry of Science and Technology of Taiwan in 2019, and The 59th Ten Outstanding Young Person Award in Science and Technology Research & Development of Taiwan. He owns a YouTube channel teaching deep learning in Mandarin with about 100k Subscribers.