Institute of Information Science, Academia Sinica
Topic: Label Space Coding for Multi-label Classification
Speaker: Prof. Hsuan-Tien Lin (Department of Computer Science and Information Engineering, National Taiwan University)
Date: 2012-04-26 (Thu) 10:30 – 12:00
Location: Auditorium 106 at the new IIS Building
Host: Chi-Jen Lu

Abstract:

Multi-class classification is an important problem in machine learning. It can be used in a variety of applications, such as automatically organizing documents into different categories. Multi-label classification is an extension of multi-class classification: the former allows a set of labels to be associated with an instance, while the latter allows only one. For instance, a document may belong to both the "politics" and "health" classes if it is about the National Health Insurance. Many similar applications arise in domains such as text mining, vision, and bioinformatics.

In this talk, we discuss a coding view of the output (label) space of multi-label classification. The view represents each possible set of labels as a fixed-length binary string. We discuss the close connection between this binary-string representation and coding theory. In particular, we demonstrate two novel research directions based on the connection: data compression (source coding) and error correction (channel coding). We discuss one algorithm that systematically compresses the label space for more efficient computation, and another that systematically expands the label space for better performance.
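
As a rough illustration of the coding view, the sketch below maps a label set to a fixed-length binary string with one bit per label, then expands that string with a simple repetition code so that a single flipped bit per group can be corrected by majority vote. The label vocabulary and the function names (encode, decode, expand, correct) are assumptions made for this example. The repetition code is only a stand-in for error correction in general and is not the algorithm presented in the talk.

```python
from typing import AbstractSet, FrozenSet, List, Sequence

# Hypothetical label vocabulary, chosen only for illustration.
LABELS: List[str] = ["politics", "health", "sports", "finance"]


def encode(label_set: AbstractSet[str], labels: Sequence[str] = LABELS) -> str:
    """Map a label set to a fixed-length binary string, one bit per label."""
    unknown = set(label_set) - set(labels)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return "".join("1" if name in label_set else "0" for name in labels)


def decode(bits: str, labels: Sequence[str] = LABELS) -> FrozenSet[str]:
    """Recover the label set from its binary-string representation."""
    if len(bits) != len(labels):
        raise ValueError("bit string length must match the number of labels")
    return frozenset(name for name, bit in zip(labels, bits) if bit == "1")


def expand(bits: str, repeat: int = 3) -> str:
    """Expand the code by repeating every bit (a repetition code)."""
    return "".join(bit * repeat for bit in bits)


def correct(noisy: str, repeat: int = 3) -> str:
    """Undo the expansion by majority vote within each group of bits."""
    groups = [noisy[i:i + repeat] for i in range(0, len(noisy), repeat)]
    return "".join("1" if g.count("1") * 2 > len(g) else "0" for g in groups)


if __name__ == "__main__":
    code = encode({"politics", "health"})
    expanded = expand(code)
    # Simulate one wrong bit prediction in the expanded code.
    noisy = expanded[:2] + "0" + expanded[3:]
    print(code, expanded, noisy, sorted(decode(correct(noisy))))
```

With the odd default of three copies per bit, the majority vote recovers the original label set as long as at most one bit in each group is predicted wrongly. The price is a code three times as long as the original.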

The talk is based on joint work with Farbound Tai (Neural Computation, 2012) and Chun-Sung Ferng (ACML, 2011). It is self-contained and assumes only basic background in machine learning and coding theory.


BIO:

Hsuan-Tien Lin received a B.S. in Computer Science and Information Engineering from National Taiwan University in 2001, and an M.S. and a Ph.D. in Computer Science from the California Institute of Technology in 2005 and 2008, respectively. He joined the Department of Computer Science and Information Engineering at National Taiwan University as an assistant professor in 2008, and won the university's outstanding teaching award in 2011. His research interests include theoretical foundations of machine learning, studies of new learning problems, and improvements to learning algorithms. He received the 2012 K.-T. Li Young Researcher Award from the ACM Taipei Chapter, and co-led the teams that won third place in the KDDCup 2009 slow track, first place in KDDCup 2010, and first place in both tracks of KDDCup 2011.