
Institute of Information Science, Academia Sinica

Events


Seminar


Deep Learning for Multimedia Content Analysis

  • Speaker: Prof. Jen-Chun Lin (Department of Electrical Engineering, Yuan Ze University)
    Host: Tyng-Luh Liu (劉庭祿)
  • Time: 2019-07-24 (Wed.) 10:00 ~ 12:00
  • Venue: Auditorium 108, Old Building, Institute of Information Science
Abstract

In this talk, I will cover two topics closely related to multimedia: automatic music video generation and automatic concert video mashup. An automated process that can suggest a soundtrack for a user-generated video (UGV) and turn the UGV into a music-compliant, professional-looking video is desirable but challenging. To this end, in the first topic we introduce a systematic approach that links a multi-task deep-net model with DTW-based metric learning and then uses it to perform music video (MV) generation. The results of objective and subjective experiments demonstrate that the proposed system performs well and can generate appealing MVs with better viewing and listening experiences.

In the second topic, we aim to classify the types of shots defined by the language of film to better portray visual storytelling in a concert video, and we plan to incorporate this technique in our upcoming effort to build an automatic mashup platform. Varying types of shots are fundamental elements in the language of film, commonly used by a director to convey emotion, ideas, and artistry in visual storytelling. To classify such shot types from images, we propose a novel probabilistic deep-net framework, termed the Coherent Classification Net (CC-Net), to boost classification accuracy. We provide extensive experimental results on a dataset of live concert videos to demonstrate the advantage of the proposed approach.
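For readers unfamiliar with the alignment idea behind the "DTW-based metric learning" mentioned above, the following is a minimal sketch of classic dynamic time warping between a visual feature sequence and a music feature sequence. It is illustrative only and is not the speaker's multi-task deep-net system; the feature shapes and the Euclidean local cost are assumptions.

```python
# Minimal dynamic time warping (DTW) sketch: scores how well two feature
# sequences of different lengths can be aligned. Illustrative baseline only,
# not the speaker's proposed model.
import numpy as np

def dtw_distance(video_feats: np.ndarray, music_feats: np.ndarray) -> float:
    """Return the DTW alignment cost between sequences of shape (T1, D) and (T2, D)."""
    t1, t2 = len(video_feats), len(music_feats)
    # Pairwise local costs: Euclidean distance between frame/beat features (assumed).
    cost = np.linalg.norm(video_feats[:, None, :] - music_feats[None, :, :], axis=-1)
    # Accumulated-cost table; acc[i, j] = best cost of aligning the first i and j elements.
    acc = np.full((t1 + 1, t2 + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, t1 + 1):
        for j in range(1, t2 + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(acc[i - 1, j],      # skip a video frame
                                                 acc[i, j - 1],      # skip a music beat
                                                 acc[i - 1, j - 1])  # match both
    return float(acc[t1, t2])

# Example: rank candidate soundtracks for a UGV clip by alignment cost
# (random embeddings stand in for learned features).
video = np.random.rand(120, 64)   # hypothetical per-frame visual embeddings
music = np.random.rand(200, 64)   # hypothetical per-beat audio embeddings
print(dtw_distance(video, music))
```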

BIO

Jen-Chun Lin received the Ph.D. degree in computer science and information engineering from National Cheng Kung University, Tainan, Taiwan, in 2014. He was a Post-Doctoral Research Fellow with Academia Sinica, Taipei, Taiwan, from 2014 to 2018. He is currently an Assistant Professor with the Department of Electrical Engineering, Yuan Ze University, Taoyuan, Taiwan. The illustration in his 2014 IEEE/ACM Transactions on Audio, Speech, and Language Processing paper (March/April issue) was selected for the journal's cover. He also received the Gold Award of the Merry Electroacoustic Thesis Award in 2014; the Excellent Ph.D. Dissertation Award from the Chinese Image Processing and Pattern Recognition Society in 2014; the Excellent Ph.D. Dissertation Award from the Taiwanese Association for Artificial Intelligence in 2014; the Most Interesting Paper Award from the Affective Social Multimedia Computing Workshop in 2015; the Postdoctoral Academic Publication Award from the Ministry of Science and Technology (MOST) in 2017; and the APSIPA Sadaoki Furui Prize Paper Award in 2018. His research interests include multimedia signal processing, pattern analysis and recognition, machine learning, deep learning, and affective computing.