
Deep Learning for Emerging Computer Vision Applications

Artificial Intelligence Projects
Principal Investigator: Dr. Tyng-Luh Liu
Project Period: 2018/1~2021/12

The primary goal of this four-year project is to develop deep learning techniques related to emerging computer vision applications. Our research efforts focus on addressing crucial issues in designing deep learning techniques, including how to regulate layer-wise feature distributions and how to effectively carry out network architecture searches, amongst other topics. Since availability and quality of annotated training data can vary substantially in practical applications, we also intend to establish learning frameworks that, apart from supervised settings, account for semi-supervised, weakly supervised, or few-shot learning scenarios. Leveraging our past successes in dealing with conventional problems in computer vision (such as object detection, object recognition and image segmentation), this project will advance available methods by incorporating powerful deep learning approaches. We are also interested in combining computer vision and natural language processing techniques for emerging computer vision applications. In addition, we are working closely with industry to identify aspects of smart manufacture and smart retail for targeted intervention. Currently, we are designing hardware-aware network simplification techniques so that our proposed deep learning methods can be efficiently ported to target edge devices without significantly degrading their performance. Below is a brief description of key results arising from our research efforts over the first two years of this project.

GAN-inspired computer vision techniques

Inspired by the impressive performance gains obtained when a DNN is trained with GAN-like informative feedback rather than without such additional information, we set out to investigate the effects of training a DNN with various forms of useful feedback, such as network aggregations, attention cues, memory cues, local-vs-global information, and multi-modality fusion, amongst others. For object detection, we propose non-local ROIs to augment feature representations with bounding box-wise correlations. This technique won first place in the instance segmentation contest of the Robust Vision Challenge at CVPR 2018. In dealing with the problem of image segmentation, we design a GAN-based unsupervised learning mechanism to model a general figure-ground concept without relying on explicit pixel-level annotations. More specifically, we formulate the meta-learning process as a compositional image editing task that learns to imitate a certain visual effect and derives the corresponding internal representation by exploring images of visual effects that are abundant on the web. (See Figure 1.) This work was published at AAAI 2019, and our proposed unsupervised scheme is now being used extensively to solve practical tasks.
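
To make the idea of augmenting ROI features with bounding box-wise correlations more concrete, the following is a minimal sketch only. It assumes a generic non-local (self-attention style) formulation with scaled dot-product affinities and a residual sum; the function names `softmax` and `augment_roi_features` and the toy feature vectors are hypothetical and are not taken from the project's actual implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def augment_roi_features(roi_feats):
    """Add to each ROI feature a correlation-weighted mix of all ROI features.

    Assumption: affinities are scaled dot products between ROI features,
    as in a generic non-local block. The real model may differ.
    """
    dim = len(roi_feats[0])
    scale = math.sqrt(dim)
    augmented = []
    for query in roi_feats:
        # Pairwise affinity between this ROI and every ROI (itself included).
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale
            for key in roi_feats
        ]
        weights = softmax(scores)
        # Context vector: affinity-weighted average of all ROI features.
        context = [
            sum(w * feat[d] for w, feat in zip(weights, roi_feats))
            for d in range(dim)
        ]
        # Residual connection keeps the original feature.
        augmented.append([q + c for q, c in zip(query, context)])
    return augmented


# Toy example: three ROIs with 2-dimensional features (hypothetical values).
rois = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
out = augment_roi_features(rois)
```

In this sketch each ROI's output depends on every other ROI in the image, which is one plausible way for a detector to share evidence between overlapping or related boxes.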

Figure 1: Visual-Effect GAN (VEGAN) for figure-ground segmentation.
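
The compositional image editing view of figure-ground segmentation can be illustrated with a small sketch of the compositing step alone. It assumes a soft mask in [0, 1] marking the figure and an effect applied to the ground region; the names `composite` and `darken`, the blackout effect, and the toy grayscale image are hypothetical. In a GAN setting, a generator would predict the mask and a discriminator would judge whether the composite resembles real visual-effect images; neither network is shown here.

```python
def composite(image, mask, effect):
    """Blend per pixel: keep the figure (mask near 1), apply `effect` to the ground.

    `image` and `mask` are nested lists of floats with the same shape.
    """
    return [
        [m * px + (1.0 - m) * effect(px) for px, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


def darken(px):
    # Hypothetical visual effect: black out the pixel.
    return 0.0


# Toy 2x2 grayscale image and a hard figure mask (hypothetical values).
image = [[0.2, 0.8], [0.6, 0.4]]
mask = [[0.0, 1.0], [1.0, 0.0]]
edited = composite(image, mask, darken)
```

Because the composite is a differentiable function of the mask, a training signal on the edited image can, in principle, shape the mask without pixel-level annotations.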
