Deep Learning for Multimedia Information Processing
Artificial Intelligence Projects
Principal Investigator: Dr. Mark Liao
Project Period: 2018/1~2021/12
My research team was awarded funding of approximately USD$280,000 per year for a four-year AI-related project (2018/1~2021/12) from the Ministry of Science and Technology (MOST). The expertise of my team lies in multimedia signal and information processing. MOST encourages research teams with diverse training backgrounds to investigate AI-related theory or applied research. MOST recognizes that AI will become a very important topic in the future and that, if Taiwan does not invest resources accordingly, its competitiveness in science and industry will inevitably decline. We began researching this topic by focusing on multimedia information processing combined with deep learning. Because MOST was concerned that research teams would focus solely on publications as output, and strongly endorsed cooperation with Taiwanese industry, my research team decided to collaborate with ELAN, a listed integrated circuits company. We hope to integrate software and hardware to develop "smart city traffic flow solutions" that are applicable not only in Taiwan but also beyond.
We expect to complete the following systems within four years:
1. A sub-system that can directly execute edge computing at intersections to compute various traffic flow parameters;
2. A sub-system that uses computer vision technology to compute other traffic flow parameters between adjacent intersections;
3. A sub-system that can make use of the aforementioned parameters to trigger a reinforcement learning process, allowing dynamic adjustment of all traffic signs within a certain range.
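As a rough illustration of what "traffic flow parameters" could mean in the first sub-system, the sketch below computes two common ones, the flow rate (vehicles per hour) and the mean headway (time gap between vehicles), from a list of detected line crossings. The choice of these two parameters, the `Crossing` record, and the approach labels are assumptions made for illustration; they are not taken from the project itself.

```python
from dataclasses import dataclass


@dataclass
class Crossing:
    """One vehicle crossing a virtual counting line (hypothetical record)."""
    time_s: float   # timestamp of the crossing, in seconds
    approach: str   # illustrative label for the road arm, e.g. "north"


def flow_rate_per_hour(crossings, window_s):
    """Return vehicles per hour for each approach over the observation window."""
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    counts = {}
    for c in crossings:
        counts[c.approach] = counts.get(c.approach, 0) + 1
    # Scale the raw count in the window up to an hourly rate.
    return {a: n * 3600.0 / window_s for a, n in counts.items()}


def mean_headway_s(crossings, approach):
    """Return the average time gap between consecutive vehicles on one approach."""
    times = sorted(c.time_s for c in crossings if c.approach == approach)
    if len(times) < 2:
        return None  # a gap needs at least two vehicles
    return (times[-1] - times[0]) / (len(times) - 1)
```

In a deployment of this kind, the crossings would come from the detector's output on the edge device; here they are simply passed in as a list.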
In the first year of the project (2018), my research team deployed a 360° fisheye camcorder equipped with a model trained with deep learning (based on YOLOv3) to perform traffic flow detection and computation at an intersection. We encountered two major difficulties. First, all calculations had to be executed by edge computing, for example on an Nvidia Jetson TX2. The TX2 has only 5% of the computing power of a GTX 1080 Ti graphics card, so it is extremely difficult to process a huge amount of data while maintaining an acceptable accuracy rate. Second, in order to cover the entire intersection with a minimum number of cameras, ELAN required 360° fisheye camcorders. However, fisheye camcorders generate distorted images relative to those recorded by normal camcorders. Therefore, we modified the model, using YOLOv3-tiny, so that it could perform traffic flow detection and computation in the distorted image space. The resulting product was submitted by ELAN to Computex Taipei, where it won a Golden Award in the Best Choice Award competition from among more than 550 competing products (only 8 products won Golden Awards). In Figure 1, we show how our fisheye camcorder detects traffic flow at intersections.

Figure 1: Traffic flow, as detected by our deep learning-trained fisheye camcorder.
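To make the fisheye distortion concrete, the sketch below compares where a ray at a given angle from the optical axis lands in the image under an ideal pinhole model and under an equidistant fisheye model. The equidistant model and the focal length are assumptions chosen for illustration; the actual lens model of the camcorder is not stated in this report.

```python
import math


def pinhole_radius(theta_rad, focal_px):
    """Image radius (pixels) of a ray at angle theta under a pinhole model."""
    if not 0.0 <= theta_rad < math.pi / 2:
        raise ValueError("a pinhole camera cannot image rays at 90 degrees or more")
    return focal_px * math.tan(theta_rad)


def equidistant_fisheye_radius(theta_rad, focal_px):
    """Image radius (pixels) of a ray at angle theta under r = f * theta."""
    if theta_rad < 0.0:
        raise ValueError("theta_rad must be non-negative")
    return focal_px * theta_rad


if __name__ == "__main__":
    f = 300.0  # hypothetical focal length in pixels
    for deg in (10, 30, 60, 80):
        theta = math.radians(deg)
        print(deg, pinhole_radius(theta, f), equidistant_fisheye_radius(theta, f))
```

Under the equidistant model the radius grows only linearly with the angle, so objects far from the image center are compressed relative to a normal camera, and a ray at 90° still falls inside the image. This is one reason a detector trained on ordinary images may need adaptation before it works in the distorted image space.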