Institute of Information Science
Recent Research Results
Current Research Results
"Three TF Co-expression Modules Regulate Pressure-Overload Cardiac Hypertrophy in Male Mice," Scientific Reports, To Appear.
Authors: Yao-Ming Chang, Li Ling, Ya-Ting Chang, Yu-Wang Chang, Wen-Hsiung Li, Arthur Chun-Chieh Shih*, and Chien-Chang Chen*

Pathological cardiac hypertrophy, a dynamic remodeling process, is a major risk factor for heart failure. Although a number of key regulators and related genes have been identified, how the transcription factors (TFs) dynamically regulate the associated genes and control the morphological and electrophysiological changes during the hypertrophic process is still largely unknown. In this study, we obtained the time-course transcriptomes at five time points in four weeks from male murine hearts subjected to transverse aorta banding surgery. From a series of computational analyses, we identified three major co-expression modules of TF genes that may regulate the gene expression changes during the development of cardiac hypertrophy in mice. After pressure overload, the TF genes in Module 1 were up-regulated before the occurrence of significant morphological changes and one week later were down-regulated gradually, while those in Modules 2 and 3 took over the regulation as the heart size increased. Our analyses revealed that the TF genes up-regulated at the early stages likely initiated the cascading regulation and most of the well-known cardiac miRNAs were up-regulated at later stages for suppression. In addition, the constructed time-dependent regulatory network reveals some TFs, including Egr2, as new candidate key regulators of cardiovascular-associated (CV) genes.
"Virtual Persistent Cache: Remedy the Long Latency Behavior of Host-Aware Shingled Magnetic Recording Drives," ACM/IEEE International Conference on Computer-Aided Design (ICCAD), November 2017.
Authors: Ming-Chang Yang, Yuan-Hao Chang, Fenggang Wu, Tei-Wei Kuo, and David Hung-Chang Du

This paper presents a Virtual Persistent Cache design to remedy the long latency behavior of the Host-Aware Shingled Magnetic Recording (HA-SMR) drive. Our design keeps the cost-effective model of existing HA-SMR drives, but at the same time enlists the host system to adaptively provide some computing and management resources to improve drive performance when needed. The technical contribution is to trick the HA-SMR drives by smartly reshaping the access patterns to them, so as to avoid long latencies in most cases and thus ultimately improve drive performance and responsiveness. We conduct experiments on real Seagate 8TB HA-SMR drives to demonstrate the advantages of Virtual Persistent Cache on real workloads from Microsoft Research Cambridge. The results show that the proposed design can avoid at least 94.93% of long latencies and improve the drive performance by at least 58.11% under the evaluated workloads.
"Automatic Music Video Generation Based on Simultaneous Soundtrack Recommendation and Video Editing," ACM Multimedia Conference 2017, October 2017.
Authors: Jen-Chun Lin, Wen-Li Wei, James Yang, Hsin-Min Wang, and Hong-Yuan Mark Liao

An automated process that can suggest a soundtrack to a user-generated video (UGV) and make the UGV a music-compliant professional-like video is challenging but desirable. To this end, this paper presents an automatic music video (MV) generation system that conducts soundtrack recommendation and video editing simultaneously. First, a long UGV is divided into a sequence of fixed-length short (e.g., 2 seconds) segments, and then a multi-task deep neural network (MDNN) is applied to predict the pseudo acoustic (music) features (called the pseudo song) from the visual (video) features of each video segment. In this way, the distance between any pair of video and music segments of the same length can be computed in the music feature space. Second, the sequence of pseudo acoustic (music) features of the UGV and the sequence of the acoustic (music) features of each music track in the music collection are temporally aligned by the dynamic time warping (DTW) algorithm with a pseudo-song-based deep similarity matching (PDSM) metric. Third, for each music track, the video editing module selects and concatenates the segments of the UGV based on the target and concatenation costs given by a pseudo-song-based deep concatenation cost (PDCC) metric according to the DTW-aligned result to generate a music-compliant professional-like video. Finally, all the generated MVs are ranked, and the best MV is recommended to the user. The MDNN for pseudo song prediction and the PDSM and PDCC metrics are trained on an annotated official music video (OMV) corpus. The results of objective and subjective experiments demonstrate that the proposed system performs well and can generate appealing MVs with better viewing and listening experiences.
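
As a rough illustration of the alignment step described above, the following sketch runs classic dynamic time warping over two sequences of feature vectors. It uses plain Euclidean distance as a stand-in for the learned PDSM metric, which the abstract does not specify; the function names and the toy one-dimensional features are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(seq_a, seq_b, dist=euclidean):
    """Return (total_cost, path) of the best monotonic alignment.

    path is a list of (index_in_a, index_in_b) pairs from start to end.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best cost of aligning the first i items of a with the first j of b
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),  # advance both sequences
            (cost[i - 1][j], i - 1, j),          # advance only seq_a
            (cost[i][j - 1], i, j - 1),          # advance only seq_b
        ]
        _, i, j = min(candidates)
    path.reverse()
    return cost[n][m], path


# Toy data (assumed): pseudo-song features of 3 video segments, 4 music segments.
video_features = [[0.0], [1.0], [2.0]]
music_features = [[0.0], [1.0], [1.0], [2.0]]
total, path = dtw(video_features, music_features)
print(total, path)
```

In this toy case the second video segment is matched to two consecutive music segments, which is the kind of stretching DTW permits when the two sequences differ in length.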
"On Space Utilization Enhancement of File Systems for Embedded Storage Systems," ACM Transactions on Embedded Computing Systems (TECS), April 2017.
Authors: Tseng-Yi Chen, Yuan-Hao Chang, Shuo-Han Chen, Nien-I Hsu, Hsin-Wen Wei, and Wei-Kuan Shih

In the past decade, mobile/embedded computing systems have conventionally had limited computing power, RAM space, and storage capacity due to considerations of cost, energy consumption, and physical size. Recently, some of these systems, such as mobile phones and embedded consumer electronics, have gained more powerful computing capability, so they manage their data in small flash storage devices (e.g., eMMC and SD cards) with a simple file system. However, the existing file systems usually have low space utilization for managing small files and the tail data of large files. In this work, we thus propose a dynamic tail packing scheme to enhance the space utilization of file systems over flash storage devices in embedded computing systems by dynamically aggregating/packing the tail data of (small) files together. To evaluate the benefits and overheads of the proposed scheme, we theoretically formulate analysis equations for obtaining the best settings in the dynamic tail packing scheme. Additionally, the proposed scheme was implemented in the file system of Linux operating systems to evaluate its capability. The results demonstrate that the proposed scheme could significantly improve the space utilization of existing file systems.
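
To see why tail data hurts space utilization, consider the simplified model below. It is only a sketch: the 4096-byte block size and the file sizes are assumed, metadata overhead is ignored, and packed tails are allowed to fill shared blocks back to back, which is more optimistic than a real file system. It is not the paper's scheme or its analysis equations.

```python
def blocks_without_packing(file_sizes, block_size):
    """Each file rounds up to whole blocks, so every tail wastes the rest of its block."""
    return sum(-(-size // block_size) for size in file_sizes)  # ceiling division


def blocks_with_tail_packing(file_sizes, block_size):
    """Full blocks stay per file; all tails are packed together into shared blocks."""
    full_blocks = sum(size // block_size for size in file_sizes)
    tail_bytes = sum(size % block_size for size in file_sizes)
    tail_blocks = -(-tail_bytes // block_size)
    return full_blocks + tail_blocks


def utilization(file_sizes, blocks, block_size):
    """Fraction of allocated space that holds actual file data."""
    return sum(file_sizes) / (blocks * block_size)


BLOCK_SIZE = 4096                      # assumed block size in bytes
sizes = [100, 5000, 300, 8192 + 200]   # hypothetical file sizes in bytes

plain = blocks_without_packing(sizes, BLOCK_SIZE)
packed = blocks_with_tail_packing(sizes, BLOCK_SIZE)
print(plain, utilization(sizes, plain, BLOCK_SIZE))
print(packed, utilization(sizes, packed, BLOCK_SIZE))
```

With these assumed sizes, the same data needs 7 blocks when every tail occupies its own block and 4 blocks when the tails share one.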
"Supervised Learning of Semantics-Preserving Hash via Deep Convolutional Neural Networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017.
Authors: Huei-Fang Yang, Kevin Lin, and Chu-Song Chen

This paper presents a simple yet effective supervised deep hash approach that constructs binary hash codes from labeled data for large-scale image search. We assume that the semantic labels are governed by several latent attributes, with each attribute on or off, and that classification relies on these attributes. Based on this assumption, our approach, dubbed supervised semantics-preserving deep hashing (SSDH), constructs hash functions as a latent layer in a deep network, and the binary codes are learned by minimizing an objective function defined over classification error and other desirable hash code properties. With this design, SSDH has the nice characteristic that classification and retrieval are unified in a single learning model. Moreover, SSDH performs joint learning of image representations, hash codes, and classification in a pointwise manner, and thus is scalable to large-scale datasets. SSDH is simple and can be realized by a slight enhancement of an existing deep architecture for classification; yet it is effective and outperforms other hashing approaches on several benchmarks and large datasets. Compared with state-of-the-art approaches, SSDH achieves higher retrieval accuracy, while the classification performance is not sacrificed.
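
To make the retrieval side concrete, the sketch below binarizes hypothetical latent-layer activations and ranks database items by Hamming distance to a query code. The 0.5 threshold, the activation values, the item names, and the function names are all assumptions for illustration; the abstract does not state how codes are thresholded or indexed.

```python
def binarize(activations, threshold=0.5):
    """Turn latent-layer activations in [0, 1] into a binary code (assumed threshold)."""
    return [1 if a >= threshold else 0 for a in activations]


def hamming(code_a, code_b):
    """Number of bit positions in which two equal-length codes differ."""
    return sum(x != y for x, y in zip(code_a, code_b))


def rank_by_hamming(query_code, database):
    """Return database keys sorted from nearest to farthest code."""
    return sorted(database, key=lambda name: hamming(query_code, database[name]))


# Hypothetical 4-bit codes derived from made-up activations.
database = {
    "cat_1": binarize([0.9, 0.1, 0.8, 0.2]),
    "cat_2": binarize([0.8, 0.2, 0.7, 0.6]),
    "car_1": binarize([0.1, 0.9, 0.2, 0.7]),
}
query = binarize([0.95, 0.05, 0.9, 0.1])
ranking = rank_by_hamming(query, database)
print(ranking)
```

Because comparison reduces to counting differing bits, ranking stays cheap even for large collections, which is the usual motivation for binary codes in image search.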
"Recognizing Offensive Tactics in Broadcast Basketball Videos via Key Player Detection," IEEE International Conference on Image Processing, October 2017.
Authors: T. Y. Tsai, Y. Y. Lin, H. Y. Mark Liao, and S. K. Jeng

We address offensive tactic recognition in broadcast basketball videos. As a crucial component of basketball video content understanding, tactic recognition is quite challenging because it involves multiple independent players, each of whom has respective spatial and temporal variations. Motivated by the observation that most intra-class variations are caused by non-key players, we present an approach that integrates key player detection into tactic recognition. To save annotation cost, our approach can work on training data with only video-level tactic annotation, instead of key player labeling. Specifically, this task is formulated as an MIL (multiple instance learning) problem where a video is treated as a bag with its instances corresponding to subsets of the five players. We also propose a representation to encode the spatio-temporal interaction among multiple players. It turns out that our approach not only effectively recognizes the tactics but also precisely detects the key players.
"Distributed Compressive Sensing: Performance Analysis with Diverse Signal Ensembles," European Signal Processing Conference (EUSIPCO), August 2017.
Authors: Sung-Hsien Hsieh, Wei-Jie Liang, Chun-Shien Lu, and Soo-Chang Pei

Distributed compressive sensing is a framework that considers joint sparsity within signal ensembles along with multiple measurement vectors (MMVs). The current theoretical bound of performance for MMVs, however, is derived to be the same as that for a single MV (SMV) because the characteristics of signal ensembles are ignored. In this work, we introduce a new factor called “Euclidean distances between signals” for the performance analysis of a deterministic signal model under the MMVs framework. We show that, by taking the size of signal ensembles into consideration, MMVs indeed exhibit better performance than SMV. Although our concept can be broadly applied to CS algorithms with MMVs, this paper explores a case study on a well-known greedy solver called simultaneous orthogonal matching pursuit (SOMP). We show that the performance of SOMP, when incorporated with our concept by modifying the steps of support detection and signal estimation, is improved remarkably, especially when the Euclidean distances between signals are short. The performance of the modified SOMP is verified to meet our theoretical prediction.
"A System Calibration Model for Mobile PM2.5 Sensing Using Low-Cost Sensors," IEEE International Conference on Internet of Things (iThings'17), June 2017.
Authors: Hao-Min Liu, Hsuan-Cho Wu, Hu-Chen Lee, Yao-Hua Ho, and Ling-Jyh Chen

In this paper, we present a system calibration model (SCM) for mobile PM2.5 sensing systems using COTS low-cost particle sensors. To implement such systems, we first assess the accuracy of low-cost dust sensors and identify the most reliable sensor through a comprehensive set of evaluations. We also investigate the inner working principle of the selected sensor. By conducting a set of lab-scale controlled experiments, we obtained a logarithmic regression model that models the impacts of mobility and ambient wind velocity on PM2.5 sensing results. Moreover, using a low-cost water flow sensor, we design a customized micro anemometer and apply a linear regression model to convert the flow rate readings from the sensor to wind velocity values. Finally, we conduct a field experiment to evaluate the proposed calibration model in a real-world setting. The results show that the accuracy of the PM2.5 measurement results improves significantly when the model is utilized. The calibration model is simple and effective, and it can be utilized by other mobile sensing applications that facilitate micro-scale environmental sensing on the move.
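
The conversion from flow-rate readings to wind velocity mentioned above is an ordinary linear fit. The sketch below shows a least-squares line fit in plain Python; the calibration pairs are made up for illustration and the function names are hypothetical, so the coefficients say nothing about the actual sensor.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def to_wind_velocity(flow_reading, slope, intercept):
    """Apply the fitted line to a new flow-rate reading."""
    return slope * flow_reading + intercept


# Hypothetical calibration pairs: (flow-rate reading, reference wind velocity).
flow_readings = [1.0, 2.0, 3.0, 4.0]
reference_velocity = [2.5, 4.5, 6.5, 8.5]

slope, intercept = fit_line(flow_readings, reference_velocity)
print(slope, intercept, to_wind_velocity(5.0, slope, intercept))
```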
"Dynamic Translation of Structured Loads/Stores and Register Mapping for Architectures with SIMD Extensions," ACM SIGPLAN/SIGBED Conference on Languages, Compilers, Tools and Theory for Embedded Systems, June 2017.
Authors: Sheng-Yu Fu, Ding-Yong Hong, Yu-Ping Liu, Jan-Jan Wu, and Wei-Chung Hsu

More and more modern processors support noncontiguous SIMD data accesses. However, translating such instructions has been overlooked in the Dynamic Binary Translation (DBT) area. For example, in the popular QEMU dynamic binary translator, guest memory instructions with strides are emulated by a sequence of scalar instructions, leaving significant room for performance improvement when the host machines have SIMD instructions available. Structured loads/stores, such as VLDn/VSTn in ARM NEON, are one type of strided SIMD data access instructions. They are widely used in signal processing, multimedia, mathematical and 2D matrix transposition applications. Efficient translation of such structured loads/stores is a critical issue when migrating ARM executables to other ISAs. However, it is quite challenging, since not only is the translation of structured loads/stores nontrivial, but the difference between guest and host register configurations must also be taken into consideration. In this work, we present the design and implementation of translating structured loads/stores in DBT, including target code generation as well as efficient SIMD register mapping. Our proposed register mapping mechanisms are not limited to handling structured loads/stores; they can be extended to deal with normal SIMD instructions. On a set of OpenCV benchmarks, our QEMU-based system has achieved a maximum speedup of 5.41x, with an average improvement of 2.93x. On a set of BLAS benchmarks, our system has also obtained a maximum speedup of 2.19x and an average improvement of 1.63x.
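
For readers unfamiliar with structured loads/stores, the sketch below models the data movement of a two-way structured load and store (in the style of VLD2/VST2) on Python lists: interleaved pairs in memory are split into two registers and written back. It only illustrates the de-interleaving semantics with scalar loops; it is not the translator's generated code, and the function names are hypothetical.

```python
def structured_load2(memory, base, lanes):
    """De-interleave `lanes` two-element structures starting at memory[base].

    Even-offset elements go to the first register, odd-offset elements to the second.
    """
    reg0 = [memory[base + 2 * i] for i in range(lanes)]
    reg1 = [memory[base + 2 * i + 1] for i in range(lanes)]
    return reg0, reg1


def structured_store2(memory, base, reg0, reg1):
    """Interleave two registers back into memory as two-element structures."""
    for i, (a, b) in enumerate(zip(reg0, reg1)):
        memory[base + 2 * i] = a
        memory[base + 2 * i + 1] = b


# Interleaved (x, y) pairs, e.g. four 2D points stored as x0, y0, x1, y1, ...
memory = [10, 20, 11, 21, 12, 22, 13, 23]
xs, ys = structured_load2(memory, 0, 4)
print(xs, ys)

# Storing the registers back reproduces the interleaved layout.
out = [0] * 8
structured_store2(out, 0, xs, ys)
print(out == memory)
```

Each register element here costs one scalar access, which is the inefficiency the abstract attributes to scalar emulation when the host could perform the same movement with SIMD instructions.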
Authors: Chi-Han Lin, Kate Ching-Ju Lin, and W. T. Chen

Body area networks (BANs) enable wearable/implanted devices to exchange information or collect monitored data. The channel quality of a link in a BAN is typically highly dynamic, since sensors equipped on a human body usually move with gesture, posture, or mobility. Therefore, existing sleep-wake-up scheduling mechanisms used in traditional static sensor networks could be very inefficient in a BAN, because they do not consider channel fluctuation of body sensors. Sensors might be woken up to transmit during bad channel conditions, leading to transmission failures and energy waste. To remedy this inefficiency, this paper proposes a Channel-aware Polling-based MAC protocol, CPMAC. Our design only wakes sensors up and triggers them to transmit when the channel is strong enough to ensure fast and reliable transmissions. We further analyze the energy consumption and derive a queueing model to estimate the probability of completing all data transmissions of all sensors in CPMAC. Benefiting from these analyses, we are able to optimize the energy efficiency of CPMAC by adapting the number of polling periods in a superframe to dynamic traffic demands and channel fluctuation. Our simulation results show that, compared with TDMA-based scheduling and the IEEE 802.15.6 CSMA/CA protocol, CPMAC significantly improves energy efficiency while keeping the latency short.
Authors: Kunwoo Park, Meeyoung Cha, Haewoon Kwak, and Kuan-Ta Chen

Retaining players over an extended period of time is a long-standing challenge in the game industry. Significant effort has been devoted to understanding what motivates players to enjoy games. While individuals may have varying reasons to play or abandon a game at different stages within the game, previous studies have looked at the retention problem from a snapshot view. This study, by analyzing in-game logs of 51,104 distinct individuals in an online multiplayer game, uniquely offers a multifaceted view of the retention problem over the players' virtual life phases. We find that key indicators of longevity change with the game level. Achievement features are important for players at the initial to the advanced phases, yet social features become the most predictive of longevity once players reach the highest level offered by the game. These findings have theoretical and practical implications for designing online games that adapt to players' needs.
"Towards a Better Learning of Near-Synonyms: Automatically Suggesting Example Sentences via Filling in the Blank," the 26th International World Wide Web Conference (WWW 2017), Digital Learning Track, 2017.
Authors: Chieh-Yang Huang, Mei-Hua Chen and Lun-Wei Ku

Language learners are confused by near-synonyms and often look for answers from the Web. However, there is little to aid them in sorting through the overwhelming load of information that is offered. In this paper, we propose a new research problem: suggesting example sentences for learning word distinctions. We focus on near-synonyms as the first step. Two kinds of one-class classifiers, the GMM and BiLSTM models, are used to solve fill-in-the-blank (FITB) questions and further to select example sentences which best differentiate groups of near-synonyms. Experiments are conducted on both an open benchmark and a private dataset for the FITB task. Experiments show that the proposed approach yields an accuracy of 73.05% and 83.59% respectively, comparable to state-of-the-art multi-class classifiers. A learner study further evaluates the suggested example sentences in terms of learning effectiveness and demonstrates that the proposed model is indeed more effective for learning near-synonyms than the resource-based models.
"Kart: a divide-and-conquer algorithm for NGS read alignment," Bioinformatics, To Appear.
Authors: Hsin-Nan Lin and Wen-Lian Hsu

Motivation: Next-generation sequencing (NGS) provides a great opportunity to investigate genome-wide variation at nucleotide resolution. Due to the huge amount of data, NGS applications require very fast and accurate alignment algorithms. Most existing algorithms for read mapping adopt a seed-and-extend strategy, which is sequential in nature and takes much longer on longer reads.
Results: We develop a divide-and-conquer algorithm, called Kart, which can process long reads as fast as short reads by dividing a read into small fragments that can be aligned independently. Our experimental results indicate that the average size of fragments requiring gapped alignment is around 20bp regardless of the original read length. Furthermore, it can tolerate much higher error rates. The experiments show that Kart spends much less time on longer reads than other aligners and still produces reliable alignments even when the error rate is as high as 15%.
Availability: Kart is available at
Supplementary information: Supplementary data are available at Bioinformatics online.
"Deep-Net Fusion to Classify Shots in Concert Videos," IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2017), March 2017.
Authors: Wen-Li Wei, Jen-Chun Lin, Tyng-Luh Liu, Yi-Hsuan Yang, Hsin-Min Wang, Hsiao-Rong Tyan, and Hong-Yuan Mark Liao

Varying the types of shots is a fundamental element in the language of film, commonly used by a visual storytelling director to convey emotion, ideas, and art. To classify such types of shots from images, we present a new framework that facilitates this intriguing task by addressing two key issues. We first focus on learning more effective features by fusing the layer-wise outputs extracted from a deep convolutional neural network (CNN), pre-trained on a large-scale dataset for object recognition. We then introduce a probabilistic fusion model, termed the error weighted deep cross-correlation model (EW-Deep-CCM), to boost the classification accuracy. Specifically, the deep neural network-based cross-correlation model (Deep-CCM) is constructed to not only model the extracted feature hierarchies of the CNN independently but also relate the statistical dependencies of paired features from different layers. Then, a Bayesian error weighting scheme for classifier combination is adopted to explore the contributions from individual Deep-CCM classifiers to enhance the accuracy of shot classification. We provide extensive experimental results on a dataset of live concert videos to demonstrate the advantage of the proposed EW-Deep-CCM over existing popular fusion approaches. The video demos can be found at
Authors: Wei-Jie Liang, Gang-Xuan Lin, and Chun-Shien Lu

Cost-efficient compressive sensing of large-scale images with quickly reconstructed high-quality results is very challenging. In this paper, we present an algorithm to solve convex optimization via the tree structure sparsity pattern, which can be run in the operator to reduce computation cost and maintain good quality, especially for large-scale images. We also provide convergence analysis and convergence rate analysis for the proposed method. The feasibility of our method is verified through simulations and comparison with the state-of-the-art algorithms.
"Systematic identification of anti-interferon function on hepatitis C virus genome reveals p7 as an immune evasion protein," Proceedings of the National Academy of Sciences of the United States of America (PNAS), January 2017.
Authors: Hangfei Qi, Virginia Chu, Nicholas C. Wu, Zugen Chen, Shawna Truong, Gurpreet Brar, Sheng-Yao Su, Yushen Du, Vaithilingaraja Arumugaswami, C. Anders Olson, Shu-Hua Chen, Chung-Yen Lin, Ting-Ting Wu, and Ren Sun

Understanding how viruses interact with their hosts, especially the mechanisms that restrict virus replication, will provide a molecular basis for vaccine development. However, the search for restriction factors is oftentimes difficult if the virus has already evolved to counteract the restriction. Here, we describe a systematic approach to identify such restriction and counterrestriction mechanisms. We constructed a library of mutant hepatitis C viruses, where each mutant has a 15-nt stretch randomly inserted into the genome. We aimed to identify mutations that lose the anti-IFN function but maintain replication capacity. We have identified p7 as an immune evasion protein and further characterized the antiviral function of IFI6-16 against hepatitis C virus (HCV) replication.
"Enabling Write-Reduction Strategy for Journaling File Systems over Byte-addressable NVRAM," ACM/IEEE Design Automation Conference (DAC), June 2017.
Authors: Tseng-Yi Chen, Yuan-Hao Chang, Shuo-Han Chen, Chih-Ching Kuo, Ming-Chang Yang, Hsin-Wen Wei, and Wei-Kuan Shih

Non-volatile random-access memory (NVRAM) is becoming a mainstream storage device in embedded systems due to its favorable features, such as small size, low power consumption, and short read/write latency. Unlike dynamic random access memory (DRAM), on NVRAM a write operation consumes more energy and time than a read operation. However, current mobile/embedded file systems, such as EXT2/3 and EXT4, are very unfriendly to NVRAM devices. The reason is that a journaling mechanism writes the same data twice, during data commitment and checkpoint. Such observations motivate this paper to design a two-phase write-reduction journaling file system called wrJFS. In the first phase, wrJFS classifies data into two categories: metadata and user data. Metadata is handled by a partial byte-enabled journaling strategy, and user data is processed in the second phase. In the second phase, user data is compressed by a hardware encoder so as to reduce the write size, and managed by a compressed-enabled journaling strategy to avoid write amplification. The experimental results show that the proposed wrJFS can reduce the size of the write request by 89.7% on average, compared with the original EXT3.
"VirtualGC: Enabling Erase-free Garbage Collection to Upgrade the Performance of Rewritable SLC NAND Flash Memory," ACM/IEEE Design Automation Conference (DAC), June 2017.
Authors: Tseng-Yi Chen, Yuan-Hao Chang, Yuan-Hung Kuan, and Yu-Ming Chang

Since 3D NAND flash memory could provide more reliable storage than 2D planar flash memory by relaxing the design rule of a memory cell, a brand-new programming technique, namely the erase-free scheme, has been proposed to further enhance the endurance of 3D SLC NAND flash memory. The erase-free scheme brings many benefits to flash memory performance and endurance. For example, the erase-free scheme could reclaim invalid (page) space without physically erasing a flash block. However, current flash management designs could not fully exploit the benefits of the erase-free scheme. Taking the features of the erase-free scheme into consideration, this paper is the first work to propose a novel flash management design, namely the VirtualGC strategy, to deal with the erase-free garbage collection process. By taking advantage of the erase-free scheme, the proposed strategy reduces the overhead of copying live pages so as to increase flash memory performance. The results show that the proposed strategy significantly improves the performance of rewritable 3D flash memory drives.
"A Pattern-aware Write Strategy to Enhance the Reliability of Flash-Memory Storage Systems," ACM Symposium on Applied Computing (SAC), April 2017.
Authors: Tseng-Yi Chen, Yuan-Hao Chang, Yuan-Hung Kuan, Ming-Chang Yang, Yu-Ming Chang, and Pi-Cheng Hsiu

Owing to the high cell density brought by advanced manufacturing processes, ensuring the reliability of flash drives has become rather challenging in flash system designs. In order to enhance the reliability of flash drives, error-correcting code (ECC) has been widely utilized to correct error bits when programming/reading data to/from flash drives. Although ECC can effectively enhance the reliability of flash drives by correcting error bits, the capability of ECC degrades as the number of program/erase (P/E) cycles of flash blocks increases. Eventually, ECC cannot correct a flash page because the page contains too many error bits. As a result, reducing error bits is an effective solution to further improve the reliability of flash drives when a specific ECC is adopted in the flash drive. This work focuses on how to reduce the probability of producing error bits in a flash page. Thus, we propose a pattern-aware write strategy that allocates young blocks (i.e., blocks with low P/E cycles) for storing hot data and executes bit-flip operations on the written data so as to reduce the number of error bits in a flash page. By considering both the P/E cycles of blocks and the pattern of written data, the proposed pattern-aware write strategy can effectively improve the reliability of flash drives. The experimental results show that the proposed strategy can reduce the number of error pages by up to 40%, compared with the well-known DFTL solution. Moreover, the proposed strategy is orthogonal to all ECC mechanisms, so the reliability of flash drives with ECC mechanisms can be further improved by the proposed strategy.
"Efficient Cache Update for In-Memory Cluster Computing with Spark," 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, May 2017.
Authors: Li-Yung Ho, Jan-Jan Wu, Pangfeng Liu, Chia-Chun Shih, Chi-Chang Huang and Chao-Wen Huang

This paper proposes a scalable and efficient billing system for Chunghwa Telecom (CHT), the largest telecom company in Taiwan. We use the popular in-memory cluster computing framework, Spark, to develop our system. Although the memory cache speeds up data processing in Spark, its data immutability assumption makes RDD replacement inefficient. To address this problem, we propose partial-update RDD, which enables users to replace individual partitions of an RDD. We formulate the RDD partition problem, which addresses the issue of partition replacement efficiency. We develop two solutions to the problem: a dynamic programming algorithm and a nonlinear programming method. Experimental results suggest that partial-update RDD achieves a 4.32x speedup compared with the original RDD in Spark. The proposed billing system outperforms the original billing system in CHT by a factor of 24x in throughput.
"General Randomness Amplification with Non-signaling Security," The 20th Annual Conference on Quantum Information Processing (QIP2017), January 2017.
Authors: Kai-Min Chung, Yaoyun Shi and Xiaodi Wu

Highly unpredictable events appear to be abundant in life. However, when modeled rigorously, their existence in nature is far from evident. In fact, the world can be deterministic while at the same time the predictions of quantum mechanics are consistent with observations. Assuming that randomness does exist but only in a weak form, could highly random events be possible? This fundamental question was first raised by Colbeck and Renner (Nature Physics, 8:450–453, 2012). In this work, we answer this question positively, without the various restrictions assumed in the previous works. More precisely, our protocol uses quantum devices, a single weak randomness source quantified by a general notion of non-signaling min-entropy, tolerates a constant amount of device imperfection, and the security is against an all-powerful non-signaling adversary. Unlike the previous works proving non-signaling security, our result does not rely on any structural restrictions or independence assumptions. Thus it implies a stronger interpretation of the dichotomy statement articulated by Gallego et al. (Nature Communications, 4:2654, 2013): “[e]ither our world is fully deterministic or there exist in nature events that are fully random.”

Note: This is a new work after our QIP 2014 paper, where the security proved is against a quantum, as opposed to non-signaling, adversary. 
Authors: Anderson B. Mayfield, Yu-Bin Wang, Chii-Shiarng Chen, Shu-Hwa Chen, and Chung-Yen Lin*

As significant anthropogenic pressures are putting undue stress on the world's oceans, there has been a concerted effort to understand how marine organisms respond to environmental change. Transcriptomic approaches, in particular, have been readily employed to document the mRNA-level response of a plethora of marine invertebrates exposed to an array of simulated stress scenarios, with the tacit and untested assumption being that the respective proteins show a corresponding trend. To better understand the degree of congruency between mRNA and protein expression in an endosymbiotic marine invertebrate, mRNAs and proteins were sequenced from the same samples of the common, Indo-Pacific coral Seriatopora hystrix exposed to stable or upwelling-simulating conditions for 1 week. Of the 167 proteins downregulated at variable temperature, only two were associated with mRNAs that were also differentially expressed between treatments. Of the 378 differentially expressed genes, none were associated with a differentially expressed protein. Collectively, these results highlight the inherent risk of inferring cellular behaviour based on mRNA expression data alone and challenge the current, mRNA-focused approach taken by most marine and many molecular biologists.

"Utilization-aware Self-tuning Design for TLC Flash Storage Devices," IEEE Transactions on Very Large Scale Integration Systems (TVLSI), October 2016.
Authors: Ming-Chang Yang, Yuan-Hao Chang, Che-Wei Tsao, and Chung-Yu Liu

The high-density, low-cost triple-level-cell (TLC) flash memory has gradually dominated the flash storage market because of the fast-growing demand for storage capacity. However, the advances of manufacturing technologies also make TLC flash memory suffer serious performance degradation compared with the low-density, high-performance single-level-cell (SLC) flash memory. To address this issue, some vendors enable blocks of TLC flash memory to work as high-performance, low-density SLC blocks. In contrast to past research that allocates a fixed number of TLC blocks as SLC blocks to improve the device performance to a certain degree, we propose a utilization-aware self-tuning design to trade more unused storage capacity for better system performance. The introduced design dynamically adjusts and maximizes the number of SLC blocks according to the amount of data stored in the storage device at runtime. With the self-tuning design, a flash storage device can not only achieve high access performance but also provide enough storage capacity. The performance and capability of the proposed design were evaluated by a series of experiments, and the results are very encouraging.
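
The trade between unused capacity and SLC blocks can be illustrated with a back-of-envelope capacity model. The sketch below is an assumption-laden simplification rather than the paper's tuning policy: it measures data in units of one TLC block, assumes a block run in SLC mode holds one third of that (1 bit per cell instead of 3), and ignores over-provisioning and garbage-collection reserve. With N blocks in total, s of them in SLC mode, and D units of stored data, the capacity constraint (N - s) + s/3 >= D gives s <= 3(N - D)/2.

```python
def max_slc_blocks(total_blocks, data_in_tlc_blocks):
    """Largest number of blocks that can run in SLC mode while still fitting the data.

    Capacity constraint (in TLC-block units): (N - s) + s / 3 >= D,
    which rearranges to s <= 3 * (N - D) / 2, capped at N.
    """
    if not 0 <= data_in_tlc_blocks <= total_blocks:
        raise ValueError("stored data must fit in the device")
    return min(total_blocks, 3 * (total_blocks - data_in_tlc_blocks) // 2)


def capacity_in_thirds(total_blocks, slc_blocks):
    """Device capacity in thirds of a TLC block (keeps the arithmetic in integers)."""
    return 3 * (total_blocks - slc_blocks) + slc_blocks


TOTAL = 1000  # hypothetical device with 1000 blocks
for stored in (100, 400, 900, 1000):
    print(stored, max_slc_blocks(TOTAL, stored))
```

In this model a lightly used device can run entirely in SLC mode, and the SLC share shrinks toward zero as the device fills, which matches the direction of the trade-off described above.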
"Graceful Space Degradation: An Uneven Space Management for Flash Storage Devices," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), September 2016.
Authors: Ming-Chang Yang, Yuan-Hao Chang, Yuan-Hung Kuan, and Che-Wei Tsao

The high cell density, multilevel-cell programming, and manufacturing process variance cause emerging flash memory to have large bit-error-rate variance among blocks and pages, where a flash chip consists of multiple blocks and each block consists of a fixed number of pages. In order to avoid storing crucial user data in more fragile pages, conventional flash management software tends to aggressively discard high bit-error-rate areas in the unit of a block. However, with such aggressive discarding strategies and the growing page/block sizes of next-generation flash memory, the available space of flash devices might degrade very sharply and therefore result in a rapidly shortened device lifespan. Thus, we advocate the concept of “graceful space degradation” to mitigate this problem by discarding high bit-error-rate (or worn-out) areas in the unit of pages (instead of blocks). To realize this concept, we are the first to put forward an “uneven space management” design to manage flash blocks containing different numbers of bad pages. Our design especially focuses on placing data with different access behaviors to make the best use of blocks with different available space, so as to ultimately prolong the device lifespan with good access performance. The experiments were conducted based on representative realistic workloads, and the results reveal that the proposed design can extend the device lifetime to at least 2.38 times that of existing approaches, with very limited performance overheads.
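
The difference between discarding by block and discarding by page can be seen with a small count. The sketch below is illustrative only: it assumes a block-level policy that retires a block as soon as it contains any bad page, a 64-page block, and made-up bad-page counts; the function names are hypothetical and none of this reflects the paper's actual management design.

```python
def usable_pages_block_level(bad_pages_per_block, pages_per_block):
    """Block-level discarding (assumed policy): a block with any bad page is retired whole."""
    return sum(pages_per_block for bad in bad_pages_per_block if bad == 0)


def usable_pages_page_level(bad_pages_per_block, pages_per_block):
    """Page-level discarding: only the bad pages themselves are retired."""
    return sum(pages_per_block - bad for bad in bad_pages_per_block)


PAGES_PER_BLOCK = 64               # assumed block geometry
bad_pages = [0, 1, 3, 0, 64]       # hypothetical bad-page count for each of five blocks

block_level = usable_pages_block_level(bad_pages, PAGES_PER_BLOCK)
page_level = usable_pages_page_level(bad_pages, PAGES_PER_BLOCK)
print(block_level, page_level)
```

Under these assumptions the page-level policy keeps 252 usable pages where the block-level policy keeps 128, at the cost of having to manage blocks of uneven size.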


Institute of Information Science, Academia Sinica