Institute of Information Science, Academia Sinica




Learning To Visualize Music Through Shot Sequence For Automatic Concert Video Mashup

IEEE Transactions on Multimedia, To Appear

Wen-Li Wei, Jen-Chun Lin, Tyng-Luh Liu, Hsiao-Rong Tyan, Hsin-Min Wang, and Hong-Yuan Mark Liao

Jen-Chun Lin Tyng-Luh Liu Hsin-Min Wang Hong-Yuan Mark Liao

An experienced director usually switches among different types of shots to make visual storytelling more touching. When filming a musical performance, appropriate shot switching can produce special effects, such as enhancing the expression of emotion or heating up the atmosphere. However, while the visual storytelling technique is often used in making professional recordings of a live concert, amateur recordings by audience members often lack such storytelling concepts and skills when filming the same event. Thus a versatile system that can perform video mashup to create a refined high-quality video from such amateur clips is desirable. To this end, we aim at translating the music into an attractive shot (type) sequence by learning the relation between music and visual storytelling of shots. The resulting shot sequence can then be used to better portray the visual storytelling of a song and guide the concert video mashup process. To achieve the task, we first introduce a novel probabilistic-based fusion approach, named multi-resolution fused recurrent neural networks (MF-RNNs) with film-language, which integrates multi-resolution fused RNNs and a film-language model for boosting the translation performance. We then distill the knowledge in MF-RNNs with film-language into a lightweight RNN, which is more efficient and easier to deploy. The results from objective and subjective experiments demonstrate that both MF-RNNs with film-language and the lightweight RNN can generate attractive shot sequences for music, thereby enhancing the viewing and listening experience.

An Adaptive Layer Expansion Algorithm for Efficient Training of Deep Neural Networks

IEEE International Conference on Big Data, December 2020

Leo Chen, Pangfeng Liu, and Jan-Jan Wu

Jan-Jan Wu

In this paper, we propose an adaptive layer expansion algorithm to reduce the training time of deep neural networks without noticeable loss of accuracy. Neural networks have become deeper and wider to improve accuracy, and the size of such networks makes them time-consuming to train. Our algorithm reduces training time by dynamically adding nodes where necessary, improving training efficiency without losing accuracy. We start with a smaller model that has only a fraction of the parameters of the original model, then train the network and add nodes to specific layers determined by the stability of gradients. The algorithm repeatedly adds nodes until a threshold is reached, and trains the model until the accuracy converges. The experimental results indicate that our algorithm uses only a quarter of the computation time of a full model and achieves 64.1% accuracy on MobileNet with the CIFAR100 dataset, which is only 2% less than the 66.2% accuracy of a full model. The algorithm stops adding nodes when it has only half of the parameters of the original model. As a result, this new model is well suited to fast inference in environments where both computation power and memory storage are very limited, such as mobile devices.

Automated Graph Generation at Sentence Level for Reading Comprehension Based on Conceptual Graphs

The 28th International Conference on Computational Linguistics (COLING), December 2020

Wan-Hsuan Lin and Chun-Shien Lu

Wan-Hsuan Lin Chun-Shien Lu

This paper proposes a novel miscellaneous-context-based method to convert a sentence into a knowledge embedding in the form of a directed graph. We adopt the idea of conceptual graphs to frame the miscellaneous textual information into conceptual compactness. We first empirically observe that this graph representation method can (1) accommodate the slot-filling challenges in typical question answering and (2) access the sentence-level graph structure in order to explicitly capture the neighbouring connections of reference concept nodes. Secondly, we propose a task-agnostic semantics-measured module, which cooperates with the graph representation method, in order to (3) project an edge of a sentence-level graph to the space of semantic relevance with respect to the corresponding concept nodes. In experiments on QA-type relation extraction, the combination of the graph representation and the semantics-measured module achieves high accuracy of answer prediction and offers human-comprehensible graphical interpretation for every well-formed sample. To our knowledge, our approach is the first towards an interpretable process of learning vocabulary representations with experimental evidence.

Determinizing Crash Behavior with a Verified Snapshot-Consistent Flash Translation Layer

USENIX Symposium on Operating Systems Design and Implementation (OSDI), November 2020

Yun-Sheng Chang, Yao Hsiao, Tzu-Chi Lin, Che-Wei Tsao, Chun-Feng Wu, Yuan-Hao Chang, Hsiang-Shang Ko, and Yu-Fang Chen

Yun-Sheng Chang Che-Wei Tsao Chun-Feng Wu Yuan-Hao Chang Hsiang-Shang Ko Yu-Fang Chen

We introduce the design of a snapshot-consistent flash translation layer (SCFTL) for flash disks, which has a stronger guarantee about the possible behaviors after a crash than conventional designs. More specifically, the flush operation of SCFTL also has the functionality of making a “disk snapshot.” When a crash occurs, the flash disk is guaranteed to recover to the state right before the last flush. The major benefit of SCFTL is that it allows a more efficient design of upper layers in the storage stack. For example, the file system hosted by SCFTL does not require the use of a journal for crash recovery. Instead, it only needs to perform a flush operation of SCFTL at the end of each atomic transaction. We use a two-layer approach, combining a proof assistant, a symbolic executor, and an SMT solver, to formally verify the correctness of our prototype SCFTL implementation. We optimize the xv6 file system by utilizing SCFTL’s stronger crash guarantee. Evaluation results show that the optimized xv6 is 3 to 30 times faster than the original version.
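The crash guarantee described above can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions: the class `SnapshotDisk` and its method names are hypothetical and are not the SCFTL interface, and "the state right before the last flush" is interpreted here as the disk contents at the moment the last flush was issued.

```python
class SnapshotDisk:
    """Toy model of snapshot-consistent crash semantics (hypothetical API)."""

    def __init__(self):
        self._snapshot = {}  # contents as of the last flush
        self._working = {}   # contents including unflushed writes

    def write(self, lba, data):
        self._working[lba] = data

    def read(self, lba):
        return self._working.get(lba)

    def flush(self):
        # Flushing doubles as taking a disk snapshot.
        self._snapshot = dict(self._working)

    def crash_and_recover(self):
        # Every write issued since the last flush is discarded together.
        self._working = dict(self._snapshot)


disk = SnapshotDisk()
disk.write(0, b"inode v1")
disk.write(1, b"data v1")
disk.flush()                 # end of atomic transaction 1
disk.write(0, b"inode v2")   # transaction 2 begins ...
disk.crash_and_recover()     # ... but a crash happens before its flush
```

After the simulated crash, both blocks hold the contents of transaction 1, so an upper layer that flushes at the end of each transaction never observes a half-applied transaction. This is the property that lets a file system drop its journal in this toy setting.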

Cross-batch Reference Learning for Deep Retrieval

IEEE Transactions on Neural Networks and Learning Systems, September 2020

Huei-Fang Yang, Kevin Lin, Ting-Yen Chen, and Chu-Song Chen

Chu-Song Chen

Learning effective representations that exhibit semantic content is crucial to image retrieval applications. Recent advances in deep learning have made significant improvements in performance on a number of visual recognition tasks. Studies have also revealed that visual features extracted from a deep network learned on a large-scale image data set (e.g., ImageNet) for classification are generic and perform well on new recognition tasks in different domains. Nevertheless, when applied to image retrieval, such deep representations do not attain performance as impressive as they do for classification. This is mainly because the deep features are optimized for classification rather than for the desired retrieval task. We introduce the cross-batch reference (CBR), a novel training mechanism that enables the optimization of deep networks with a retrieval criterion. With the CBR, the networks leverage both the samples in a single minibatch and the samples in the others for weight updates, enhancing the stochastic gradient descent (SGD) training by enabling interbatch information passing. This interbatch communication is implemented as a cross-batch retrieval process in which the networks are trained to maximize the mean average precision (mAP), a popular performance measure in retrieval. Maximizing the cross-batch mAP is equivalent to centralizing the samples relevant to each other in the feature space and separating the samples irrelevant to each other. The learned features can discriminate between relevant and irrelevant samples and thus are suitable for retrieval. To circumvent the discrete, nondifferentiable mAP maximization, we derive an approximate, differentiable lower bound that can be easily optimized in deep networks. Furthermore, the mAP loss can be used alone or with a classification loss. Experiments on several data sets demonstrate that our CBR learning provides favorable performance, validating its effectiveness.
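For readers unfamiliar with the retrieval criterion mentioned above, the following minimal sketch computes the standard discrete mAP from ranked relevance lists. It is not the differentiable lower bound derived in the paper; the function names are illustrative, and it assumes every relevant item appears somewhere in the ranked list.

```python
def average_precision(relevance):
    """AP for one query; relevance[i] is 1 if the i-th ranked result is relevant."""
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant position
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings):
    """Mean of the per-query average precisions."""
    return sum(average_precision(r) for r in rankings) / len(rankings)


# Two queries: relevant items ranked first in one, second in the other.
score = mean_average_precision([[1, 0], [0, 1]])
```

Because each term depends on the rank position of a relevant item, the measure changes in discrete jumps as the ranking changes, which is why a differentiable surrogate is needed for gradient-based training.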

EpiMOLAS: An Intuitive Web-based Framework for Genome-wide DNA Methylation Analysis

BMC Genomics, April 2020

Sheng-Yao Su, I-Hsuan Lu, Wen-Chih Cheng, Wei-Chun Chung, Pao-Yang Chen, Jan-Ming Ho, Shu-Hwa Chen, Chung-Yen Lin

Jan-Ming Ho Chung-Yen Lin


DNA methylation is a crucial epigenomic mechanism in various biological processes. Using whole-genome bisulfite sequencing (WGBS) technology, methylated cytosine sites can be revealed at the single nucleotide level. However, the WGBS data analysis process is usually complicated and challenging.


To alleviate the associated difficulties, we integrated the WGBS data processing steps and downstream analysis into a two-phase approach. First, we set up the required tools in Galaxy and developed workflows to calculate the methylation level from raw WGBS data and generate a methylation status summary, the mtable. This computation environment is wrapped into the Docker container image DocMethyl, which allows users to rapidly deploy an executable environment without tedious software installation and library dependency problems. Next, the mtable files were uploaded to the web server EpiMOLAS_web to link with the gene annotation databases that enable rapid data retrieval and analyses.


To our knowledge, the EpiMOLAS framework, consisting of DocMethyl and EpiMOLAS_web, is the first approach to include containerization technology and a web-based system for WGBS data analysis from raw data processing to downstream analysis. EpiMOLAS will help users cope with their WGBS data and also conduct reproducible analyses of publicly available data, thereby gaining insights into the mechanisms underlying complex biological phenomena. The Galaxy Docker image DocMethyl is available at https://hub.docker.com/r/lsbnb/docmethyl/.

EpiMOLAS_web is publicly accessible at http://symbiosis.iis.sinica.edu.tw/epimolas/

Piwi Reduction in the Aged Niche Eliminates Germline Stem Cells via Toll-GSK3 Signaling

Nature Communications, June 2020

Kun-Yang Lin, Wen-Der Wang, Chi-Hung Lin, Elham Rastegari, Yu-Han Su, Yi-Chieh Chang, Yu-Tzu Chang, Yung-Feng Liao, Haiwei Pi, Bo-Yi Yu, Shu-Hwa Chen, Chung-Yen Lin, Mei-Yeh Lu, Tsu-Yi Su, Fei-Yang Tzou, Chih-Chiang Chan, and Hwei-jan Hsu

Chung-Yen Lin

Transposons are known to participate in tissue aging, but their effects on aged stem cells remain unclear. Here, we report that in the Drosophila ovarian germline stem cell (GSC) niche, aging-related reductions in expression of Piwi (a transposon silencer) derepress retrotransposons and cause GSC loss. Suppression of Piwi expression in the young niche mimics the aged niche, causing retrotransposon derepression and coincident activation of Toll-mediated signaling, which promotes Glycogen synthase kinase 3 (GSK3) activity to degrade β-catenin. Disruption of β-catenin-E-cadherin-mediated GSC anchorage then results in GSC loss. Knocking down gypsy (a highly active retrotransposon) or toll, or inhibiting reverse transcription in the piwi-deficient niche, suppresses GSK3 activity and β-catenin degradation, restoring GSC-niche attachment. This retrotransposon-mediated impairment of aged stem cell maintenance may have relevance in many tissues, and could represent a viable therapeutic target for aging-related tissue degeneration.

Temporally Guided Music-to-Body-Movement Generation

ACM International Conference on Multimedia (ACM MM), October 2020

Hsuan-Kai Kao and Li Su

Li Su

This paper presents a neural network model to generate a virtual violinist’s 3-D skeleton movements from music audio. Improving on the conventional recurrent neural network models for generating 2-D skeleton data in previous works, the proposed model incorporates an encoder-decoder architecture, as well as the self-attention mechanism, to model the complicated dynamics in body movement sequences. To facilitate the optimization of the self-attention model, beat tracking is applied to determine effective sizes and boundaries of the training examples. The decoder is accompanied by a refining network and a bowing attack inference mechanism to emphasize the right-hand behavior and bowing attack timing. Both objective and subjective evaluations reveal that the proposed model outperforms the state-of-the-art methods. To the best of our knowledge, this work represents the first attempt to generate 3-D violinists’ body movements considering key features in musical body movement.

DeepPrefetcher: A Deep Learning Framework for Data Prefetching in Flash Storage Devices

ACM/IEEE International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES), September 2020

Gaddisa Olani Ganfure, Chun-Feng Wu, Yuan-Hao Chang, and Wei-Kuan Shih

Gaddisa Olani Ganfure Chun-Feng Wu Yuan-Hao Chang

In today’s data-driven world, application access to storage devices constitutes a high cost of processing a user request. Data prefetching is a technique used to alleviate storage access latency by predicting future data accesses and initiating a data fetch. However, the block access requests received by the storage device show poor spatial locality because most file-related locality is absorbed in the higher layers of the memory hierarchy, including the CPU cache and main memory. Besides, the utilization of multithreading strategies in today’s applications typically leads to interleaved block accesses, which makes detecting an access pattern at the storage level very challenging for the existing prefetching techniques. To this end, we propose and assess DeepPrefetcher, a novel Deep Neural Network inspired context-aware prefetching method that adapts to arbitrary memory access patterns. Under DeepPrefetcher, we capture block access pattern contexts using distributed representation and leverage the Long Short-Term Memory (LSTM) neural architecture for context-aware prediction to improve the effectiveness of data prefetching. Instead of using the logical block address (LBA) value directly, we model the difference between successive access requests, which contains more patterns than the LBA value for modeling. By targeting the access pattern sequence in this manner, DeepPrefetcher can learn the vital context from a long input LBA sequence and learn to predict both previously seen and unseen access patterns. The experimental results reveal that DeepPrefetcher can increase average prefetch accuracy, coverage, and speedup by 21.5%, 19.5%, and 17.2%, respectively, compared with the baseline prefetching strategies. Overall, the proposed prefetching approach performs better than the conventional prefetching strategies studied on all benchmarks, and the results are encouraging.
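The idea of modeling differences between successive requests rather than raw LBA values can be sketched in a few lines. The helper name `lba_deltas` and the sample trace are made up for illustration and are not taken from the paper; the sketch covers only the preprocessing step, not the neural model.

```python
def lba_deltas(lbas):
    """Differences between successive logical block addresses."""
    return [b - a for a, b in zip(lbas, lbas[1:])]


# A hypothetical trace: two sequential runs in distant regions of the disk.
trace = [100, 108, 116, 5000, 5008, 5016]
deltas = lba_deltas(trace)  # [8, 8, 4884, 8, 8]
```

The raw addresses never repeat, but the stride of 8 recurs in both regions, so a sequence model trained on deltas sees a small, repeating vocabulary and can generalize the stride to addresses it has not seen before.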

Index of Cancer-Associated Fibroblasts Is Superior to the Epithelial-Mesenchymal Transition Score in Prognosis Prediction

Cancers, July 2020

Ying-Chieh Ko, Ting-Yu Lai, Shu-Ching Hsu, Fu-Hui Wang, Sheng-Yao Su, Yu-Lian Chen, Min-Lung Tsai, Chung-Chun Wu, Jenn-Ren Hsiao, Jang-Yang Chang, Yi-Mi Wu, Dan R Robinson, Chung-Yen Lin, Su-Fang Lin

Chung-Yen Lin

In many solid tumors, tissue of the mesenchymal subtype is frequently associated with epithelial-mesenchymal transition (EMT), strong stromal infiltration, and poor prognosis. Emerging evidence from tumor ecosystem studies has revealed that the two main components of tumor stroma, namely, infiltrated immune cells and cancer-associated fibroblasts (CAFs), also express certain typical EMT genes and are not distinguishable from intrinsic tumor EMT where bulk tissue is concerned. Transcriptomic analysis of xenograft tissues provides a unique advantage in dissecting genes of tumor (human) or stroma (murine) origins. By transcriptomic analysis of xenograft tissues, we found that oral squamous cell carcinoma (OSCC) tumor cells with a high EMT score, the computed mesenchymal likelihood based on the expression signature of canonical EMT markers, are associated with elevated stromal contents featuring fibronectin 1 (Fn1) and transforming growth factor-β (Tgfβ) axis gene expression. In conjunction with a meta-analysis of these genes in clinical OSCC datasets, we further extracted a four-gene index, comprising FN1, TGFB2, TGFBR2, and TGFBI, as an indicator of CAF abundance. The CAF index is more powerful than the EMT score in predicting survival outcomes, not only for oral cancer but also for The Cancer Genome Atlas (TCGA) pan-cancer cohort comprising 9356 patients from 32 cancer subtypes. Collectively, our results suggest that a further distinction and integration of the EMT score with the CAF index will enhance prognosis prediction, thus paving the way for curative medicine in clinical oncology.

How to Cultivate a Green Decision Tree without Loss of Accuracy?

ACM/IEEE International Symposium on Low Power Electronics and Design (ISLPED), August 2020

Tseng-Yi Chen, Yuan-Hao Chang, Ming-Chang Yang, and Huang-Wei Chen

Tseng-Yi Chen Yuan-Hao Chang Ming-Chang Yang

that has been widely applied to classification and regression problems in the machine learning field. To avoid underfitting, a decision tree algorithm will stop growing its tree model when the model is a fully-grown tree. However, a fully-grown tree will result in an overfitting problem that reduces the accuracy of the decision tree. Facing this dilemma, some post-pruning strategies have been proposed to reduce the model complexity of the fully-grown decision tree. Nevertheless, such a process is very energy-inefficient on a non-volatile-memory-based (NVM-based) system because NVM generally has high writing costs (i.e., energy consumption and I/O latency). In other words, the nodes that will be pruned in the post-pruning process are redundant data. Such unnecessary data will induce high writing energy consumption and long I/O latency on NVM-based architectures, especially for low-power-oriented embedded systems. In order to establish a green decision tree (i.e., a tree model with minimized construction energy consumption), this study rethinks the pruning algorithm and proposes a duo-phase pruning framework, which can significantly decrease the energy consumption on the NVM-based computing system without loss of accuracy.

How to Cut Out Expired Data with Nearly Zero Overhead for Solid-State Drives

ACM/IEEE Design Automation Conference (DAC), July 2020

Wei-Lin Wang, Tseng-Yi Chen, Yuan-Hao Chang, Hsin-Wen Wei, and Wei-Kuan Shih

Wei-Lin Wang Tseng-Yi Chen Yuan-Hao Chang

Modern flash memory suffers from a large performance overhead caused by the garbage collection process. The most effective solution for minimizing garbage collection overhead is to lower the number of live pages in a flash block. However, current garbage collection strategies copy all live pages in a to-be-erased flash block to another flash block even though some live pages will no longer be accessed. This is because present flash translation layer (FTL) designs cannot identify disused data among valid pages. In other words, if written data carries lifetime information, the problem can be resolved. Fortunately, an emerging write technology, known as multi-streamed write technology, can bring additional information (e.g., data lifetime) from the host-side system to the flash memory device. Based on these observations, this work proposes a dual-time referencing FTL (DTR-FTL) design to deal with disused data and minimize the overhead of garbage collection by referring to data lifetime information and block retention time. Moreover, as the DTR-FTL can store written data in an appropriate flash block from the very beginning, flash lifespan is also greatly extended by our proposed design. According to the experimental results, the overhead of live-page copying is significantly reduced and the flash lifespan is substantially prolonged by the DTR-FTL.

DSTL: A Demand-based Shingled Translation Layer for Enabling Adaptive Address Mapping on SMR Drives

ACM Transactions on Embedded Computing Systems (TECS), July 2020

Yi-Jing Chuang, Shuo-Han Chen, Yuan-Hao Chang, Yu-Pei Liang, Hsin-Wen Wei, and Wei-Kuan Shih

Shuo-Han Chen Yuan-Hao Chang

Shingled magnetic recording (SMR) is regarded as a promising technology for resolving the areal density limitation of conventional magnetic recording hard disk drives. Among different types of SMR drives, drive-managed SMR (DM-SMR) requires no changes on the host software and is widely used in today’s consumer market. DM-SMR employs a shingled translation layer (STL) to hide its inherent sequential-write constraint from the host software and emulate the SMR drive as a block device by maintaining logical-to-physical block address mapping entries. However, because most existing STL designs do not simultaneously consider the access pattern and the data update frequency of incoming workloads, those mapping entries maintained within the STL cannot be effectively managed, thus inducing unnecessary performance overhead. To resolve the inefficiency of existing STL designs, this article proposes a demand-based STL (DSTL) that simultaneously considers the access pattern and update frequency of incoming data streams to enhance the access performance of DM-SMR. The proposed design was evaluated by a series of experiments, and the results show that the proposed DSTL can outperform other SMR management approaches by up to 86.69% in terms of read/write performance.

Proteogenomics of non-smoking lung cancer in East Asia delineates molecular signatures of pathogenesis and progression

Cell, July 2020

Yi-Ju Chen, Theodoros I Roumeliotis, Ya-Hsuan Chang, Ching-Tai Chen, Chia-Li Han*, Miao-Hsia Lin, Huei-Wen Chen, Gee-Chen Chang, Yih-Leong Chang, Chen-Tu Wu, Mong-Wei Lin, Min-Shu Hsieh, Yu-Tai Wang, Yet-Ran Chen, Inge Jonassen, Fatemeh Zamanzad Ghavidel, Ze-Shiang Lin, Kuen-Tyng Lin, Ching-Wen Chen, Pei-Yuan Sheu, Chen-Ting Hung, Ke-Chieh Huang, Hao-Chin Yang, Pei-Yi Lin, Ta-Chi Yen, Yi-Wei Lin, Jen-Hung Wang, Lovely Raghav, Chien-Yu Lin, Yan-Si Chen, Pei-Shan Wu, Chi-Ting Lai, Shao-Hsing Weng, Kang-Yi Su, Wei-Hung Chang, Pang-Yan Tsai, Ana I Robles, Henry Rodriguez, Yi-Jing Hsiao, Wen-Hsin Chang, Ting-Yi Sung*, Jin-Shing Chen*, Sung-Liang Yu*, Jyoti S Choudhary*, Hsuan-Yu Chen*, Pan-Chyr Yang*, and Yu-Ju Chen*

Ching-Tai Chen Jen-Hung Wang Ting-Yi Sung

Lung cancer in East Asia is characterized by a high percentage of never-smokers, early onset and predominant EGFR mutations. To illuminate the molecular phenotype of this demographically distinct disease, we performed a deep comprehensive proteogenomic study on a prospectively collected cohort in Taiwan, representing early stage, predominantly female, non-smoking lung adenocarcinoma. Integrated genomic, proteomic, and phosphoproteomic analysis delineated the demographically distinct molecular attributes and hallmarks of tumor progression. Mutational signature analysis revealed age- and gender-related mutagenesis mechanisms, characterized by high prevalence of APOBEC mutational signature in younger females and over-representation of environmental carcinogen-like mutational signatures in older females. A proteomics-informed classification distinguished the clinical characteristics of early stage patients with EGFR mutations. Furthermore, integrated protein network analysis revealed the cellular remodeling underpinning clinical trajectories and nominated candidate biomarkers for patient stratification and therapeutic intervention. This multi-omic molecular architecture may help develop strategies for management of early stage never-smoker lung adenocarcinoma.

MVIN: Learning multiview items for recommendation

The 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2020), July 2020

Chang-You Tai, Meng-Ru Wu, Yun-Wei Chu, Shao-Yu Chu and Lun-Wei Ku

Chang-You Tai Meng-Ru Wu Yun-Wei Chu Lun-Wei Ku

Researchers have begun to utilize heterogeneous knowledge graphs (KGs) as auxiliary information in recommendation systems to mitigate the cold start and sparsity issues. However, utilizing a graph neural network (GNN) to capture information in the KG and further apply it in recommendation systems (RS) is still problematic, as it is unable to see each item’s properties from multiple perspectives. To address these issues, we propose the multi-view item network (MVIN), a GNN-based recommendation model which provides superior recommendations by describing items from a unique mixed view from user and entity angles. MVIN learns item representations from both the user view and the entity view. From the user view, user-oriented modules score and aggregate features to make recommendations from a personalized perspective constructed according to KG entities which incorporates user click information. From the entity view, the mixing layer contrasts layer-wise GCN information to further obtain comprehensive features from internal entity-entity interactions in the KG. We evaluate MVIN on three real-world datasets: MovieLens-1M (ML-1M), LFM-1b 2015 (LFM-1b), and Amazon-Book (AZ-book). Results show that MVIN significantly outperforms state-of-the-art methods on these three datasets. In addition, from user-view cases, we find that MVIN indeed captures entities that attract users. Figures further illustrate that mixing layers in a heterogeneous KG play a vital role in neighborhood information aggregation.

Biomedical Named Entity Recognition and Linking Datasets: Survey and Our Recent Development

Briefings in Bioinformatics, July 2020

Ming-Siang Huang, Po-Ting Lai, Pei-Yen Lin, Yu-Ting You, Richard Tzong-Han Tsai and Wen-Lian Hsu

Wen-Lian Hsu

Natural language processing (NLP) is widely applied in biological domains to retrieve information from publications. Systems to address numerous applications exist, such as biomedical named entity recognition (BNER), named entity normalization (NEN) and protein–protein interaction extraction (PPIE). High-quality datasets can assist the development of robust and reliable systems; however, due to the endless applications and evolving techniques, the annotations of benchmark datasets may become outdated and inappropriate. In this study, we first review commonly used BNER datasets and their potential annotation problems such as inconsistency and low portability. Then, we introduce a revised version of the JNLPBA dataset that solves potential problems in the original and use state-of-the-art named entity recognition systems to evaluate its portability to different kinds of biomedical literature, including protein–protein interaction and biology events. Lastly, we introduce an ensembled biomedical entity dataset (EBED) by extending the revised JNLPBA dataset with PubMed Central full-text paragraphs, figure captions and patent abstracts. This EBED is a multi-task dataset that covers annotations including gene, disease and chemical entities. In total, it contains 85000 entity mentions, 25000 entity mentions with database identifiers and 5000 attribute tags. To demonstrate the usage of the EBED, we review the BNER track from the AI CUP Biomedical Paper Analysis challenge. Availability: The revised JNLPBA dataset is available at https://iasl-btm.iis.sinica.edu.tw/BNER/Content/Revised_JNLPBA.zip. The EBED dataset is available at https://iasl-btm.iis.sinica.edu.tw/BNER/Content/AICUP_EBED_dataset.rar. Contact: Email: thtsai@g.ncu.edu.tw, Tel. 886-3-4227151 ext. 35203, Fax: 886-3-422-2681; Email: hsu@iis.sinica.edu.tw, Tel. 886-2-2788-3799 ext. 2211, Fax: 886-2-2782-4814. Supplementary information: Supplementary data are available at Briefings in Bioinformatics online.

Request Flow Coordination for Growing-Scale Solid-State Drives

IEEE Transactions on Computers (TC), June 2020

Ming-Chang Yang, Yuan-Hao Chang, Tei-Wei Kuo, and Chun-Feng Wu

Ming-Chang Yang Yuan-Hao Chang Chun-Feng Wu

With the emergence of high-density triple-level-cell (TLC) and 3D NAND flash, the access performance and endurance of flash devices are degraded due to the downscaling of flash cells. In addition, we observe that the mismatch between data lifetime requirement and flash block retention capability could further worsen the access performance and endurance. This is because the “lifetime-retention mismatch” could result in massive internal data migrations during garbage collection and data refreshing, and further aggravate the already-worsened access performance and endurance of high-density NAND flash devices. Such an observation motivates us to resolve the lifetime-retention mismatch problem by proposing a “time harmonization strategy”, which coordinates the flash block retention capability with the data lifetime requirement to enhance the performance of flash devices with very limited endurance degradation. Specifically, this study aims to lower the amount of internal data migration caused by garbage collection and data refreshing by storing data with different lifetime requirements in flash blocks with suitable retention capability. The trace-driven evaluation results reveal that the proposed design can effectively reduce the internal data migrations by about 33% on average with nearly no degradation of the overall endurance, as compared with the state-of-the-art designs.

Beyond Address Mapping: A User-Oriented Multi-Regional Space Management Design for 3D NAND Flash Memory

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), June 2020

Shuo-Han Chen, Che-Wei Tsao, and Yuan-Hao Chang

Shuo-Han Chen Che-Wei Tsao Yuan-Hao Chang

Due to the ever-growing demand for larger-capacity flash storage devices, various new manufacturing techniques have been proposed to provide high-density and large-capacity NAND flash devices. Among these new techniques, 3D NAND flash is regarded as one of the most promising candidates for the next-generation flash storage devices. 3D NAND flash brings high bit density and significant cost saving by stacking memory cells vertically. However, the read/write and erase units of 3D NAND flash also grow larger than those of traditional planar flash devices. This growing trend of read/write and erase units for 3D NAND flash imposes significant management difficulties, such as the grown size of mapping information, decreased garbage collection efficiency, and a worsened write amplification issue. To alleviate these negative impacts of the growing read/write and erase units, this paper proposes a multi-regional space management design to achieve subpage-level management while adaptively adjusting mapping granularity by considering user behaviors. The proposed design was evaluated by a series of experiments, and the results show that the access performance can be improved by 64%.

Joint Management of CPU and NVDIMM for Breaking Down the Great Memory Wall

IEEE Transactions on Computers (TC), May 2020

Chun-Feng Wu, Yuan-Hao Chang, Ming-Chang Yang, and Tei-Wei Kuo

Chun-Feng Wu Yuan-Hao Chang Ming-Chang Yang

NVDIMM is a production-ready device that provides larger memory space at lower cost. However, directly using NVDIMM as the main memory would seriously degrade system performance because of the "great memory wall": in NVDIMM, the slow memory (e.g., flash memory) is several orders of magnitude slower than the fast memory (e.g., DRAM). In this paper, we present a joint management framework of host/CPU and NVDIMM to break down the great memory wall by bridging the process information gap between host/CPU and NVDIMM. In this framework, a page semantic-aware strategy is proposed to precisely predict, mark, and relocate data or memory pages to the fast memory in advance by exploiting the process access patterns, so that the frequency of slow memory accesses can be further reduced. The proposed framework with the proposed strategy was evaluated with several well-known benchmarks and the results are encouraging.

YOLOv4: Optimal Speed and Accuracy of Object Detection

arXiv:2004.10934v1, April 2020

Alexey Bochkovskiy, Chien-Yao Wang, and Hong-Yuan Mark Liao

Chien-Yao Wang Hong-Yuan Mark Liao

There are a huge number of features which are said to improve Convolutional Neural Network (CNN) accuracy. Practical testing of combinations of such features on large datasets, and theoretical justification of the result, is required. Some features operate on certain models exclusively and for certain problems exclusively, or only for small-scale datasets; while some features, such as batch-normalization and residual-connections, are applicable to the majority of models, tasks, and datasets. We assume that such universal features include Weighted-Residual-Connections (WRC), Cross-Stage-Partial-connections (CSP), Cross mini-Batch Normalization (CmBN), Self-adversarial-training (SAT) and Mish-activation. We use new features: WRC, CSP, CmBN, SAT, Mish activation, Mosaic data augmentation, CmBN, DropBlock regularization, and CIoU loss, and combine some of them to achieve state-of-the-art results: 43.5% AP (65.7% AP50) for the MS COCO dataset at a realtime speed of ∼65 FPS on Tesla V100. Source code is at https://github.com/AlexeyAB/darknet.
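As a small illustration of one feature named above, the Mish activation is commonly defined as x · tanh(softplus(x)), where softplus(x) = ln(1 + e^x). The sketch below is a scalar, pure-Python version written for this listing; it is not taken from the darknet source, and the function names are illustrative.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable ln(1 + e^x)."""
    # max(x, 0) + ln(1 + e^{-|x|}) avoids overflow for large positive x.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


if __name__ == "__main__":
    for value in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(f"mish({value:+.1f}) = {mish(value):+.6f}")
```

For large positive inputs the function approaches the identity, and for large negative inputs it approaches zero from below, which is the smooth, non-monotonic shape usually associated with Mish.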

MinProtMaxVP: Generating a minimized number of protein variant sequences containing all possible variant peptides for proteogenomic analysis

Journal of Proteomics, July 2020

Wai-Kok Choong, Jen-Hung Wang, Ting-Yi Sung

Wai-Kok Choong Jen-Hung Wang Ting-Yi Sung

Identifying single-amino-acid variants (SAVs) from mass spectrometry-based experiments is critical for validating single-nucleotide variants (SNVs) at the protein level to facilitate biomedical research. Currently, two approaches are usually applied to convert SNV annotations into SAV-harboring protein sequences. One approach generates a separate sequence for each SAV, so that each sequence contains exactly one SAV; the other generates a single sequence containing all SAVs. However, both may neglect the possibility of SAV combinations, e.g., haplotypes, existing in bio-samples. Therefore, it is necessary to consider all SAV combinations of a protein when generating SAV-harboring protein sequences. In this paper, we propose MinProtMaxVP, a novel approach which selects a minimized number of SAV-harboring protein sequences generated from the exhaustive approach, while still accommodating all possible variant peptides, by solving a classic set covering problem. Our study on known haplotype variations of TAS2R38 justifies the necessity for MinProtMaxVP to consider all combinations of SAVs. The performance of MinProtMaxVP is demonstrated by an in silico study on OR2T27 with five SAVs and real experimental data of the HEK293 cell line. Furthermore, a simulation of somatic and germline variants of OR2T27 in tumor and normal tissues demonstrates that, with an appropriate somatic and germline SAV integration strategy, MinProtMaxVP is adaptable to both labeling and label-free mass spectrometry-based experiments.
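The set covering formulation mentioned above can be illustrated with a minimal sketch. The code below uses the standard greedy approximation, which is an assumption made for illustration; the abstract does not say which solver MinProtMaxVP uses. The peptide and sequence names are hypothetical toy data, not results from the paper.

```python
def greedy_set_cover(universe, candidates):
    """Select candidate names until every element of `universe` is covered.

    `candidates` maps a name (e.g., a protein sequence ID) to the set of
    elements (e.g., variant peptides) that candidate contains.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Pick the candidate covering the most still-uncovered elements.
        best = max(candidates, key=lambda name: len(candidates[name] & uncovered))
        gained = candidates[best] & uncovered
        if not gained:
            raise ValueError("some elements are not covered by any candidate")
        chosen.append(best)
        uncovered -= gained
    return chosen


# Hypothetical toy instance: five variant peptides, four candidate sequences.
peptides = {"p1", "p2", "p3", "p4", "p5"}
sequences = {
    "seq_A": {"p1", "p2", "p3"},
    "seq_B": {"p3", "p4"},
    "seq_C": {"p4", "p5"},
    "seq_D": {"p1"},
}
selected = greedy_set_cover(peptides, sequences)
print(selected)
```

On this toy instance two of the four candidate sequences suffice to cover all five peptides. The greedy rule does not guarantee a minimum cover in general, so an exact method such as integer programming would be needed where optimality matters.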