An experienced director usually switches among different types of shots to make visual storytelling more engaging. When filming a musical performance, appropriate shot switching can produce special effects, such as enhancing the expression of emotion or heating up the atmosphere. However, while this visual storytelling technique is often used in professional recordings of a live concert, amateur recordings made by audience members typically lack such storytelling concepts and skills when filming the same event. Thus a versatile system that can perform video mashup to create a refined, high-quality video from such amateur clips is desirable. To this end, we aim to translate the music into an attractive shot (type) sequence by learning the relation between music and the visual storytelling of shots. The resulting shot sequence can then be used to better portray the visual storytelling of a song and to guide the concert video mashup process. To achieve this task, we first introduce a novel probabilistic fusion approach, named multi-resolution fused recurrent neural networks (MF-RNNs) with film-language, which integrates multi-resolution fused RNNs and a film-language model to boost translation performance. We then distill the knowledge in MF-RNNs with film-language into a lightweight RNN, which is more efficient and easier to deploy. The results of objective and subjective experiments demonstrate that both MF-RNNs with film-language and the lightweight RNN can generate attractive shot sequences for music, thereby enhancing the viewing and listening experience.
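The distillation step can be pictured with the standard Hinton-style soft-target loss. The abstract does not state the exact objective used to train the lightweight RNN, so the KL form and the temperature value below are illustrative assumptions, not the authors' formulation:

```python
import math

def softmax(logits, T=1.0):
    """Temperature-softened softmax over a list of logits."""
    exps = [math.exp(l / T) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distill_loss(student_logits, teacher_logits, T=2.0):
    """KL divergence between temperature-softened teacher and student
    outputs -- the generic knowledge-distillation term. Shown as a
    sketch; the paper's exact distillation objective is not given in
    the abstract."""
    p = softmax(teacher_logits, T)  # teacher's soft targets
    q = softmax(student_logits, T)  # student's soft predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

The loss is zero when the student reproduces the teacher's soft output and grows as their shot-type distributions diverge.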
In this paper, we propose an adaptive layer expansion algorithm to reduce the training time of deep neural networks without noticeable loss of accuracy. Neural networks have become deeper and wider to improve accuracy, and the size of such networks makes them time-consuming to train. Hence, we propose an adaptive layer expansion algorithm that reduces training time by dynamically adding nodes where necessary, improving training efficiency without losing accuracy. We start with a smaller model holding only a fraction of the parameters of the original model, then train the network and add nodes to specific layers determined by the stability of their gradients. The algorithm repeatedly adds nodes until a threshold is reached, and trains the model until the accuracy converges. The experimental results indicate that our algorithm uses only a quarter of the computation time of a full model, and achieves 64.1% accuracy with MobileNet on CIFAR-100, which is only 2% less than the 66.2% accuracy of the full model. The algorithm stops adding nodes when the model has only half of the parameters of the original model. As a result, the new model is well suited for fast inference in environments where both computation power and memory storage are very limited, such as mobile devices.
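The expansion loop described above can be sketched as follows. The coefficient-of-variation stability test, the growth factor, and the use of node counts as a stand-in for parameter counts are all assumptions for illustration, not the paper's exact criterion:

```python
import statistics

def layers_to_expand(grad_history, stability_thresh=0.05):
    """Pick layers whose recent gradients are still unstable.

    grad_history: dict mapping layer name -> list of recent mean
    absolute gradients (one entry per training step). A high
    coefficient of variation is read here as "unstable", i.e. the
    layer is still under-parameterized and worth expanding."""
    unstable = []
    for layer, grads in grad_history.items():
        mean = statistics.mean(grads)
        if mean == 0:
            continue
        cv = statistics.stdev(grads) / mean  # coefficient of variation
        if cv > stability_thresh:
            unstable.append(layer)
    return unstable

def expand(widths, unstable, grow=1.25, cap_params=0.5, full_params=1_000_000):
    """Grow each unstable layer by `grow`x, stopping once the budget
    (a fraction `cap_params` of the full model) would be exceeded.
    Node counts stand in for parameter counts in this sketch."""
    for layer in unstable:
        proposed = dict(widths)
        proposed[layer] = int(widths[layer] * grow)
        if sum(proposed.values()) <= cap_params * full_params:
            widths = proposed
    return widths
```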
We introduce the design of a snapshot-consistent flash translation layer (SCFTL) for flash disks, which has a stronger guarantee about the possible behaviors after a crash than conventional designs. More specifically, the flush operation of SCFTL also has the functionality of making a “disk snapshot.” When a crash occurs, the flash disk is guaranteed to recover to the state right before the last flush. The major benefit of SCFTL is that it allows a more efficient design of upper layers in the storage stack. For example, the file system hosted by SCFTL does not require the use of a journal for crash recovery. Instead, it only needs to perform a flush operation of SCFTL at the end of each atomic transaction. We use a two-layer approach, combining a proof assistant, a symbolic executor, and an SMT solver, to formally verify the correctness of our prototype SCFTL implementation. We optimize the xv6 file system by utilizing SCFTL’s stronger crash guarantee. Evaluation results show that the optimized xv6 is 3 to 30 times faster than the original version.
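The flush-as-snapshot contract can be illustrated with a toy in-memory model. This is only a behavioral sketch of the crash guarantee, not the verified page-level SCFTL implementation:

```python
class SnapshotFTL:
    """Toy model of a snapshot-consistent FTL: writes land in a
    staging area and only become durable at flush(), which acts as
    a disk snapshot. A crash rolls the device back to the state at
    the last flush."""

    def __init__(self):
        self.durable = {}   # state as of the last flush
        self.staged = {}    # writes since the last flush

    def write(self, block, data):
        self.staged[block] = data

    def flush(self):
        # Atomic commit: everything staged joins the snapshot.
        self.durable.update(self.staged)
        self.staged.clear()

    def crash(self):
        # Recovery guarantee: state reverts to the last flush.
        self.staged.clear()

    def read(self, block):
        return self.staged.get(block, self.durable.get(block))
```

A file system hosted on such a device ends each atomic transaction with `flush()` instead of writing a journal, which is exactly the optimization applied to xv6 above.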
Learning effective representations that exhibit semantic content is crucial to image retrieval applications. Recent advances in deep learning have brought significant improvements in performance on a number of visual recognition tasks. Studies have also revealed that visual features extracted from a deep network learned on a large-scale image data set (e.g., ImageNet) for classification are generic and perform well on new recognition tasks in different domains. Nevertheless, when applied to image retrieval, such deep representations do not attain performance as impressive as that achieved in classification. This is mainly because the deep features are optimized for classification rather than for the desired retrieval task. We introduce the cross-batch reference (CBR), a novel training mechanism that enables the optimization of deep networks with a retrieval criterion. With the CBR, the networks leverage both the samples in a single minibatch and the samples in the others for weight updates, enhancing stochastic gradient descent (SGD) training by enabling interbatch information passing. This interbatch communication is implemented as a cross-batch retrieval process in which the networks are trained to maximize the mean average precision (mAP), a popular performance measure in retrieval. Maximizing the cross-batch mAP is equivalent to centralizing the samples relevant to each other in the feature space and separating the samples irrelevant to each other. The learned features can discriminate between relevant and irrelevant samples and thus are suitable for retrieval. To circumvent the discrete, nondifferentiable mAP maximization, we derive an approximate, differentiable lower bound that can be easily optimized in deep networks. Furthermore, the mAP loss can be used alone or with a classification loss. Experiments on several data sets demonstrate that our CBR learning provides favorable performance, validating its effectiveness.
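The idea behind differentiable mAP optimization, replacing the hard rank indicator with a smooth surrogate, can be sketched for a single query as follows. The sigmoid form and temperature are illustrative; they are not the paper's exact lower bound:

```python
import math

def sigmoid(x, tau=0.1):
    """Smooth step function; tau controls how closely it
    approximates the hard indicator 1[x > 0]."""
    return 1.0 / (1.0 + math.exp(-x / tau))

def smooth_ap(scores, relevant):
    """Smoothed average precision for one query.

    scores:   list of similarity scores, one per candidate
    relevant: parallel list of 0/1 relevance labels
    The hard rank indicator 1[s_j > s_i] is replaced by a sigmoid,
    making the quantity differentiable in the scores -- the same
    trick a differentiable mAP bound relies on (the exact form here
    is a simplification, not the authors' derivation)."""
    ap, n_rel = 0.0, sum(relevant)
    for i, (s_i, r_i) in enumerate(zip(scores, relevant)):
        if not r_i:
            continue
        # soft rank of item i among all items / among relevant items
        rank_all = 1.0 + sum(sigmoid(s_j - s_i)
                             for j, s_j in enumerate(scores) if j != i)
        rank_rel = 1.0 + sum(sigmoid(s_j - s_i)
                             for j, s_j in enumerate(scores)
                             if j != i and relevant[j])
        ap += rank_rel / rank_all
    return ap / n_rel if n_rel else 0.0
```

Raising the score of relevant items relative to irrelevant ones increases this quantity continuously, which is what lets SGD optimize it.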
DNA methylation is a crucial epigenomic mechanism in various biological processes. Using whole-genome bisulfite sequencing (WGBS) technology, methylated cytosine sites can be revealed at the single nucleotide level. However, the WGBS data analysis process is usually complicated and challenging.
To alleviate the associated difficulties, we integrated the WGBS data processing steps and downstream analysis into a two-phase approach. First, we set up the required tools in Galaxy and developed workflows to calculate the methylation level from raw WGBS data and generate a methylation status summary, the mtable. This computation environment is wrapped into the Docker container image DocMethyl, which allows users to rapidly deploy an executable environment without tedious software installation and library dependency problems. Next, the mtable files were uploaded to the web server EpiMOLAS_web to link with the gene annotation databases that enable rapid data retrieval and analyses.
To our knowledge, the EpiMOLAS framework, consisting of DocMethyl and EpiMOLAS_web, is the first approach to include containerization technology and a web-based system for WGBS data analysis from raw data processing to downstream analysis. EpiMOLAS will help users cope with their WGBS data and also conduct reproducible analyses of publicly available data, thereby gaining insights into the mechanisms underlying complex biological phenomena. The Galaxy Docker image DocMethyl is available at https://hub.docker.com/r/lsbnb/docmethyl/.
EpiMOLAS_web is publicly accessible at http://symbiosis.iis.sinica.edu.tw/epimolas/
Transposons are known to participate in tissue aging, but their effects on aged stem cells remain unclear. Here, we report that in the Drosophila ovarian germline stem cell (GSC) niche, aging-related reductions in expression of Piwi (a transposon silencer) derepress retrotransposons and cause GSC loss. Suppression of Piwi expression in the young niche mimics the aged niche, causing retrotransposon derepression and coincident activation of Toll-mediated signaling, which promotes Glycogen synthase kinase 3 (GSK3) activity to degrade β-catenin. Disruption of β-catenin-E-cadherin-mediated GSC anchorage then results in GSC loss. Knocking down gypsy (a highly active retrotransposon) or toll, or inhibiting reverse transcription in the piwi-deficient niche, suppresses GSK3 activity and β-catenin degradation, restoring GSC-niche attachment. This retrotransposon-mediated impairment of aged stem cell maintenance may have relevance in many tissues, and could represent a viable therapeutic target for aging-related tissue degeneration.
This paper presents a neural network model to generate a virtual violinist's 3-D skeleton movements from music audio. Improving on the conventional recurrent neural network models used for generating 2-D skeleton data in previous works, the proposed model incorporates an encoder-decoder architecture, as well as the self-attention mechanism, to model the complicated dynamics in body movement sequences. To facilitate the optimization of the self-attention model, beat tracking is applied to determine effective sizes and boundaries of the training examples. The decoder is accompanied by a refining network and a bowing attack inference mechanism to emphasize the right-hand behavior and bowing attack timing. Both objective and subjective evaluations reveal that the proposed model outperforms the state-of-the-art methods. To the best of our knowledge, this work represents the first attempt to generate 3-D violinists' body movements considering key features in musical body movement.
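Using beat boundaries to cut a long movement sequence into training examples can be sketched as follows; the frame-list and beat-index representation is assumed for illustration:

```python
def split_at_beats(frames, beat_idx):
    """Cut a frame sequence into training examples at beat
    boundaries, so each example spans whole musical beats.

    frames:   sequence of per-frame features (any objects)
    beat_idx: sorted frame indices where beats were detected
    """
    segments, prev = [], 0
    for b in beat_idx:
        segments.append(frames[prev:b])
        prev = b
    segments.append(frames[prev:])       # tail after the last beat
    return [s for s in segments if s]    # drop empty segments
```

Beat-aligned segments give the self-attention model examples whose boundaries coincide with musically meaningful units.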
In today’s data-driven world, an application’s access to the storage device constitutes a high cost of processing a user request. Data prefetching is a technique that alleviates storage access latency by predicting future data accesses and initiating data fetches ahead of time. However, the block access requests received by the storage device show poor spatial locality because most file-related locality is absorbed in the higher layers of the memory hierarchy, including the CPU cache and main memory. Besides, the use of multithreading in today’s applications typically leads to interleaved block accesses, which makes detecting an access pattern at the storage level very challenging for existing prefetching techniques. Towards this, we propose and assess DeepPrefetcher, a novel deep-neural-network-inspired context-aware prefetching method that adapts to arbitrary memory access patterns. Under DeepPrefetcher, we capture block access pattern contexts using distributed representations and leverage a Long Short-Term Memory (LSTM) neural architecture for context-aware prediction to improve the effectiveness of data prefetching. Instead of using the logical block address (LBA) value directly, we model the difference between successive access requests, which exhibits more patterns than raw LBA values. By targeting the access pattern sequence in this manner, DeepPrefetcher can learn the vital context from a long input LBA sequence and learn to predict both previously seen and unseen access patterns. The experimental results reveal that DeepPrefetcher increases average prefetch accuracy, coverage, and speedup by 21.5%, 19.5%, and 17.2%, respectively, compared with the baseline prefetching strategies. Overall, the proposed approach performs better than the conventional prefetching schemes studied on all benchmarks, and the results are encouraging.
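The delta-modeling idea can be sketched without the neural model. Below, a simple frequency table over delta contexts stands in for the paper's LSTM, purely to illustrate why successive-LBA differences expose strided patterns that raw addresses hide:

```python
from collections import Counter

def to_deltas(lbas):
    """Differences between successive LBAs: a strided scan over any
    base address becomes the same repeating delta sequence."""
    return [b - a for a, b in zip(lbas, lbas[1:])]

def predict_next(lbas, context=2):
    """Stand-in for the paper's LSTM: predict the next LBA as the
    most frequent delta that historically followed the current
    delta context. (A frequency table replaces the neural model
    here for illustration only.)"""
    deltas = to_deltas(lbas)
    ctx = tuple(deltas[-context:])
    followers = Counter(
        deltas[i + context]
        for i in range(len(deltas) - context)
        if tuple(deltas[i:i + context]) == ctx
    )
    if not followers:
        return None
    return lbas[-1] + followers.most_common(1)[0][0]
```

Note that the interleaved pattern 0, 8, 16, 100, 108, 116, ... has no single stride in address space, yet its delta sequence (8, 8, 84, ...) repeats and is predictable.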
In many solid tumors, tissue of the mesenchymal subtype is frequently associated with epithelial-mesenchymal transition (EMT), strong stromal infiltration, and poor prognosis. Emerging evidence from tumor ecosystem studies has revealed that the two main components of tumor stroma, namely, infiltrated immune cells and cancer-associated fibroblasts (CAFs), also express certain typical EMT genes and are not distinguishable from intrinsic tumor EMT where bulk tissue is concerned. Transcriptomic analysis of xenograft tissues provides a unique advantage in dissecting genes of tumor (human) or stroma (murine) origin. By transcriptomic analysis of xenograft tissues, we found that oral squamous cell carcinoma (OSCC) tumor cells with a high EMT score, the computed mesenchymal likelihood based on the expression signature of canonical EMT markers, are associated with elevated stromal content featuring fibronectin 1 (Fn1) and transforming growth factor-β (Tgfβ) axis gene expression. In conjunction with a meta-analysis of these genes in clinical OSCC datasets, we further extracted a four-gene index, comprising FN1, TGFB2, TGFBR2, and TGFBI, as an indicator of CAF abundance. The CAF index is more powerful than the EMT score in predicting survival outcomes, not only for oral cancer but also for The Cancer Genome Atlas (TCGA) pan-cancer cohort comprising 9356 patients from 32 cancer subtypes. Collectively, our results suggest that a further distinction and integration of the EMT score with the CAF index will enhance prognosis prediction, thus paving the way for curative medicine in clinical oncology.
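A plausible way to turn the four-gene signature into a per-sample score is a mean z-score, shown below. The abstract does not give the published formula, so this aggregation is an assumption for illustration only:

```python
import statistics

CAF_GENES = ["FN1", "TGFB2", "TGFBR2", "TGFBI"]

def caf_index(expression, cohort):
    """Score one sample by averaging the z-scores of the four
    CAF-index genes against a cohort.

    expression: dict gene -> expression value for one sample
    cohort:     list of such dicts used to estimate mean/sd
    The mean-z-score combination is an assumed, illustrative
    aggregation, not the paper's formula."""
    zs = []
    for gene in CAF_GENES:
        values = [sample[gene] for sample in cohort]
        mu, sd = statistics.mean(values), statistics.stdev(values)
        zs.append((expression[gene] - mu) / sd if sd else 0.0)
    return statistics.mean(zs)
```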
The decision tree is a classic model that has been widely applied to classification and regression problems in the machine learning field. To avoid underfitting, a decision tree algorithm stops growing its tree model only when the tree is fully grown. However, a fully-grown tree often results in overfitting, reducing the accuracy of the decision tree. In this dilemma, some post-pruning strategies have been proposed to reduce the model complexity of the fully-grown decision tree. Nevertheless, such a process is very energy-inefficient on a non-volatile-memory-based (NVM-based) system because NVM generally has high write costs (i.e., energy consumption and I/O latency). In other words, the nodes that will be pruned in the post-pruning process are redundant data. Such unnecessary data induce high write energy consumption and long I/O latency on NVM-based architectures, especially for low-power-oriented embedded systems. In order to establish a green decision tree (i.e., a tree model with minimized construction energy consumption), this study rethinks the pruning process and proposes the duo-phase pruning framework, which can significantly decrease the energy consumption of the NVM-based computing system without loss of accuracy.
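The energy argument can be made concrete with back-of-the-envelope arithmetic. The per-write cost and the idea of skipping writes for to-be-pruned nodes are illustrative stand-ins, not the duo-phase framework's actual mechanism:

```python
def pruning_write_savings(total_nodes, pruned_nodes, write_nj=100):
    """NVM write-energy saved by never materializing to-be-pruned
    nodes. Conventional post-pruning writes every node of the
    fully-grown tree and only then discards the pruned ones, so
    each pruned node still paid its write cost; a pruning-aware
    build writes only surviving nodes. write_nj is a made-up
    per-node write cost (nanojoules) for illustration."""
    post_pruning = total_nodes * write_nj                    # write all, prune later
    pruning_aware = (total_nodes - pruned_nodes) * write_nj  # never write pruned nodes
    return post_pruning - pruning_aware
```

The saving scales linearly with the number of pruned nodes, which is why avoiding redundant NVM writes matters for low-power embedded systems.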
Modern flash memory encounters a huge performance overhead caused by the garbage collection process. The most effective way to minimize garbage collection overhead is to lower the number of live pages in a flash block. However, current garbage collection strategies copy all live pages in a to-be-erased flash block to another flash block even though some live pages will no longer be accessed. This is because present flash translation layer (FTL) designs cannot distinguish disused data from valid pages. In other words, if written data carried lifetime information, the problem could be resolved. Fortunately, an emerging write technology, known as multi-streamed write technology, can bring additional information (e.g., data lifetime) from the host-side system to the flash memory device. Motivated by these observations, this work proposes a dual-time referencing FTL (DTR-FTL) design to deal with disused data and minimize the overhead of garbage collection by referring to data lifetime information and block retention time. Moreover, as DTR-FTL can store written data to an appropriate flash block from the very beginning, the flash lifespan is also considerably extended by our proposed design. According to the experimental results, the overhead of live-page copying is significantly reduced and the flash lifespan is substantially prolonged by DTR-FTL.
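The benefit of lifetime hints can be sketched as a stream-assignment rule: data expected to die together is steered into the same flash block, so whole blocks invalidate at once and garbage collection copies fewer live pages. The boundaries below are placeholders, not DTR-FTL's actual policy:

```python
def assign_stream(lifetime_hint, boundaries=(60, 3600)):
    """Map a host-supplied data-lifetime hint (seconds) to a write
    stream so that similarly-lived data shares flash blocks.
    Stream 0: short-lived, 1: medium, 2: long-lived. The two
    boundaries are illustrative placeholders."""
    for stream, bound in enumerate(boundaries):
        if lifetime_hint < bound:
            return stream
    return len(boundaries)

def live_page_copies(blocks):
    """Pages that must be copied when erasing each block; with good
    lifetime separation whole blocks die together and this count
    tends toward zero."""
    return sum(block.count("live") for block in blocks)
```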
Shingled magnetic recording (SMR) is regarded as a promising technology for resolving the areal density limitation of conventional magnetic recording hard disk drives. Among the different types of SMR drives, drive-managed SMR (DM-SMR) requires no changes to the host software and is widely used in today’s consumer market. DM-SMR employs a shingled translation layer (STL) to hide its inherent sequential-write constraint from the host software and emulate the SMR drive as a block device by maintaining logical-to-physical block address mapping entries. However, because most existing STL designs do not simultaneously consider the access pattern and the data update frequency of incoming workloads, the mapping entries maintained within the STL cannot be effectively managed, thus inducing unnecessary performance overhead. To resolve the inefficiency of existing STL designs, this article proposes a demand-based STL (DSTL) that simultaneously considers the access pattern and update frequency of incoming data streams to enhance the access performance of DM-SMR. The proposed design was evaluated by a series of experiments, and the results show that the proposed DSTL can outperform other SMR management approaches by up to 86.69% in terms of read/write performance.
Lung cancer in East Asia is characterized by a high percentage of never-smokers, early onset, and predominant EGFR mutations. To illuminate the molecular phenotype of this demographically distinct disease, we performed a deep, comprehensive proteogenomic study on a prospectively collected cohort in Taiwan, representing early-stage, predominantly female, non-smoking lung adenocarcinoma. Integrated genomic, proteomic, and phosphoproteomic analysis delineated the demographically distinct molecular attributes and hallmarks of tumor progression. Mutational signature analysis revealed age- and gender-related mutagenesis mechanisms, characterized by a high prevalence of the APOBEC mutational signature in younger females and over-representation of environmental carcinogen-like mutational signatures in older females. A proteomics-informed classification distinguished the clinical characteristics of early-stage patients with EGFR mutations. Furthermore, integrated protein network analysis revealed the cellular remodeling underpinning clinical trajectories and nominated candidate biomarkers for patient stratification and therapeutic intervention. This multi-omic molecular architecture may help develop strategies for management of early-stage never-smoker lung adenocarcinoma.
Researchers have begun to utilize heterogeneous knowledge graphs (KGs) as auxiliary information in recommendation systems (RSs) to mitigate the cold start and sparsity issues. However, utilizing a graph neural network (GNN) to capture information in a KG and further applying it in an RS is still problematic, as the GNN is unable to see each item’s properties from multiple perspectives. To address these issues, we propose the multi-view item network (MVIN), a GNN-based recommendation model that provides superior recommendations by describing items from a unique mixed view combining user and entity angles. MVIN learns item representations from both the user view and the entity view. From the user view, user-oriented modules score and aggregate features to make recommendations from a personalized perspective constructed according to KG entities, incorporating user click information. From the entity view, the mixing layer contrasts layer-wise GCN information to further obtain comprehensive features from internal entity-entity interactions in the KG. We evaluate MVIN on three real-world datasets: MovieLens-1M (ML-1M), LFM-1b 2015 (LFM-1b), and Amazon-Book (AZ-book). Results show that MVIN significantly outperforms state-of-the-art methods on these three datasets. In addition, from user-view cases, we find that MVIN indeed captures entities that attract users. Figures further illustrate that mixing layers in a heterogeneous KG play a vital role in neighborhood information aggregation.
Natural language processing (NLP) is widely applied in biological domains to retrieve information from publications. Systems to address numerous applications exist, such as biomedical named entity recognition (BNER), named entity normalization (NEN) and protein–protein interaction extraction (PPIE). High-quality datasets can assist the development of robust and reliable systems; however, due to the endless applications and evolving techniques, the annotations of benchmark datasets may become outdated and inappropriate. In this study, we first review commonly used BNER datasets and their potential annotation problems, such as inconsistency and low portability. Then, we introduce a revised version of the JNLPBA dataset that solves potential problems in the original and use state-of-the-art named entity recognition systems to evaluate its portability to different kinds of biomedical literature, including protein–protein interaction and biology events. Lastly, we introduce an ensembled biomedical entity dataset (EBED) by extending the revised JNLPBA dataset with PubMed Central full-text paragraphs, figure captions and patent abstracts. The EBED is a multi-task dataset that covers annotations including gene, disease and chemical entities. In total, it contains 85,000 entity mentions, 25,000 entity mentions with database identifiers and 5,000 attribute tags. To demonstrate the usage of the EBED, we review the BNER track from the AI CUP Biomedical Paper Analysis challenge. Availability: The revised JNLPBA dataset is available at https://iasl-btm.iis.sinica.edu.tw/BNER/Content/Revised_JNLPBA.zip. The EBED dataset is available at https://iasl-btm.iis.sinica.edu.tw/BNER/Content/AICUP_EBED_dataset.rar. Contact: Email: email@example.com, Tel. 886-3-4227151 ext. 35203, Fax: 886-3-422-2681 Email: firstname.lastname@example.org, Tel. 886-2-2788-3799 ext. 2211, Fax: 886-2-2782-4814. Supplementary information: Supplementary data are available at Briefings in Bioinformatics online.
With the emergence of high-density triple-level-cell (TLC) and 3D NAND flash, the access performance and endurance of flash devices are degraded due to the downscaling of flash cells. In addition, we observe that the mismatch between the data lifetime requirement and the flash block retention capability can further worsen the access performance and endurance. This is because the "lifetime-retention mismatch" can result in massive internal data migrations during garbage collection and data refreshing, further aggravating the already-worsened access performance and endurance of high-density NAND flash devices. This observation motivates us to resolve the lifetime-retention mismatch problem by proposing a "time harmonization strategy", which coordinates the flash block retention capability with the data lifetime requirement to enhance the performance of flash devices with very limited endurance degradation. Specifically, this study aims to lower the amount of internal data migration caused by garbage collection and data refreshing by storing data of different lifetime requirements in flash blocks with suitable retention capability. The trace-driven evaluation results reveal that the proposed design can effectively reduce internal data migrations by about 33% on average with nearly no degradation of the overall endurance, compared with state-of-the-art designs.
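The coordination rule can be sketched as a best-fit match between a datum's lifetime requirement and each block's remaining retention capability. The tightest-fit selection below is an assumed illustration, not the paper's exact strategy:

```python
def pick_block(lifetime_req, blocks):
    """Time-harmonization idea in miniature: among blocks whose
    retention capability meets the data's lifetime requirement,
    pick the tightest fit so strong blocks are not wasted on
    short-lived data (this selection rule is assumed for
    illustration).

    lifetime_req: how long the data must stay readable (days)
    blocks:       dict block_id -> remaining retention capability (days)
    """
    candidates = {b: r for b, r in blocks.items() if r >= lifetime_req}
    if not candidates:
        # No block can hold the data long enough on its own;
        # fall back to the strongest block and rely on refresh.
        return max(blocks, key=blocks.get)
    return min(candidates, key=candidates.get)
```

Placing long-lived data only in high-retention blocks avoids the refresh-driven migrations that the lifetime-retention mismatch would otherwise cause.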
Due to the ever-growing demand for larger-capacity flash storage devices, various new manufacturing techniques have been proposed to provide high-density, large-capacity NAND flash devices. Among these new techniques, 3D NAND flash is regarded as one of the most promising candidates for next-generation flash storage devices. 3D NAND flash brings high bit density and significant cost savings by stacking memory cells vertically. However, the read/write and erase units of 3D NAND flash also grow larger than those of traditional planar flash devices. This growing trend of read/write and erase units imposes significant management difficulties, such as the growing size of mapping information, decreased garbage collection efficiency, and a worsened write amplification issue. To alleviate these negative impacts of the growing read/write and erase units, this paper proposes a multi-regional space management design to achieve subpage-level management while adaptively adjusting the mapping granularity by considering user behaviors. The proposed design was evaluated by a series of experiments, and the results show that the access performance can be improved by 64%.
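Adaptive mapping granularity can be sketched as a per-region decision driven by observed write sizes. The 50% heuristic below is illustrative, not the paper's rule:

```python
def choose_granularity(write_sizes, page_size=16384, subpage_size=4096):
    """Pick a mapping granularity for a region from its observed
    write sizes: mostly full-page writes keep cheap page-level
    entries, while mostly small writes justify the extra mapping
    entries of subpage-level management. The majority-vote
    threshold and the unit sizes are illustrative assumptions."""
    small_writes = sum(1 for s in write_sizes if s < page_size)
    if small_writes > len(write_sizes) / 2:
        return subpage_size   # fine-grained mapping for this region
    return page_size          # coarse mapping keeps the table small
```

Regions with sequential, full-page traffic thus keep a compact mapping table, while random-write regions get subpage entries that reduce write amplification.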
NVDIMM is a production-ready device for providing larger memory space at lower cost. However, directly placing NVDIMM as the main memory would seriously degrade system performance because of the "great memory wall", caused by the fact that in NVDIMM, the slow memory (e.g., flash memory) is several orders of magnitude slower than the fast memory (e.g., DRAM). In this paper, we present a joint management framework of the host/CPU and NVDIMM to break down the great memory wall by bridging the process information gap between the host/CPU and NVDIMM. In this framework, a page semantic-aware strategy is proposed to precisely predict, mark, and relocate data or memory pages to the fast memory in advance by exploiting process access patterns, so that the frequency of slow memory accesses can be further reduced. The proposed framework with the proposed strategy was evaluated with several well-known benchmarks, and the results are encouraging.
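The relocation step can be sketched as promoting the hottest slow-memory pages into DRAM. The real framework predicts from process semantics before the accesses happen, so the frequency ranking below is a simplified stand-in:

```python
from collections import Counter

def pages_to_promote(access_log, dram_slots):
    """Rank slow-memory pages by how often they were touched and
    promote the top ones into the fast memory (DRAM).

    access_log: sequence of page ids observed in the access stream
    dram_slots: how many pages the fast memory can hold
    Frequency ranking is a simplified stand-in for the paper's
    semantic-aware prediction."""
    freq = Counter(access_log)
    return [page for page, _ in freq.most_common(dram_slots)]
```

Every promoted page turns future slow-memory accesses into DRAM hits, which is exactly how the framework chips away at the great memory wall.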
There are a huge number of features which are said to improve Convolutional Neural Network (CNN) accuracy. Practical testing of combinations of such features on large datasets, and theoretical justification of the result, is required. Some features operate on certain models exclusively and for certain problems exclusively, or only for small-scale datasets; while some features, such as batch-normalization and residual-connections, are applicable to the majority of models, tasks, and datasets. We assume that such universal features include Weighted-Residual-Connections (WRC), Cross-Stage-Partial-connections (CSP), Cross mini-Batch Normalization (CmBN), Self-adversarial-training (SAT) and Mish-activation. We use new features: WRC, CSP, CmBN, SAT, Mish activation, Mosaic data augmentation, CmBN, DropBlock regularization, and CIoU loss, and combine some of them to achieve state-of-the-art results: 43.5% AP (65.7% AP50) for the MS COCO dataset at a realtime speed of ∼65 FPS on Tesla V100. Source code is at https://github.com/AlexeyAB/darknet.
Identifying single-amino-acid variants (SAVs) from mass spectrometry-based experiments is critical for validating single-nucleotide variants (SNVs) at the protein level to facilitate biomedical research. Currently, two approaches are usually applied to convert SNV annotations into SAV-harboring protein sequences: one generates a sequence containing exactly one SAV, and the other a sequence containing all SAVs. However, both may neglect the possibility of SAV combinations, e.g., haplotypes, existing in bio-samples. Therefore, it is necessary to consider all SAV combinations of a protein when generating SAV-harboring protein sequences. In this paper, we propose MinProtMaxVP, a novel approach that selects a minimized number of the SAV-harboring protein sequences generated by the exhaustive approach, while still accommodating all possible variant peptides, by solving a classic set covering problem. Our study on known haplotype variations of TAS2R38 justifies the necessity for MinProtMaxVP to consider all combinations of SAVs. The performance of MinProtMaxVP is demonstrated by an in silico study on OR2T27 with five SAVs and real experimental data of the HEK293 cell line. Furthermore, an analysis assuming simulated somatic and germline variants of OR2T27 in tumor and normal tissues demonstrates that, when adopting the appropriate somatic and germline SAV integration strategy, MinProtMaxVP is adaptable to labeling and label-free mass spectrometry-based experiments.
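The set-covering formulation can be illustrated with the classic greedy approximation; the paper's own solver may differ, and greedy is shown here only to make the covering objective concrete:

```python
def min_protein_set(protein_to_peptides):
    """Greedy set cover: pick few SAV-harboring protein sequences
    whose variant peptides jointly cover every possible variant
    peptide. The paper states the problem as classic set covering;
    the greedy approximation (rather than an exact solver) is an
    illustrative choice here.

    protein_to_peptides: dict sequence id -> set of variant peptides
    """
    universe = set().union(*protein_to_peptides.values())
    chosen, covered = [], set()
    while covered != universe:
        # Sequence covering the most still-uncovered peptides.
        best = max(protein_to_peptides,
                   key=lambda p: len(protein_to_peptides[p] - covered))
        chosen.append(best)
        covered |= protein_to_peptides[best]
    return chosen
```

If one combined-SAV sequence already yields every variant peptide of the single-SAV sequences, greedy selects it alone, which is the size reduction MinProtMaxVP aims for.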