Institute of Information Science
Recent Research Results
"UTCNN: a Deep Learning Model of Stance Classification on Social Media Text," the 26th International Conference on Computational Linguistics (COLING 2016), 2016.
Authors: Wei-Fan Chen and Lun-Wei Ku

Abstract:
Most neural network models for document classification on social media focus on text information to the neglect of other information on these platforms. In this paper, we classify post stance on social media channels and develop UTCNN, a neural network model that incorporates user tastes, topic tastes, and user comments on posts. UTCNN not only works on social media texts, but also analyzes texts in forums and message boards. Experiments performed on Chinese Facebook data and English online debate forum data show that UTCNN achieves a 0.755 macro-average f-score for supportive, neutral, and unsupportive stance classes on Facebook data, which is significantly better than models in which either user, topic, or comment information is withheld. This model design greatly mitigates the lack of data for the minor class without the use of oversampling. In addition, UTCNN yields a 0.842 accuracy on English online debate forum data, which also significantly outperforms results from previous work as well as other deep learning models, showing that UTCNN performs well regardless of language or platform.
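The macro-averaged F-score reported above weights the supportive, neutral, and unsupportive classes equally, which is why it is a telling metric when the minor class has little data. A minimal sketch of the computation (the toy labels below are illustrative, not from the paper):

```python
def macro_f1(gold, pred, classes):
    """Macro-averaged F-score: per-class F1, averaged with equal weight,
    so a minority class counts as much as the majority ones."""
    f1s = []
    for c in classes:
        tp = sum(1 for g, p in zip(gold, pred) if g == c and p == c)
        fp = sum(1 for g, p in zip(gold, pred) if g != c and p == c)
        fn = sum(1 for g, p in zip(gold, pred) if g == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return sum(f1s) / len(f1s)

# Toy stance labels: supportive (S), neutral (N), unsupportive (U)
gold = ["S", "S", "N", "U", "U", "N"]
pred = ["S", "N", "N", "U", "S", "N"]
print(round(macro_f1(gold, pred, ["S", "N", "U"]), 3))  # -> 0.656
```

Because each class contributes one third of the score regardless of its size, a model that ignores the minor class is penalized even when its overall accuracy is high.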
"Intelligent Indoor Emergency Evacuation Systems - Reference Architecture and Data Requirements," Proceedings of Future Technologies Conference (FTC 2016), December 2016, December 2016.
Authors: J. W. S. Liu, E. T. H. Chu, F. T. Lin and Z. L. Zhong

Abstract:

An intelligent indoor emergency evacuation system (IES) can take appropriate risk-reduction actions in response to alerts from building safety systems warning of emergencies originating within the building, and from government agencies warning of natural disasters affecting surrounding areas. This paper presents a data model and reference architecture of IES for large public buildings, together with case studies to capture their data requirements. Many technical and practical challenges can raise barriers to the wide deployment of IES; examples of such challenges and solutions to overcome them are also presented.

"Learning to Distill: The Essence Vector Modeling Framework," International Conference on Computational Linguistics (COLING2016), December 2016.
Authors: Kuan-Yu Chen, Shih-Hung Liu, Berlin Chen and Hsin-Min Wang

Abstract:
In the context of natural language processing, representation learning has emerged as a newly active research subject because of its excellent performance in many applications. Learning representations of words is a pioneering study in this school of research. However, paragraph (or sentence and document) embedding learning is more suitable for some tasks, such as sentiment classification and document summarization. Nevertheless, as far as we are aware, relatively little work has focused on the development of unsupervised paragraph embedding methods. Classic paragraph embedding methods infer the representation of a given paragraph by considering all of the words occurring in the paragraph. Consequently, those stop or function words that occur frequently may mislead the embedding learning process to produce a misty paragraph representation. Motivated by these observations, our major contributions in this paper are twofold. First, we propose a novel unsupervised paragraph embedding method, named the essence vector (EV) model, which aims at not only distilling the most representative information from a paragraph but also excluding the general background information to produce a more informative low-dimensional vector representation for the paragraph. We evaluate the proposed EV model on benchmark sentiment classification and multi-document summarization tasks. The experimental results demonstrate the effectiveness and applicability of the proposed embedding method. Second, in view of the increasing importance of spoken content processing, an extension of the EV model, named the denoising essence vector (D-EV) model, is proposed. The D-EV model not only inherits the advantages of the EV model but also can infer a more robust representation for a given spoken paragraph against imperfect speech recognition.
The utility of the D-EV model is evaluated on a spoken document summarization task, confirming the practical merits of the proposed embedding method in relation to several well-practiced and state-of-the-art summarization methods.
"Delegating RAM Computations with Adaptive Soundness and Privacy," Fourteenth IACR Theory of Cryptography Conference - TCC 2016-B, November 2016.
Authors: Prabhanjan Ananth, Yu-Chi Chen, Kai-Min Chung, Huijia Lin and Wei-Kai Lin

Abstract:
We consider the problem of delegating RAM computations over persistent databases: A user wishes to delegate a sequence of computations over a database to a server, where each computation may read and modify the database and the modifications persist between computations. For the efficiency of the server, it is important that computations are modeled as RAM programs, for their runtime may be sub-linear in the size of the database.
Two security needs arise in this context: ensuring integrity, by designing means for the server to compute short proofs that allow the user to efficiently verify the correctness of the server computation, and privacy, by providing means for the user to hide his private databases and programs from a malicious server. In this work, we aim to address both security needs, especially in the stringent, adaptive setting, where the sequence of RAM computations is (potentially) chosen adaptively by a malicious server depending on the messages from an honest user.
To this end, we construct the first RAM delegation scheme achieving both adaptive integrity (a.k.a. soundness) and adaptive privacy, assuming the existence of indistinguishability obfuscation for circuits and a variant of the two-to-one somewhere perfectly binding hash [Okamoto et al. ASIACRYPT’15] (the latter can be based on the decisional Diffie-Hellman assumption). Prior works focused either only on adaptive soundness [Kalai and Paneth, ePrint’15] or on the weaker variant, selective soundness and privacy [Chen et al. ITCS’16, Canetti and Holmgren ITCS’16].
At a high level, our result is obtained by applying a generic “security lifting technique” to the delegation scheme of Chen et al. and its proof of selective soundness and privacy. The security lifting technique formalizes an abstract framework of selective security proofs, and generically “lifts” such proofs into proofs of adaptive security. We believe that this technique can potentially be applied to other cryptographic schemes and is of independent interest.
Current Research Results
"Space-Efficient Index Scheme for PCM-based Multiversion Databases in Cyber-Physical Systems," ACM Transactions on Embedded Computing Systems (TECS), October 2016.
Authors: Yuan-Hung Kuan, Yuan-Hao Chang, Tseng-Yi Chen, Po-Chun Huang, and Kam-Yiu Lam

Abstract:
In this article, we study the indexing problem of using PCM as the storage medium for embedded multiversion databases in cyber-physical systems (CPSs). Although the multiversion B+-tree (MVBT) index has been shown to be efficient in managing multiple versions of data items in a database, MVBT is designed for databases residing in traditional block-oriented storage devices. It can have serious performance problems when the databases are on phase-change memory (PCM). Since the embedded multiversion databases in CPSs may have limited storage space and are update intensive, to resolve MVBT's lack of space efficiency and heavy update cost, we propose a new index scheme, called space-efficient multiversion index (SEMI), to enhance the space utilization and access performance in serving various types of queries. In SEMI, since the number of keys in the database may be small, instead of using a B-tree index, we propose to use a binary-search tree to organize the index keys. Furthermore, multiple versions of the same data item may be stored consecutively and indexed by a single entry to maximize the space utilization and at the same time to enhance the performance in serving version-range queries. Analytical studies have been conducted on SEMI, and a series of experiments have been performed to evaluate its performance as compared with MVBT under different workloads. The experimental results have demonstrated that SEMI can achieve very high space utilization and has better performance in serving update transactions and range queries as compared with MVBT.
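A rough sketch of the two structural ideas above, a binary-search tree over keys with all versions of an item kept consecutively behind a single entry; the names and layout here are illustrative only, and the actual SEMI design additionally handles PCM placement and space reclamation:

```python
class Node:
    """One BST node per data-item key; all versions of that item are
    stored together and reached through a single entry."""
    def __init__(self, key):
        self.key, self.versions = key, []   # versions: (version, value), appended in order
        self.left = self.right = None

def insert(node, key, ver, val):
    """Standard (unbalanced) BST insert on the key; a new version of an
    existing key is simply appended to that key's version list."""
    if node is None:
        node = Node(key)
    if key < node.key:
        node.left = insert(node.left, key, ver, val)
    elif key > node.key:
        node.right = insert(node.right, key, ver, val)
    else:
        node.versions.append((ver, val))
    return node

def version_range(node, key, lo, hi):
    """Version-range query: one key lookup, then a scan of the
    consecutively stored versions of that single item."""
    while node and node.key != key:
        node = node.left if key < node.key else node.right
    if node is None:
        return []
    return [v for ver, v in node.versions if lo <= ver <= hi]

root = None
for ver, val in enumerate(["a", "b", "c", "d"]):
    root = insert(root, 42, ver, val)
print(version_range(root, 42, 1, 2))   # -> ['b', 'c']
```

Keeping the versions of one item behind a single entry is what lets a version-range query touch one node instead of one index entry per version.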
Current Research Results
"SPIRIT: A Tree Kernel-based Method for Topic Person Interaction Detection," IEEE Transactions on Knowledge and Data Engineering (TKDE), August 2016.
Authors: Yung-Chun Chang, Chien Chin Chen, and Wen-Lian Hsu

Abstract:
The development of a topic in a set of topic documents is constituted by a series of person interactions at a specific time and place. Knowing the interactions of the persons mentioned in these documents is helpful for readers to better comprehend the documents. In this paper, we propose a topic person interaction detection method called SPIRIT, which classifies the text segments in a set of topic documents that convey person interactions. We design the rich interactive tree structure to represent syntactic, context, and semantic information of text, and this structure is incorporated into a tree-based convolution kernel to identify interactive segments. Experiment results based on real world topics demonstrate that the proposed rich interactive tree structure effectively detects the topic person interactions and that our method outperforms many well-known relation extraction and protein-protein interaction methods.
Current Research Results
Authors: Yuxiang Jiang, ..., Caster Chen, Wen-Lian Hsu et al.

Abstract:
Background: A major bottleneck in our understanding of the molecular underpinnings of life is the assignment of function to proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, assessing methods for protein function prediction and tracking progress in the field remain challenging. Results: We conducted the second critical assessment of functional annotation (CAFA), a timed challenge to assess computational methods that automatically assign protein function. We evaluated 126 methods from 56 research groups for their ability to predict biological functions using Gene Ontology and gene-disease associations using Human Phenotype Ontology on a set of 3681 proteins from 18 species. CAFA2 featured expanded analysis compared with CAFA1, with regard to data set size, variety, and assessment metrics. To review progress in the field, the analysis compared the best methods from CAFA1 to those of CAFA2. Conclusions: The top-performing methods in CAFA2 outperformed those from CAFA1. This increased accuracy can be attributed to a combination of the growing number of experimental annotations and improved methods for function prediction. The assessment also revealed that the definition of top-performing algorithms is ontology specific, that different performance metrics can be used to probe the nature of accurate predictions, and that predictions in the biological process and human phenotype ontologies are relatively diverse. While there was methodological improvement between CAFA1 and CAFA2, the interpretation of results and usefulness of individual methods remain context-dependent.
Current Research Results
Authors: Sheng-Yao Su, Shu-Hwa Chen, I-Hsuan Lu, Yih-Shien Chiang, Yu-Bin Wang, Pao-Yang Chen, Chung-Yen Lin*

Abstract:
Background: Bisulfite sequencing (BS-Seq) has become a standard technology for profiling genome-wide DNA methylation at single-base resolution. It allows researchers to conduct genome-wide cytosine methylation analyses on issues such as genomic imprinting, transcriptional regulation, and cellular development and differentiation. A single BS-Seq dataset is resolved into many features according to the sequence contexts, making methylome data analysis and visualization a complex task. Results: We developed a streamlined platform, TEA, for analyzing and visualizing data from whole-genome BS-Seq (WGBS) experiments conducted in the model plant Arabidopsis thaliana. To capture the essence of the genome methylation level and to keep the online service efficient, we introduce a straightforward method for measuring genome methylation in each sequence context by gene. The method is scripted in Java to process BS-Seq mapping results. Through a simple data uploading process, the TEA server deploys a web-based platform for deep analysis by linking data to an updated Arabidopsis annotation database and toolkits. Conclusions: TEA is an intuitive and efficient online platform for analyzing the Arabidopsis genomic DNA methylation landscape. It provides several ways to help users exploit WGBS data.
TEA is freely accessible for academic users at: http://tea.iis.sinica.edu.tw.
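The abstract does not spell out TEA's per-gene measure, but a common way to summarize methylation per sequence context (CpG, CHG, and CHH, the three contexts relevant in Arabidopsis) is the weighted level: methylated cytosine read calls divided by total cytosine read calls. A minimal sketch with made-up counts:

```python
def methylation_level(calls):
    """Weighted methylation level per sequence context:
    (# methylated-C read calls) / (# total C read calls), computed
    separately for CpG, CHG and CHH.  `calls` maps a context to a list
    of (methylated_reads, total_reads) tuples, one per cytosine site."""
    levels = {}
    for context, sites in calls.items():
        meth = sum(m for m, _ in sites)
        total = sum(t for _, t in sites)
        levels[context] = meth / total if total else 0.0
    return levels

calls = {
    "CpG": [(8, 10), (9, 10)],   # heavily methylated sites
    "CHG": [(2, 10), (1, 10)],
    "CHH": [(0, 10), (1, 10)],
}
print(methylation_level(calls))  # -> {'CpG': 0.85, 'CHG': 0.15, 'CHH': 0.05}
```

Weighting by read depth (rather than averaging per-site ratios) keeps shallowly covered sites from dominating the gene-level summary.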
Current Research Results
"Efficient Warranty-Aware Wear-Leveling for Embedded Systems with PCM Main Memory," IEEE Transactions on Very Large Scale Integration Systems (TVLSI), July 2016.
Authors: Sheng-Wei Cheng, Yuan-Hao Chang, Tseng-Yi Chen, Yu-Fen Chang, Hsin-Wen Wei, and Wei-Kuan Shih

Abstract:
Recently, phase-change memory (PCM) has become a promising candidate to replace DRAM as main memory due to its low power consumption, fast I/O performance, and byte addressability. Despite these merits, the adoption of PCM may suffer from its physical characteristic of limited write endurance. Wear leveling is a well-known approach to address this issue. For PCM main memory, the design of wear leveling should stress operation efficiency and overhead reduction. Nevertheless, conventional designs are usually dedicated to prolonging the lifetime of PCM as much as possible. In this paper, we propose a novel perspective: instead of valuing PCM lifetime exploitation as the first priority, we turn to satisfying the product warranty period. With such a paradigm shift, the management overhead of wear leveling mechanisms can be reduced so as to achieve further enhancement of operation efficiency. To this end, we propose a warranty-aware page management design that introduces novel criteria used to determine the state of a page by taking both the product warranty period and the write cycles of a page into consideration. Theoretical analysis is also conducted to investigate properties and performance of the proposed management. To show the effectiveness of the proposed design, we collected real traces by running SPEC2006 benchmarks with different write intensity workloads. The experimental results showed that our design reduced the overhead to one-third that of the state-of-the-art designs while still providing the same level of performance.
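A minimal sketch of what a warranty-aware page-state criterion could look like, assuming a linear pro-rated wear budget; the endurance figure, thresholds, and state names below are illustrative assumptions, not the paper's actual criteria:

```python
def page_state(write_cycles, elapsed_days, warranty_days, endurance=10**8):
    """Classify a PCM page against a pro-rated wear budget: a page only
    needs attention when its write count outpaces the share of endurance
    it may consume within the warranty period.  The linear budget and
    the thresholds are illustrative assumptions."""
    budget = endurance * elapsed_days / warranty_days
    if write_cycles <= budget:
        return "normal"        # on track to survive the warranty period
    elif write_cycles <= 1.5 * budget:
        return "warm"          # monitor, no swap yet
    else:
        return "hot"           # swap with a cold page now

# One year into a five-year warranty:
print(page_state(10**7, 365, 5 * 365))       # under budget -> 'normal'
print(page_state(5 * 10**7, 365, 5 * 365))   # far over budget -> 'hot'
```

The point of such a criterion is that pages comfortably under budget need no wear-leveling action at all, which is where the management-overhead savings come from.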
Current Research Results
"Improving PCM Endurance with a Constant-cost Wear Leveling Design," ACM Transactions on Design Automation of Electronic Systems (TODAES), June 2016.
Authors: Yu-Ming Chang, Pi-Cheng Hsiu, Yuan-Hao Chang, Chi-Hao Chen, Tei-Wei Kuo, and Cheng-Yuan Michael Wang

Abstract:
Improving PCM endurance is a fundamental issue when it is considered as an alternative to replace DRAM as main memory. Memory-based wear leveling is an effective way to improve PCM endurance, but its major challenge is how to efficiently determine the appropriate memory pages for allocation or swapping. In this paper, we present a constant-cost wear leveling design that is compatible with existing memory management. Two implementations, namely bucket-based and array-based wear leveling, with constant-time (or nearly zero) search cost are proposed to be integrated into the OS layer and the hardware layer respectively, as well as to trade between time and space complexity. The results of experiments conducted based on an implementation in Android, as well as simulations with popular benchmarks, to evaluate the effectiveness of the proposed design are very encouraging.
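A minimal sketch of the bucket-based idea, grouping pages by write count so the least-written page is found in constant time; the class and method names are ours, and the real design integrates with OS memory management:

```python
from collections import deque

class BucketWearLeveler:
    """Bucket-based sketch: pages are grouped by write count, so the
    youngest page for allocation is found in (amortised) constant time
    instead of scanning all pages."""
    def __init__(self, n_pages):
        self.count = [0] * n_pages
        self.buckets = {0: deque(range(n_pages))}  # write count -> pages
        self.min_count = 0                         # counts never decrease

    def allocate(self):
        # Pop any page from the least-written non-empty bucket.
        while not self.buckets.get(self.min_count):
            self.min_count += 1
        page = self.buckets[self.min_count].popleft()
        self.count[page] += 1
        self.buckets.setdefault(self.count[page], deque()).append(page)
        return page

wl = BucketWearLeveler(4)
for _ in range(8):
    wl.allocate()
print(wl.count)   # -> [2, 2, 2, 2]  (perfectly even wear)
```

Because write counts only increase, `min_count` advances monotonically, so the search cost is amortised constant per allocation, which is the property the abstract highlights.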
Current Research Results
Authors: Jen-Chieh Lee, Sheng-Yao Su, Chun A. Changou, Rong-Sen Yang, Keh-Sung Tsai, Michael T. Collins, Eric S. Orwoll, Chung-Yen Lin, Shu-Hwa Chen, Shyang-Rong Shih, Chen-Han Lee, Yoshinao Oda, Steven D. Billings, Chien-Feng Li, G. Petur Nielsen, Eiichi Konishi, Fredrik Petersson, Thomas O. Carpenter, Hsuan-Ying Huang, and Andrew L. Folpe

Abstract:
Phosphaturic mesenchymal tumors typically cause paraneoplastic osteomalacia, chiefly as a result of FGF23 secretion. In a prior study, we identified FN1-FGFR1 fusion in 9 of 15 phosphaturic mesenchymal tumors. In this study, a total of 66 phosphaturic mesenchymal tumors and 7 tumors resembling phosphaturic mesenchymal tumor but without known phosphaturia were studied. A novel FN1-FGF1 fusion gene was identified in two cases without FN1-FGFR1 fusion by RNA sequencing and cross-validated with direct sequencing and western blot. Fluorescence in situ hybridization analyses revealed FN1-FGFR1 fusion in 16 of 39 (41%) phosphaturic mesenchymal tumors and identified an additional case with FN1-FGF1 fusion. The two fusion genes were mutually exclusive. Combined with previous data, the overall prevalence of FN1-FGFR1 and FN1-FGF1 fusions was 42% (21/50) and 6% (3/50), respectively. FGFR1 immunohistochemistry was positive in 82% (45/55) of phosphaturic mesenchymal tumors regardless of fusion status. By contrast, 121 cases of potential morphologic mimics (belonging to 13 tumor types) rarely expressed FGFR1, the main exceptions being solitary fibrous tumors (positive in 40%), chondroblastomas (40%), and giant cell tumors of bone (38%), suggesting a possible role for FGFR1 immunohistochemistry in the diagnosis of phosphaturic mesenchymal tumor. With the exception of one case reported in our prior study, none of the remaining tumors resembling phosphaturic mesenchymal tumor had either fusion type or expressed significant FGFR1. Our findings provide insight into possible mechanisms underlying the pathogenesis of phosphaturic mesenchymal tumor and imply a central role of the FGF1-FGFR1 signaling pathway. The novel FN1-FGF1 protein is expected to be secreted and serves as a ligand that binds and activates FGFR1 to achieve an autocrine loop. 
Further study is required to determine the functions of these fusion proteins. Modern Pathology advance online publication, 22 July 2016; doi:10.1038/modpathol.2016.137.
Current Research Results
Authors: Tzu-Po Chuang, Jaw-Yuan Wang, Shu-Wen Jao, Chang-Chieh Wu, Jiann-Hwa Chen, Koung-Hung Hsiao, Chung-Yen Lin, Shu-Hwa Chen, Sheng-Yao Su, Ying-Ju Chen, Yuan-Tsong Chen, Deng-Chyang Wu, Ling-Hui Li

Abstract:
Development of colorectal cancer (CRC) involves sequential transformation of normal mucosal tissues into benign adenomas and then adenomas into malignant tumors. The identification of genes crucial for malignant transformation in colorectal adenomas (CRAs) has been based primarily on cross-sectional observations. In this study, we identified relevant genes using autologous samples. By performing genome-wide SNP genotyping and RNA sequencing analysis of adenocarcinomas, adenomatous polyps, and non-neoplastic colon tissues (referred as tri-part samples) from individual patients, we identified 68 genes with differential copy number alterations and progressively dysregulated expression. Aurora A, SKA3, and DSN1 protein levels were sequentially up-regulated in the samples, and this overexpression was associated with chromosome instability (CIN). Knockdown of SKA3 in CRC cells dramatically reduced cell growth rates and increased apoptosis. Depletion of SKA3 or DSN1 induced G2/M arrest and decreased migration, invasion, and anchorage-independent growth. AURKA and DSN1 are thus critical for chromosome 20q amplification-associated malignant transformation in CRA. Moreover, SKA3 at chromosome 13q was identified as a novel gene involved in promoting malignant transformation. Evaluating the expression of these genes may help identify patients with progressive adenomas, helping to improve treatment.
Authors: Chun-Ming Chang, Cheng-Hsin Hsu, Chih-Fan Hsu, and Kuan-Ta Chen

Abstract:
We propose the very first non-intrusive measurement methodology for quantifying the performance of commodity Virtual Reality (VR) systems. Our methodology considers the VR system under test as a black-box and works with any VR applications. Multiple performance metrics on timing and positioning accuracy are considered, and detailed testbed setup and measurement steps are presented. We also apply our methodology to several VR systems in the market, and carefully analyze the experiment results. We make several observations: (i) 3D scene complexity affects the timing accuracy the most, (ii) most VR systems implement the dead reckoning algorithm, which incurs a non-trivial correction latency after incorrect predictions, and (iii) there exists an inherent trade-off between two positioning accuracy metrics: precision and sensitivity.
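Observation (ii) refers to dead reckoning: extrapolating the next tracker pose from recent samples under a constant-velocity assumption, which overshoots on abrupt motion changes and then needs correction. A one-dimensional sketch (the numbers are made up):

```python
def dead_reckon(samples, dt):
    """Dead-reckoning sketch: extrapolate the next head position from the
    last two tracker samples, assuming constant velocity.  When the real
    motion changes abruptly, the prediction is wrong and must be fixed
    afterwards, which is the correction latency measured in the paper."""
    (t0, p0), (t1, p1) = samples[-2], samples[-1]
    velocity = (p1 - p0) / (t1 - t0)
    return p1 + velocity * dt

# Steady motion: the prediction matches the true path.
samples = [(0.0, 0.0), (0.01, 1.0)]        # (time in s, position in cm)
print(dead_reckon(samples, 0.01))          # -> 2.0

# Under a sudden stop, the predicted 2.0 overshoots the true 1.0, and
# the system must issue a visible correction on a later frame.
```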
"A Disturbance-aware Sub-Block Design to Improve Reliability of 3D MLC Flash Memory," ACM/IEEE International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), October 2016.
Authors: Hung-Sheng Chang, Yuan-Hao Chang, Tei-Wei Kuo, Yu-Ming Chang, and Hsiang-Pang Li

Abstract:
The reliability problem of modern flash-memory chips quickly deteriorates because of the nature of MLC chips. Although the vertical stacking of storage cells in 3D flash-memory chips dramatically increases the bit density, compared to 2D chips, it also results in severe disturbance problems. In this work, we propose to create sub-blocks in the MTD layer by considering different program disturb resistance patterns, without any hardware cost. In particular, a disturbance-aware sub-block design is proposed to utilize the hotness information of data to further improve the reliability of 3D MLC flash memory by smartly choosing sub-blocks to use when extra information is available from the flash translation layer. The proposed design was evaluated by a series of experiments over traces from SNIA. It was shown that the disturb errors could be reduced by at least 50% under DFTL and BL, i.e., a page-level FTL and a block-level FTL design, compared to the conventional MLC programming designs.
"How to Enable Software Isolation and Boost System Performance with Sub-block Erase over 3D Flash Memory," ACM/IEEE International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), October 2016.
Authors: Hsin-Yu Chang, Chien-Chung Ho, Yuan-Hao Chang, Yu-Ming Chang, and Tei-Wei Kuo

Abstract:
The write amplification problem deteriorates as the block size of modern flash-memory chips keeps increasing. Without careful garbage collection, significant live-page copying might even worsen the reliability problem, which is already severe in 3D flash memory. In this work, we propose a sub-block erase design to not only alleviate the write amplification problem by reducing live-page copying but also improve the system reliability with a software isolation strategy. In particular, sub-blocks are carefully allocated to satisfy write requests so as to reduce disturbance by using free or invalid sub-blocks as isolation layers among sub-blocks, without additional hardware cost and capacity loss. A series of experiments was conducted to evaluate the capability of the proposed design. The results show that the proposed design is very effective in improving the system performance by reducing garbage collection overheads and in improving the device reliability/lifetime.
"Towards Ultra-Low-Bitrate Video Conferencing Using Facial Landmarks," Proceedings of ACM Multimedia 2016, October 2016.
Authors: Pin Chun Wang, Ching-Ling Fan, Chun-Ying Huang, Kuan-Ta Chen, and Cheng-Hsin Hsu

Abstract:
Providing high-quality video conferencing experience over the best-effort Internet and wireless networks is challenging, because 2D videos are bulky. In this paper, we exploit the common structure of conferencing videos for an ultra-low-bitrate video conferencing system. In particular, we design, implement, optimize, and evaluate a video conferencing system, which: (i) extracts facial landmarks, (ii) transmits the selected facial landmarks and 2D images, and (iii) warps the untransmitted 2D images at the receiver. Several optimization techniques are adopted for minimizing the running time and maximizing the video quality, e.g., the image and warping frames are optimally determined based on network conditions and video content. The experiment results from real conferencing videos reveal that our proposed system: (i) outperforms the state-of-the-art x265 by up to 11.05 dB in PSNR (Peak Signal-to-Noise Ratio), (ii) adapts to different video content and network conditions, and (iii) runs in real-time at about 12 frames per second.
"Realizing Erase-free SLC Flash Memory with Rewritable Programming Design," ACM/IEEE International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), October 2016.
Authors: Yu-Ming Chang, Yung-Chun Li, Ping-Hsien Lin, Hsiang-Pang Li, and Yuan-Hao Chang

Abstract:
Over the past years, the adoption of flash memory has grown tremendously in a wide range of fields, with high-end applications demanding ever higher reliability and performance from storage devices. Even though a single-level-cell (SLC) flash memory can have a higher endurance and a lower access latency to better meet these requirements in comparison with a multi-level-cell (MLC) one, the rapid deterioration of reliability due to advanced shrinking technology means that a flash block can only survive a smaller and smaller number of erase cycles. This work proposes an erase-free scheme, which comprises the design of rewritable programming and software management, to enable a rewritable usage model that can reuse flash blocks several times without erase operations in an SLC flash memory based storage system. The evaluation results show that the proposed scheme achieves six-times rewritable capability such that 83% of erase operations can be removed; in the best settings, the write latency can be reduced by 3.67x on average under various workloads.
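The physical constraint underlying erase-free rewriting is that flash programming can only flip bits from 1 to 0; only an erase restores 1s. A page is therefore rewritable in place whenever the new contents require no 0-to-1 transition. A minimal check of that condition (the paper's rewritable programming design is considerably more elaborate):

```python
def can_rewrite(old, new):
    """Flash programming can only flip bits 1 -> 0; an erase is needed to
    restore 0 -> 1.  A page can thus be reprogrammed in place (erase-free)
    iff every 1-bit in `new` is already 1 in `old`, i.e. the new data
    never needs a 0 -> 1 transition."""
    return (old | new) == old          # new sets no bit that old had as 0

print(can_rewrite(0b11110000, 0b10100000))  # only 1->0 flips needed: True
print(can_rewrite(0b10100000, 0b11000000))  # needs a 0->1 flip: False
```

A rewritable programming scheme exploits this by encoding successive page versions so that each version only clears further bits, deferring the costly erase.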
"Queueing and glueing for optimal partitioning," 21st ACM SIGPLAN International Conference on Functional Programming (ICFP 2016), September 2016.
Authors: Shin-Cheng Mu, Yu-Hsi Chiang, and Yu-Han Lyu

Abstract:
The queueing-glueing algorithm is the nickname we give to an algorithmic pattern that provides amortised linear time solutions to a number of optimal list partition problems that have a peculiar property: at various moments we know that two of three candidate solutions could be optimal. The algorithm works by keeping a queue of lists, glueing them from one end, while chopping from the other end, hence the name. We give a formal derivation of the algorithm, and demonstrate it with several non-trivial examples.
Code accompanying this paper is available on GitHub: https://github.com/scmu/queueing-glueing.
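The paper derives its algorithm formally; as a flavor of the amortised-linear queue discipline it names (glueing candidates at one end of a queue while chopping from the other), here is a standard sliding-window-minimum dynamic program. It is an instance of the same deque pattern, not the paper's derived algorithm:

```python
from collections import deque

def min_window_cost(a, k):
    """dp[i] = a[i] + min(dp[i-k .. i-1]) in amortised O(n) with a deque:
    new candidates are appended ("glued") at the back, while expired or
    dominated candidates are chopped from the front or back."""
    n = len(a)
    dp = [0] * n
    dq = deque()                       # indices with increasing dp values
    for i in range(n):
        while dq and dq[0] < i - k:    # chop expired indices from the front
            dq.popleft()
        dp[i] = a[i] + (dp[dq[0]] if dq else 0)
        while dq and dp[dq[-1]] >= dp[i]:
            dq.pop()                   # chop dominated candidates from the back
        dq.append(i)                   # glue the new candidate
    return dp[-1]

print(min_window_cost([3, 1, 4, 1, 5], 2))  # -> 10  (path 3 + 1 + 1 + 5)
```

Each index enters and leaves the deque at most once, which is where the amortised-linear bound comes from.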
Current Research Results
"Multi-grained Block Management to Enhance the Space Utilization of File Systems on PCM Storages," IEEE Transactions on Computers (TC), June 2016.
Authors: Tseng-Yi Chen, Yuan-Hao Chang, Ming-Chang Yang, Yun-Jhu Chen, Hsin-Wen Wei, and Wei-Kuan Shih

Abstract:
Phase-change memory (PCM) is a promising candidate as a storage medium to resolve the performance gap between main memory and storage in battery-powered mobile computing systems. However, it is more expensive than flash memory, and thus introduces a more serious storage capacity issue for low-cost solutions. This issue is further exacerbated by the fact that existing file systems are usually designed to trade space utilization for performance over block-oriented storage devices. In this work, we propose a multi-grained block management strategy to improve the space utilization of file systems over PCM-based storage systems. By utilizing the byte-addressability and fast read/write feature of PCM, a methodology is proposed to dynamically allocate multiple sizes of blocks to fit the size of each file, so as to resolve the space fragmentation issue with minimized space and management overheads. The space utilization of file systems is analyzed with consideration of block sizes. A series of experiments was conducted to evaluate the efficacy of the proposed strategy, and the results show that the proposed strategy can significantly improve the space utilization of file systems.
Current Research Results
"Byte-addressable Update Scheme to Minimize the Energy Consumption of PCM-based Storage Systems," ACM Transactions on Embedded Computing Systems (TECS), June 2016.
Authors: Ming-Chang Yang, Yuan-Hao Chang, and Che-Wei Tsao

Abstract:
In recent years, phase-change memory (PCM) has generated a great deal of interest because of its byte-addressability and non-volatility properties. It is regarded as a good alternative storage medium that can reduce the performance gap between the main memory and the secondary storage in computing systems. However, its high energy consumption on writes is a challenging issue in the design of battery-powered mobile computing systems. To reduce the energy consumption, we exploit the byte-addressability and the asymmetric read-write energy/latency of PCM in an energy-efficient update scheme for journaling file systems. We also introduce a concept called the fifty-percent rule to determine/recommend the best update strategy for block updates. The proposed scheme only writes modified data, instead of the whole updated block, to PCM-based storage devices without extra hardware support. Moreover, it guarantees the sanity/integrity of file systems even if the computing system crashes or there is a power failure during the data update process. We implemented the proposed scheme on the Linux system and conducted a series of experiments to evaluate the scheme. The results are very encouraging.
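The abstract does not give the exact cost model, but the name of the fifty-percent rule suggests a simple decision: if fewer than half of a block's bytes changed, write only the deltas; otherwise rewrite the whole block. A sketch under that reading, ignoring metadata and journaling overheads:

```python
def update_strategy(old_block, new_block):
    """Fifty-percent-rule sketch: when fewer than half of the bytes in a
    block changed, writing just the modified bytes to byte-addressable
    PCM costs less energy than rewriting the whole block.  The 50%
    threshold follows the rule's name; bookkeeping costs are ignored."""
    modified = sum(1 for o, n in zip(old_block, new_block) if o != n)
    if modified < len(old_block) / 2:
        return "in-place delta", modified      # write only changed bytes
    return "whole block", len(new_block)

old = b"hello world, journaling block"
new = b"hello world, journaling BLOCK"
print(update_strategy(old, new))   # -> ('in-place delta', 5)
```

This kind of decision is only possible on byte-addressable media; a block-oriented device would have to write the whole block either way.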
Current Research Results
"A Computer-Assistance Learning System for Emotional Wording," IEEE Transactions on Knowledge and Data Engineering, May 2016.
Authors: Wei-Fan Chen, Mei-Hua Chen, Ming-Lung Chen, and Lun-Wei Ku

Abstract:
Language learners' limited lexical knowledge leads to imprecise wording. This is especially true when they attempt to express their emotions. Many learners rely heavily on the traditional thesaurus. Unfortunately, this fails to provide appropriate suggestions for lexical choices. To better aid English-as-a-second-language learners with word choices, we propose RESOLVE, which provides ranked synonyms of emotion words based on contextual information. RESOLVE suggests precise emotion words regarding the events in the relevant context. Patterns are learned to capture emotion events, and various factors are considered in the scoring function for ranking emotion words. We also describe an online writing system developed using RESOLVE and evaluate its effectiveness for learning assistance with a writing task. Experimental results showed that RESOLVE yielded superior performance on NDCG@5, significantly outperforming both PMI and SVM approaches, and offered better suggestions than Roget's Thesaurus and PIGAI (an online automated essay scoring system). Moreover, when applying it to the writing task, students' appropriateness in using emotion words improved by 30 percent. Less-proficient learners benefited more from RESOLVE than highly-proficient learners. Post-tests also showed that after using RESOLVE, less-proficient learners' ability to use emotion words approached that of highly-proficient learners. RESOLVE thus enables learners to use precise emotion words.
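NDCG@5, the metric RESOLVE is evaluated on, compares the discounted gain of the top five suggested synonyms against the gain of the ideal ordering. A minimal sketch with made-up relevance grades:

```python
import math

def ndcg_at_k(relevances, k=5):
    """NDCG@k for a ranked list of suggestions: discounted cumulative
    gain of the returned order, normalised by the gain of the ideal
    (perfectly sorted) order, so 1.0 means a perfect ranking."""
    def dcg(rels):
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal else 0.0

# Graded relevance of the top-5 suggested emotion words (3 = perfect fit).
print(round(ndcg_at_k([3, 2, 3, 0, 1]), 3))
```

The logarithmic discount means a highly relevant synonym ranked fifth contributes far less than the same word ranked first, which matches how learners actually scan suggestion lists.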

Related Information

Institute of Information Science, Academia Sinica