Current Research Results
"On the Quality of Service of Cloud Gaming Systems," IEEE Transactions on Multimedia, February 2014.
Authors: Kuan-Ta Chen, Yu-Chun Chang, Hwai-Jung Hsu, De-Yu Chen, Chun-Ying Huang, and Cheng-Hsin Hsu

Abstract:
Cloud gaming, i.e., real-time game playing via thin clients, relieves users of the need to upgrade their computers and resolve incompatibility issues between games and computers. As a result, cloud gaming is generating a great deal of interest among entrepreneurs, venture capitalists, the general public, and researchers. However, given the large design space, it is not yet known which cloud gaming system delivers the best user-perceived Quality of Service (QoS) and what design elements constitute a good cloud gaming system.

This study is motivated by the question: How good is the QoS of current cloud gaming systems? Answering this question is challenging because most cloud gaming systems are proprietary and closed, and thus their internal mechanisms are not accessible to the research community. In this paper, we propose a suite of measurement techniques to evaluate the QoS of cloud gaming systems and demonstrate the effectiveness of our schemes using a case study comprising two well-known cloud gaming systems: OnLive and StreamMyGame. Our results show that OnLive performs better, because it provides adaptable frame rates, better graphic quality, and shorter server processing delays, while consuming less network bandwidth. Our measurement techniques are general and can be applied to any cloud gaming system, so that researchers, users, and service providers may systematically quantify the QoS of these systems. To the best of our knowledge, the proposed suite of measurement techniques has never been presented in the literature.

Current Research Results
"Reliability Enhancement of Flash-Memory Storage Systems: An Efficient Version-Based Design," IEEE Transactions on Computers, December 2013.
Authors: Yuan-Hao Chang, Po-Chun Huang, Pei-Han Hsu, Lue-Jane Lee, Tei-Wei Kuo and David Hung-Chang Du

Abstract:
In recent years, reliability has become a critical issue in the design of flash-memory file/storage systems, due to the growing unreliability of advanced flash-memory chips. In this paper, a version-based strategy is proposed to effectively and efficiently maintain consistency among page versions of a file for potential recovery needs. In particular, a two-version design for a native file system is presented with minimal version-maintenance overhead. A recovery scheme is then presented to restore a corrupted file to its latest consistent version. The strategy is later extended to maintain multiple data versions with consideration of the write constraints of multi-level-cell flash memory. It was shown that the proposed strategy could significantly improve the reliability of flash memory with limited management and space overheads.
Current Research Results
"Automatic Training Image Acquisition and Effective Feature Selection from Community-Contributed Photos for Facial Attribute Detection," IEEE Transactions on Multimedia, October 2013.
Authors: Yan-Ying Chen, Winston Hsu, and H. Y. Mark Liao

Abstract:
Facial attributes have been shown effective for mining specific persons and profiling human activities in large-scale media such as surveillance videos or photo-sharing services. Comprehensive analysis requires a large number of facial attributes. Generally, each attribute detector is obtained by supervised learning over large training data. It is promising to leverage the exponentially growing community-contributed photos and their associated informative contexts to ease the burden of manual annotation; however, such huge, noisy data from the Internet still pose great challenges. We propose to measure the quality of training images by discriminable visual features, which are verified by the relative discrimination between the unlabeled images and the pseudo-positives (pseudo-negatives) retrieved by textual relevance. The proposed feature selection requires no heuristic threshold and can therefore be generalized to multiple feature modalities. We further exploit the rich context cues (e.g., tags, geo-locations, etc.) associated with publicly available photos to mine training images from around the world that are semantically consistent but visually diverse. Experiments on the benchmarks demonstrate that our approach can successfully acquire effective training images for learning generic facial attributes; the classification error is reduced by up to 23.35% relative to the text-based approach and is comparable with that of costly manual annotation.
Current Research Results
"An Index-Based Management Scheme with Adaptive Caching for Huge-Scale Low-Cost Embedded Flash Storages," ACM Transactions on Design Automation of Electronic Systems, October 2013.
Authors: Po-Chun Huang, Yuan-Hao Chang, and Tei-Wei Kuo

Abstract:
Due to its remarkable access performance, shock resistance, and low cost, NAND flash memory is now widely adopted in a variety of computing environments, especially in mobile devices such as smart phones, media players, and electronic book readers. For cost reasons, low-cost embedded flash storages such as flash memory cards are often employed in such devices. Different from solid-state disks, the RAM buffer equipped on low-cost embedded flash storages is very small, for example, limited to tens of kilobytes, despite the rapidly growing capacity of the storages. Effectively utilizing the very limited on-device RAM buffers of embedded flash storages is therefore critical, and a novel design of scalable flash management schemes is needed to tackle the new access constraints of MLC NAND flash memory. In this work, a highly scalable design of the flash translation layer is presented with consideration of the on-device RAM size, user access patterns, address-mapping-information caching, and MLC access constraints. Through a series of experiments, it is verified that, with appropriate cache-size settings, the proposed management scheme provides performance comparable to prior art with much lower requirements on the on-device RAM. In other words, the proposed scheme suggests a strategy to make better use of the on-device RAM and is suitable for embedded flash storages.
"Rating Equity Funds against Return of Random Traders," Proceeding International Conference on Economic Computing, September 2013.
Authors: Ta-Wei Hung, Mu-En Wu, Hsueh-I Lu, Jan-Ming Ho

Abstract:
In this paper, we study the problem of benchmarking returns of equity funds. We present a novel approach, the random-trader scheme, to benchmark the return of an equity fund during a specific period. A random trader uses an all-in-all-out strategy to trade in the market at random timing with capital that is negligible compared with the market size. Let $\mathrm{DRRT}$ denote the distribution of returns of random traders, and $\mathrm{R_{rt}}$ be a random variable sampled from $\mathrm{DRRT}$. In this paper we model $\mathrm{DRRT}$ as a log-normal distribution, denoted as $\mathrm{s_{DRRT}}$, and provide an efficient algorithm to compute the mean and variance of $\mathrm{s_{DRRT}}$, denoted as $\mathrm{\mu_{DRRT}}$ and $\mathrm{\sigma_{DRRT}}$, respectively. Using the TAIEX index as the data set, our experiments show that $\mathrm{s_{DRRT}}$ approximates $\mathrm{DRRT}$ well when the length of the given period is one month. We then score each equity fund by the cumulative distribution function of $\mathrm{s_{DRRT}}$, i.e., $\mathrm{s}(m; \mathrm{DRRT}) = \Pr[\mathrm{R_{rt}} \le \mathrm{R}(m)]$, where $\mathrm{R}(m)$ denotes the return of the equity fund $m$ during the given period. Using historical data on equity funds in Taiwan, we observed interesting characteristics. When analyzing monthly returns, although some winners are able to achieve scores higher than $.9$ at times, it is difficult for them to keep up the high scores consistently. Moreover, few equity funds showed stable scores higher than $.7$ when analyzing long-term returns. Furthermore, there are times when most funds obtain high scores, and there are also times when no funds perform well.
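The scoring formula above is concrete enough for a short numeric sketch. The parameterization below (fitting the mean and standard deviation of log gross returns and then applying the normal CDF) is one plausible reading of the abstract rather than the authors' exact procedure, and the simulated random-trader returns are purely illustrative.

```python
import math
import random

def fit_lognormal(gross_returns):
    """Fit mu/sigma of log gross returns (1 + simple return) of random traders."""
    logs = [math.log(g) for g in gross_returns]
    mu = sum(logs) / len(logs)
    var = sum((x - mu) ** 2 for x in logs) / (len(logs) - 1)
    return mu, math.sqrt(var)

def score(fund_return, mu, sigma):
    """Score s(m; DRRT) = Pr[R_rt <= R(m)] under the fitted log-normal model."""
    z = (math.log(1.0 + fund_return) - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal CDF

# Illustrative only: simulate monthly gross returns of random all-in-all-out traders.
random.seed(1)
gross = [math.exp(random.gauss(0.004, 0.05)) for _ in range(10_000)]
mu, sigma = fit_lognormal(gross)
print(score(0.03, mu, sigma))  # score of a fund that returned +3% in the month
```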
Current Research Results
"Rating Equity Funds against Return of Random Traders," ASE Science Journal, September 2013.
Authors: Ta-Wei Hung, Mu-En Wu, Hsueh-I Lu, Jan-Ming Ho

Abstract:
In this paper, we study the problem of benchmarking returns of equity funds. We present a novel approach, the random-trader scheme, to benchmark the return of an equity fund during a specific period. A random trader uses an all-in-all-out strategy to trade in the market at random timing with capital that is negligible compared with the market size.
Let $\mathrm{DRRT}$ denote the distribution of returns of random traders, and $\mathrm{R_{rt}}$ be a random variable sampled from $\mathrm{DRRT}$. In this paper we model $\mathrm{DRRT}$ as a log-normal distribution, denoted as $\mathrm{s_{DRRT}}$, and provide an efficient algorithm to compute the mean and variance of $\mathrm{s_{DRRT}}$, denoted as $\mathrm{\mu_{DRRT}}$ and $\mathrm{\sigma_{DRRT}}$, respectively. Using the TAIEX index as the data set, our experiments show that $\mathrm{s_{DRRT}}$ approximates $\mathrm{DRRT}$ well when the length of the given period is one month.
We then score each equity fund by the cumulative distribution function of $\mathrm{s_{DRRT}}$, i.e., $\mathrm{s}(m; \mathrm{DRRT}) = \Pr[\mathrm{R_{rt}} \le \mathrm{R}(m)]$, where $\mathrm{R}(m)$ denotes the return of the equity fund $m$ during the given period. Using historical data on equity funds in Taiwan, we observed interesting characteristics. When analyzing monthly returns, although some winners are able to achieve scores higher than $.9$ at times, it is difficult for them to keep up the high scores consistently. Moreover, few equity funds showed stable scores higher than $.7$ when analyzing long-term returns. Furthermore, there are times when most funds obtain high scores, and there are also times when no funds perform well.
"Intelligent Tools for Reducing Medication Dispensing and Administration Errors," Proceedings of the 3rd International Symposium on Foundations of Health information Engineering and Systems,, August 2013.
Authors: P. H. Tsai and J. W. S. Liu

Abstract:

This paper presents an overview of smart medication dispensing and administration devices and software tools designed to minimize dispensing and administration errors. Some of them are for users who take medications on a long-term basis without close professional supervision; others are for pharmacy and nursing staff in hospitals, long-term care, and assisted living facilities. These tools should be configurable, customizable, easy to use, and effective for diverse users and care-providing institutions. The paper describes the approaches taken to meet these objectives.

"Selective Profiling for OS Scalability Study on Multicore Systems," IEEE International Conference on Service-Oriented Computing and Applications (SOCA), December 2013.
Authors: Kuo-Yi Chen, Yuan-Hao Chang, Pei-Shu Liao, Pen-Chung Yew, Sheng-Wei Cheng, and Tien-Fu Chen

Abstract:
With the wide deployment of multi-core processors, scalability has become an important issue for modern computers. To pinpoint the scalability issues of an operating system, performance profilers are usually used to detect scalability bottlenecks. However, profilers usually come with significant overheads. These overheads not only reduce the accuracy of scalability profiling but can also be misleading about the actual scalability bottlenecks. To address this issue, a selective-profiling approach is proposed to analyze the scalability of an operating system precisely with minimal overhead. In the proposed selective-profiling approach, possible scalability bottleneck hotspots are first located by tracers, and the information about these candidate hotspots is then passed to the sampler to detect the actual bottlenecks. Since the sampler only concentrates on the possible bottleneck hotspots instead of the whole system, the overhead can be reduced significantly. Therefore, the proposed selective-profiling approach takes advantage of both performance tracers and samplers to reduce the overhead while still achieving accurate bottleneck analysis for an operating system on multi-core systems.
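As a rough analogy of the two-stage idea (a cheap tracer pass nominates candidate hotspots, and a more detailed sampler then concentrates only on them), the following user-space Python sketch is illustrative only; the paper's tracers and samplers operate on OS-kernel events, and all function names and thresholds here are made-up stand-ins.

```python
import collections
import sys
import time

def trace_phase(workload, top_n=3):
    """Stage 1 (tracer): a cheap pass that only counts call events,
    nominating the most frequently entered functions as candidate hotspots."""
    counts = collections.Counter()
    def tracer(frame, event, arg):
        if event == "call":
            counts[frame.f_code.co_name] += 1
    sys.setprofile(tracer)
    try:
        workload()
    finally:
        sys.setprofile(None)
    return {name for name, _ in counts.most_common(top_n)}

def sample_phase(workload, hotspots):
    """Stage 2 (sampler): detailed timing, but bookkeeping is done only for
    the nominated hotspots, so the expensive analysis stays focused."""
    totals = collections.defaultdict(float)
    stack = []
    def sampler(frame, event, arg):
        name = frame.f_code.co_name
        if event == "call" and name in hotspots:
            stack.append((name, time.perf_counter()))
        elif event == "return" and stack and stack[-1][0] == name:
            _, t0 = stack.pop()
            totals[name] += time.perf_counter() - t0
    sys.setprofile(sampler)
    try:
        workload()
    finally:
        sys.setprofile(None)
    return dict(totals)

def workload():                      # stand-in for the system under test
    def hot():  sum(range(2000))
    def cold(): sum(range(10))
    for _ in range(500):
        hot(); cold()

hotspots = trace_phase(workload)
print(sample_phase(workload, hotspots))
```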
"Interactive Coding, Revisited," The 54th Annual IEEE Symposium on Foundations of Computer Science (FOCS), October 2013.
Authors: Kai-Min Chung and Rafael Pass and Sidharth Telang

Abstract:
How can we encode a communication protocol between two parties to become resilient to adversarial errors on the communication channel? If we encode each message in the communication protocol with a "good" error-correcting code (ECC), the error rate of the encoded protocol becomes poor (namely $O(1/m)$ where $m$ is the number of communication rounds). Towards addressing this issue, Schulman (FOCS'92, STOC'93) introduced the notion of *interactive coding*.

We argue that whereas the method of separately encoding each message with an ECC ensures that the encoded protocol carries the same amount of information as the original protocol, this may no longer be the case if using interactive coding. In particular, the encoded protocol may completely leak a player's private input, even if it would remain secret in the original protocol. Towards addressing this problem, we introduce the notion of *knowledge-preserving interactive coding*, where the interactive coding protocol is required to preserve the "knowledge" transmitted in the original protocol. Our main results are as follows.

- The method of separately applying ECCs to each message has essentially optimal error rate: No knowledge-preserving interactive coding scheme can have an error rate of $1/m$, where $m$ is the number of rounds in the original protocol.

- If restricting to computationally-bounded (polynomial-time) adversaries, then, assuming the existence of one-way functions (resp. subexponentially-hard one-way functions), for every $\epsilon>0$, there exists a knowledge-preserving interactive coding scheme with constant error rate and information rate $n^{-\epsilon}$ (resp. $1/\mathrm{polylog}(n)$) where $n$ is the security parameter; additionally, achieving an error rate of even $1/m$ requires the existence of one-way functions.

- Finally, even if we restrict to computationally-bounded adversaries, knowledge-preserving interactive coding schemes with constant error rate can have an information rate of at most $o(1/\log n)$. This result applies even to *non-constructive* interactive coding schemes.
"Simultaneous Resettability from One-Way Functions," The 54th Annual IEEE Symposium on Foundations of Computer Science (FOCS), October 2013.
Authors: Kai-Min Chung and Rafail Ostrovsky and Rafael Pass and Ivan Visconti

Abstract:
Resettable security, introduced by Canetti, Goldreich, Goldwasser and Micali (STOC'00), considers the security of cryptographic two-party protocols (in particular zero-knowledge arguments) in a setting where the attacker may "reset" or "rewind" one of the players. The strongest notion of resettable security, simultaneous resettability, introduced by Barak, Goldreich, Goldwasser and Lindell (FOCS'01), requires resettable security to hold for both parties: in the context of zero-knowledge, both the soundness and the zero-knowledge conditions remain robust to resetting attacks.

To date, all known constructions of protocols satisfying simultaneous resettable security rely on the existence of ZAPs; constructions of ZAPs are only known based on the existence of trapdoor permutations or number-theoretic assumptions.

In this paper, we provide a new method for constructing protocols satisfying simultaneous resettable security while relying only on the minimal assumption of one-way functions. Our key results establish, assuming only one-way functions:

- Every language in $\NP$ has an $\omega(1)$-round simultaneously resettable witness indistinguishable argument system.

- Every language in $\NP$ has a (polynomial-round) simultaneously resettable zero-knowledge argument system.

The key conceptual insight in our technique is relying on black-box impossibility results for concurrent zero-knowledge to achieve resettable-security.
"Constant-Round Concurrent Zero Knowledge From P-Certificates," The 54th Annual IEEE Symposium on Foundations of Computer Science (FOCS), October 2013.
Authors: Kai-Min Chung and Huijia Lin and Rafael Pass

Abstract:
We present a constant-round concurrent zero-knowledge protocol for $\NP$. Our protocol relies on the existence of families of collision-resistant hash functions, and a new, but in our eyes, natural complexity-theoretic assumption: the existence of P-certificates---that is, "succinct" non-interactive proofs/arguments for $\P$. As far as we know, our results yield the first constant-round concurrent zero-knowledge protocol for $\NP$ with an explicit zero-knowledge simulator based on any assumption.
"CloudRS: An Error Correction Algorithm of High-Throughput Sequencing Data," Proceedings IEEE BigData 2013, October 2013.
Authors: Chien-Chih Chen, Yu-Jung Chang, Wei-Chun Chung, Der-Tsai Lee, and Jan-Ming Ho

Abstract:
Next-generation sequencing (NGS) technologies produce huge amounts of data. These sequencing data are unavoidably accompanied by sequencing errors, which constitute one of the major problems for further analyses. Error correction is one of the critical steps for the success of NGS applications such as de novo genome assembly and DNA resequencing, as illustrated in the literature. However, it demands large amounts of computing time and memory. Designing an algorithm that improves data quality by efficiently utilizing on-demand computing resources in the cloud is a challenge for biologists and computer scientists. In this study, we present an error-correction algorithm, called CloudRS, for correcting errors in NGS data. CloudRS aims at emulating the error-correction algorithm of ALLPATHS-LG on the Hadoop/MapReduce framework. It is conservative in correcting sequencing errors to avoid introducing false decisions, e.g., when dealing with reads from repetitive regions. We also illustrate several probabilistic measures we introduce into CloudRS to make the algorithm more efficient without sacrificing its effectiveness. Running times using up to 80 instances, each with 8 computing units, show satisfactory speedup. Experiments comparing CloudRS with other error-correction programs show that it achieves a lower false positive rate on most evaluation benchmarks and higher sensitivity on the S. cerevisiae genome. We demonstrate that CloudRS provides significant improvements in the quality of the resulting contigs on NGS de novo assembly benchmarks.
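The abstract does not spell out the correction algorithm, but error-correction pipelines in the ALLPATHS-LG family typically begin by building a k-mer frequency spectrum, a step that maps naturally onto MapReduce. The sketch below is a single-machine imitation of that map/reduce step under those assumptions; the toy reads, the k-mer length, and the count-of-one heuristic for suspect k-mers are illustrative and are not CloudRS itself.

```python
from collections import Counter
from itertools import chain

K = 5  # toy k-mer length; real pipelines use a larger k (e.g., 24)

def mapper(read):
    """Map: emit every k-mer of a read with count 1."""
    return [(read[i:i + K], 1) for i in range(len(read) - K + 1)]

def reducer(pairs):
    """Reduce: sum counts per k-mer to build the frequency spectrum."""
    spectrum = Counter()
    for kmer, count in pairs:
        spectrum[kmer] += count
    return spectrum

reads = ["ACGTACGTGA", "ACGTACGAGA", "ACGTACGTGA"]  # made-up reads, one with an error
spectrum = reducer(chain.from_iterable(mapper(r) for r in reads))
# k-mers seen only once are suspects for sequencing errors
suspects = [kmer for kmer, count in spectrum.items() if count == 1]
print(suspects)
```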
"A Cooperative Botnet Profiling and Detection in Virtualized Environment," IEEE Conference on Communications and Network Security (CNS), October 2013.
Authors: Shun-Wen Hsiao, Yi-Ning Chen, Yeali S. Sun, Meng Chang Chen

Abstract:
Cloud security has become an important topic in recent years, as overcoming botnets in a virtualized environment is a critical task for cloud providers. Although numerous intrusion detection systems are available, it is not practical to install an IDS in every virtual machine. In this paper, we argue that a virtual machine monitor (VMM) can support certain security functions: our proposed design can actively collect information directly from the VMM without installing an agent in the guest OS. In addition, a bot cannot be aware of the existence of such a detection agent in the VMM. The proposed detection mechanism takes both passive and active approaches: the passive detection agent resides in the VMM and examines the tainted data used by a bot against bot behavior profiles, while the active detection agent performs active bot fingerprinting by sending specific stimuli to a guest and examining whether the expected triggered behavior occurs. In real-world bot experiments, we show that the passive detection agent can distinguish between bots and benign processes with low false positive and false negative rates. The results also show that the active detection agent can detect a bot even before it performs its malicious jobs. The proposed mechanism suits enterprises with cloud environments well in defeating malware.
Current Research Results
Authors: Jhih-Siang Lai, Cheng-Wei Cheng, Allan Lo*, Ting-Yi Sung*, and Wen-Lian Hsu

Abstract:
Background: Since membrane protein structures are challenging to crystallize, computational approaches are essential for elucidating the sequence-to-structure relationships. Structural modeling of membrane proteins requires a multidimensional approach, and one critical geometric parameter is the rotational angle of transmembrane helices. Rotational angles of transmembrane helices are characterized by their folded structures and could be inferred from the hydrophobic moment; however, the folding mechanism of membrane proteins is not yet fully understood. The rotational angle of a transmembrane helix is related to its exposed surface, since lipid exposure gives the degree of accessibility of each residue in the lipid environment. To the best of our knowledge, there have been few advances in investigating whether an environment descriptor such as lipid exposure can be used to infer a geometric parameter such as the rotational angle.

Results: Here, we present an analysis of the relationship between rotational angles and lipid exposure and a support-vector-machine method, called TMexpo, for predicting both structural features from sequences. First, we observed from the development set of 89 protein chains that the lipid exposure, i.e., the relative accessible surface area (rASA) of residues in the lipid environment, generated from high-resolution protein structures could infer the rotational angles with a mean absolute angular error (MAAE) of 46.32°. More importantly, the predicted rASA from TMexpo achieved an MAAE of 51.05°, which is better than the 71.47° obtained by the best of the compared hydrophobicity scales. Lastly, TMexpo outperformed the compared methods in rASA prediction on the independent test set of 21 protein chains and achieved an overall Matthews correlation coefficient, accuracy, sensitivity, specificity, and precision of 0.51, 75.26%, 81.30%, 69.15%, and 72.73%, respectively. TMexpo is publicly available at http://bio-cluster.iis.sinica.edu.tw/TMexpo.

Conclusions: TMexpo can better predict rASA and rotational angles than the compared methods. When rotational angles can be accurately predicted, free modeling of transmembrane protein structures may in turn benefit from reduced complexity in ensembles, with a significantly smaller number of packing arrangements. Furthermore, sequence-based prediction of both rotational angle and lipid exposure can provide essential information when high-resolution structures are unavailable and contribute to experimental design for elucidating transmembrane protein functions.
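The mean absolute angular error (MAAE) reported above must respect the circular nature of rotational angles; since the abstract does not give the exact definition, the sketch below uses the standard wrap-around angular difference as an assumed formulation.

```python
def angular_error(pred, true):
    """Smallest absolute difference between two angles in degrees (0..180)."""
    d = abs(pred - true) % 360.0
    return min(d, 360.0 - d)

def maae(predicted, observed):
    """Mean absolute angular error over paired lists of angles (degrees)."""
    assert len(predicted) == len(observed)
    return sum(angular_error(p, t) for p, t in zip(predicted, observed)) / len(predicted)

# Illustrative values: the 350 vs 5 pair shows why the wrap-around matters.
print(maae([10.0, 350.0, 180.0], [30.0, 5.0, 120.0]))  # (20 + 15 + 60) / 3 ≈ 31.67
```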
Current Research Results
Authors: Tsai, Z.T.Y., Chu, W.Y., Cheng, J.H., and Tsai, H.K.*

Abstract:
Non-B DNA structures are abundant in the genome and are often associated with critical biological processes, including gene regulation, chromosome rearrangement, and genome stabilization. In particular, G-quadruplex (G4) may affect alternative splicing through its ability to impede the activity of RNA polymerase II. However, the specific role of non-B DNA structures in splicing regulation still awaits investigation. Here, we provide a genome-wide and cross-species investigation of the associations between five non-B DNA structures and exon skipping. Our results indicate a statistically significant correlation of each examined non-B DNA structure with exon skipping in both human and mouse. We further show that the contributions of non-B DNA structures to exon skipping are influenced by the region in which they occur. These correlations and contributions also differ significantly between human and mouse. Finally, we detail the effects of G4 by showing that occurrence on the template strand and the length of the G-run, which is highly related to the stability of a G4 structure, are significantly correlated with exon skipping activity. We thus show that, in addition to the well-known effects of RNA and protein structure, the relative positional arrangement of intronic non-B DNA structures may also impact exon skipping.
Current Research Results
"What Distinguish One from Its Peers in Social Networks," Data Mining and Knowledge Discovery, November 2013.
Authors: Yi-Chen Lo, Jhao-Yin Li, Mi-Yen Yeh, Shou-De Lin, Jian Pei

Abstract:
Being able to discover the uniqueness of an individual is a meaningful task in social network analysis. This paper proposes two novel problems in social network analysis: how to identify the uniqueness of a given query vertex, and how to identify a group of vertices that can mutually identify each other. We propose intuitive yet effective methods to identify the uniqueness identification sets and the mutual identification groups with different properties. We then conduct extensive experiments on both real and synthetic datasets to demonstrate the effectiveness of our model.
Current Research Results
"Assembler for de novo assembly of large genomes," Proceedings of the National Academy of Sciences, September 2013.
Authors: Te-Chin Chu, Chen-Hua Lu, Tsunglin Liu, Greg C. Lee, Wen-Hsiung Li, and Arthur Chun-Chieh Shih

Abstract:
"FinancialCloud: Open Cloud Framework of Derivative Pricing," Proceeding International Conference on Economic Computing, September 2013.
Authors: Hsin-Tsung Peng, William W.Y. Hsu, Feipei Lai, and Jan-Ming Ho

Abstract:
Predicting prices and risk measures of assets and derivatives and rating financial products have been widely studied and used by financial institutions and individual investors. In contrast to the centralized and oligopolistic nature of existing financial information services, in this paper we advocate the notion of a FinancialCloud, i.e., an open distributed framework based on cloud computing that hosts modularized financial services, so that these services may easily be integrated on demand to meet users' needs. This cloud-based architecture of modularized financial services provides several advantages. Different types of service providers may participate in the ecosystem on top of the framework. For example, market data resellers may collect and sell long-term historical market data. Statistical analyses of macroeconomic indices, interest rates, and correlations of sets of assets may also be purchased online. Some agencies might be interested in providing services based on rating or pricing values of financial products. Traders may use the statistically estimated parameters to fine-tune their trading algorithms to maximize the profit of their customers. Providers of each service module may focus on the effectiveness, performance, robustness, and security of their innovative products. On the other hand, users pay for exactly what they use to optimally manage their assets. A user may also acquire services through an online agent who is an expert in assessing the structural model and quality of existing products and thus assembles service modules matching the user's risk-taking behavior. In this paper, we also survey related existing technologies and describe a prototype we have developed so far.
Current Research Results
"A Reliability Enhancement Design under the Flash Translation Layer for Flash-Memory Storage Systems," ACM Transactions on Embedded Computing Systems, August 2013.
Authors: Yuan-Hao Chang, Ming-Chang Yang, Tei-Wei Kuo, and Ren-Hung Hwang

Abstract:
Although flash memory has gained very strong momentum in the storage market, the reliability of flash-memory chips has dropped significantly in the past years. This article presents a reliability enhancement design under the flash management layer (i.e., the flash translation layer) to address this concern, so as to reduce the design complexity of flash-memory management software/firmware and to improve the maintainability and portability of existing and future products. In particular, a log-based write strategy with a hash-based caching policy is proposed to provide extra ECC redundancy and performance improvement. Strategies for bad block management are also presented. The failure rate of flash-memory storage systems is analyzed with consideration of bit errors. The proposed design is evaluated by a series of experiments based on realistic traces, showing that the proposed approach could significantly improve the reliability of flash memory with very limited system overheads.
"Responsive Alert Delivery over IP Network," Proceedings of IEEE International Conference on Cyber-Physical Systems, Networks and Applications, July 2013.
Authors: Y. Z. Ou, C. M. Huang, C. T. Hu, E. T. H. Chu, C. S. Shih, J. W. S. Liu

Abstract:

Intelligent Guards against Disasters (iGaDs) are devices and applications deployed pervasively in future smart homes and environments. By receiving and processing standards-based, machine-readable disaster alert messages from authorized senders and taking appropriate actions in response, they aim to help us better cope with natural disasters. The focus of this paper is on responsive delivery of alert messages over an IP network to a huge number of embedded iGaDs. The Asynchronous Message Delivery Service (AMeDS) is designed for this purpose. After presenting an overview of iGaDs and AMeDS, the paper describes an end-to-end scheduling mechanism used by AMeDS and discusses performance limitations in pushing time-critical alerts to iGaDs over the Internet.

Current Research Results
"A Linked-Data Based Virtual Repository for Disaster Management Tools and Applications," WIT Transactions on the Built Environment, July 2013.
Authors: Y. Z. Ou, S. H. Tsai, Y. A. Lai, J. Su, C. W. Yu, C. T. Hsiao, E. T.-H. Chu, K. J. Lin, J. M. Ho and J. W. S. Liu

Abstract:

Nowadays, developed regions of the world, including Taiwan, have a wealth of data and information which, if made available in time for disaster preparedness and response purposes, can help save lives and minimize damage. With a few exceptions, however, the state-of-the-art data management information systems (DMIS) available in most countries do not provide adequate support for the search, discovery, access, and use of data and information residing in independent sources across institutional boundaries. This paper describes the architecture and design of a distributed middleware-level framework, called Virtual Repository (VR), together with the functionalities and structures of its key components. By leveraging linked data and related technologies and tools, VR aims to eliminate this limitation. Two applications, Mobile Assistant for Disasters (MAD) and Automatic Disaster Alert System for Tourists (ADAST), are described to demonstrate the feasibility and effectiveness of VR.

"An Asynchronous Message Delivery Service for iGaDs (intelligent Guards against Disasters)," Proceedings of International Workshop on Extending Seamlessly to the Internet of Things, July 2013.
Authors: Y. Z. Ou, C. M. Huang, C. T. Hu, E. T. H. Chu, C. S. Shih, J. W. S. Liu

Abstract:

This paper describes the architecture and middleware for a system of intelligent Guards against Disasters, called iGaDs for short. iGaDs are smart devices and applications that can receive, authenticate, and process standard-conforming disaster alert messages from authorized senders and respond by taking appropriate actions to help us be better prepared for natural disasters. They are designed to be used ubiquitously as elements of future disaster-prepared smart homes and environments. The prototype prioritized asynchronous alert message delivery service described here is built by using a data bridge to connect Qpid and PubSubHubbub as a way to push alert messages to a large system of iGaDs via the Internet.

"A Systematic Methodology for OS Benchmark Characterization," ACM Research in Adaptive and Convergent Systems (RACS), October 2013.
Authors: Shuo-Hung Chen, Hsiao-Mei Lin, Kuo-Yi Chen, Yuan-Hao Chang, Pen-Chung Yew, and Chien-Chung Ho

Abstract:
Using benchmarks to evaluate operating systems is a common and important approach. However, determining which benchmarks to use for such evaluation requires very careful consideration. It has been found that a seemingly naive change of system configuration or input set could lead to drastic changes in benchmark characteristics, and could also lead to misleading or incorrect results. Some OS benchmark suites may also include too many benchmark programs with very similar characteristics, which could give results biased against, or in favor of, certain kernel behavior. Hence, we need to determine the characteristics of benchmark programs in order to come up with an appropriate benchmark suite for such evaluation, and to interpret the measured results more precisely and correctly. Although many tools have been developed to help profile an OS and characterize its run-time behavior, the data collected by those tools are often very large and complex. It is extremely time-consuming, labor-intensive, and error-prone to analyze the large volume of measured results and to determine the characteristics of a suite of benchmark programs. In this work, we propose to use machine-learning techniques to help analyze and characterize OS benchmark programs based on traced OS kernel events. A systematic methodology is proposed to automatically characterize benchmarks. We found that the characterized OS behavior could help developers choose appropriate applications to benchmark operating systems.
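As a toy illustration of the machine-learning step, each benchmark can be represented as a vector of kernel-event counts and grouped by similarity. The event names, counts, and choice of k-means below are illustrative assumptions rather than the paper's actual feature set or clustering method.

```python
import numpy as np
from sklearn.cluster import KMeans

# Rows: benchmarks; columns: counts of traced kernel events per run
# (benchmark names, event names, and numbers are made up for illustration).
events = ["syscall", "ctx_switch", "page_fault", "block_io", "sched_wakeup"]
profiles = {
    "fileserver": [9.1e6, 3.2e5, 1.1e4, 8.7e5, 2.9e5],
    "webserver":  [8.4e6, 4.0e5, 1.3e4, 7.9e5, 3.3e5],
    "compute":    [1.2e5, 2.1e4, 9.0e3, 1.0e3, 1.8e4],
    "mmap_bench": [2.3e5, 3.0e4, 6.5e5, 2.0e3, 2.2e4],
}

X = np.array(list(profiles.values()))
X = X / X.sum(axis=1, keepdims=True)        # normalize to event-mix proportions
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
for name, label in zip(profiles, labels):
    print(name, "-> cluster", label)        # similar event mixes end up together
```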
"Non-Reference Audio Quality Assessment for Online Live Music Recordings," Proc. ACM International Conference on Multimedia (MM), October 2013.
Authors: Zhonghua Li, Ju-Chiang Wang, Jingli Cai, Zhiyan Duan, Hsin-Min Wang and Ye Wang

Abstract:
Immensely popular video sharing websites such as YouTube have become the most important sources of music information for Internet users and the most prominent platforms for sharing live music. The audio quality of this huge amount of live music recordings, however, varies significantly due to factors such as environmental noise, location, and recording device. Yet most video search engines do not take audio quality into consideration when retrieving and ranking results. Given that most users prefer live music videos with better audio quality, we propose the first automatic, non-reference audio quality assessment framework for live music video search online. We first construct two annotated datasets of live music recordings. The first dataset contains 500 human-annotated pieces, and the second contains 2,400 synthetic pieces systematically generated by adding noise effects to clean recordings. We then formulate the assessment task as a ranking problem and solve it using a learning-based scheme. To validate the effectiveness of our framework, we perform both objective and subjective evaluations. Results show that our framework significantly improves the ranking performance of live music recording retrieval and can prove useful for various real-world music applications.
"Joint Management of Performance-predictable Virtualized Storage Devices with Hard Disk and Flash Memory," ACM Research in Adaptive and Convergent Systems (RACS), October 2013.
Authors: Po-Chun Huang, Yuan-Hao Chang, Tei-Wei Kuo, Chien-Chung Ho, and Hyunseung Choo

Abstract:
Recently, the significance of storage virtualization has been highlighted due to the growing performance demands of next-generation applications. However, with their unpredictable, long seek times and rotational delays, hard disks fail to provide sufficient performance guarantees to fulfill the service-level objectives of these applications, especially on random accesses. To resolve the random-access problem of hard disks, a common approach is to adopt flash memory in a hard-disk-based storage system. Nevertheless, flash memory requires garbage collection to reclaim the space occupied by invalid data, and the garbage collection process might introduce long and unpredictable time overheads and considerably reduce the performance benefits of flash memory. This work proposes a joint management strategy for virtualized storage systems with hard disk and flash memory. The major idea of this work is to exploit flash memory to enhance the random-access performance of the hard disk, while exploiting the hard disk to aid flash memory in avoiding the worst-case delays incurred by garbage collection. Analytical and experimental studies show that the resulting virtualized storage device exhibits surprising behavior: its worst-case performance is better than that of both the hard disk and flash memory alone.
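The joint-management idea can be pictured as a small request dispatcher: random requests go to flash for latency, long sequential runs go to the hard disk, and while flash is busy with garbage collection the disk absorbs requests so the worst case stays bounded. The routing rules and threshold below are illustrative assumptions, not the paper's actual policy.

```python
def dispatch(request, flash_in_gc, sequential_threshold=8):
    """Route a request to 'flash' or 'disk' under the joint-management idea:
    - long sequential runs go to the hard disk, which handles them well;
    - random (short) requests go to flash, unless flash is busy with garbage
      collection, in which case the disk absorbs them to avoid the GC latency spike.
    `request` is (start_block, length_in_blocks); the threshold is illustrative."""
    _, length = request
    if length >= sequential_threshold:
        return "disk"
    return "disk" if flash_in_gc else "flash"

print(dispatch((1000, 64), flash_in_gc=False))  # long sequential run -> disk
print(dispatch((42, 1), flash_in_gc=False))     # small random write  -> flash
print(dispatch((42, 1), flash_in_gc=True))      # flash is collecting -> disk
```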
"Spatial Search for K Diverse-Near Neighbors," ACM Conference on Information and Knowledge Management (ACM CIKM), October 2013.
Authors: G. Ference, W.-C. Lee, H.-J. Hung, and D.-N. Yang

Abstract:
For many location-based service applications that prefer diverse results, finding locations that are spatially diverse and close in proximity to a query point (e.g., the current location of a user) can be more useful than finding the nearest neighbors/locations. In this paper, we investigate the problem of searching for the Diverse-Near Neighbors (DNNs) in spatial space based upon the spatial diversity and proximity of candidate locations to the query point. While employing a conventional distance measure for proximity, we develop a new and intuitive diversity metric based upon the variance of the angles among the candidate locations with respect to the query point. Accordingly, we create a dynamic programming algorithm that finds the optimal DNNs. Unfortunately, the dynamic programming algorithm, with a time complexity of O(kn^3), incurs excessive computational cost. Therefore, we further propose two heuristic algorithms, namely, Distance-based Browsing (DistBrow) and Diversity-based Browsing (DivBrow), that are highly effective while remaining efficient by exploring the search space prioritized by proximity to the query point and by spatial diversity, respectively. Using real and synthetic datasets, we conduct a comprehensive performance evaluation. The results show that DistBrow and DivBrow have superior effectiveness compared to state-of-the-art algorithms while maintaining high efficiency.
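The diversity metric is described as the variance of the angles of the candidate locations around the query point; the abstract does not pin down the exact formula, so the sketch below adopts one plausible reading: sort the candidates by bearing from the query and take the variance of the gaps between consecutive bearings, where a lower variance means a more even (more diverse) spread.

```python
import math
from statistics import pvariance

def bearing(query, point):
    """Angle (degrees, 0..360) of a candidate location as seen from the query point."""
    dx, dy = point[0] - query[0], point[1] - query[1]
    return math.degrees(math.atan2(dy, dx)) % 360.0

def angular_gap_variance(query, candidates):
    """Variance of the angular gaps between consecutive candidates around the query.
    0 means the candidates are spread perfectly evenly (most diverse under this reading)."""
    angles = sorted(bearing(query, c) for c in candidates)
    gaps = [(angles[(i + 1) % len(angles)] - angles[i]) % 360.0
            for i in range(len(angles))]
    return pvariance(gaps)

q = (0.0, 0.0)
even = [(1, 0), (0, 1), (-1, 0), (0, -1)]            # spread in four directions
clustered = [(1, 0), (1, 0.1), (1, -0.1), (0.9, 0)]  # all roughly east of the query
print(angular_gap_variance(q, even), angular_gap_variance(q, clustered))  # 0.0 vs large
```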
"Sampling-Based Phase Classification and Prediction for Multi-threaded Program Execution," 42th International Conference on Parallel Processing (ICPP), September 2013.
Authors: Chin-Hao Chang, Jan-Jan Wu, Pangfeng Liu

Abstract:
Program executions are usually repetitive; hence, we can optimize program executions if we are able to identify the program's repetition pattern. If we can accurately classify program execution intervals into phases, and use such information to accurately predict the next phase at runtime, we will be able to apply suitable optimizations to the subsequent intervals. Such runtime optimization opportunities exist not only in single-threaded programs but also in multi-threaded programs. In this paper, we propose a framework that collects the code signature of each interval from individual threads of a multi-threaded parallel program and classifies them into phases at runtime. We then use the classification results to predict the next phase in an on-line fashion. In order to efficiently classify execution intervals on-line, we only keep a fixed number of representative data in a shared table for classification purposes. The classification is very efficient and uses less than 3K bytes of memory. We also propose a confidence table to improve the prediction rate. Our experimental results with the SPEC OMP2001 and PARSEC benchmark suites on an Intel Xeon multi-core show that our system can classify intervals into phases with high homogeneity, and can predict the next phase with 65% accuracy without the confidence table and 80% accuracy when the confidence table is used.
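A minimal sketch of the on-line classification step: each interval is summarized as a signature vector and either joined to the nearest phase in a bounded table of representatives or made into a new phase. The distance measure, threshold, and table size are illustrative assumptions, not the exact design in the paper.

```python
class PhaseClassifier:
    """On-line phase classification with a fixed-size table of representative signatures."""

    def __init__(self, threshold=0.3, max_phases=16):
        self.threshold = threshold          # max distance to join an existing phase
        self.max_phases = max_phases        # bounded memory for representatives
        self.representatives = []           # one signature vector per phase

    @staticmethod
    def _distance(a, b):
        """Manhattan distance between two normalized signature vectors."""
        return sum(abs(x - y) for x, y in zip(a, b))

    def classify(self, signature):
        """Return the phase id of this interval, creating a new phase if needed."""
        best_id, best_dist = None, float("inf")
        for pid, rep in enumerate(self.representatives):
            d = self._distance(signature, rep)
            if d < best_dist:
                best_id, best_dist = pid, d
        if best_id is not None and best_dist <= self.threshold:
            return best_id
        if len(self.representatives) < self.max_phases:
            self.representatives.append(list(signature))
            return len(self.representatives) - 1
        return best_id  # table full: fall back to the closest existing phase

clf = PhaseClassifier()
print(clf.classify([0.8, 0.1, 0.1]))  # -> 0 (first phase)
print(clf.classify([0.7, 0.2, 0.1]))  # -> 0 (close enough to phase 0)
print(clf.classify([0.1, 0.1, 0.8]))  # -> 1 (new phase)
```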
"A Disturb-Alleviation Scheme for 3D Flash Memory," ACM/IEEE International Conference on Computer-Aided Design (ICCAD), November 2013.
Authors: Yu-Ming Chang, Yuan-Hao Chang, Tei-Wei Kuo, Hsiang-Pang Li, and Yung-Chun Li

Abstract:
Even though 3D flash memory presents a grand opportunity for huge-capacity non-volatile memory, it suffers from serious program disturb problems. Different from past efforts in error correction codes or work that trades space utilization for reliability, we propose a disturb-alleviation scheme that can alleviate the negative effects caused by program disturb, especially inside a block, without introducing extra overheads for encoding or storing extra redundant data. In particular, a methodology is proposed to reduce the data error rate by distributing unavoidable disturb errors over the flash-memory space of invalid data, with consideration of the physical organization of 3D flash memory. A series of experiments was conducted based on real multi-layer 3D flash chips, and it showed that the proposed scheme could significantly enhance the reliability of 3D flash memory.
"Size Does Matter: How Does Image Display Size Affect Aesthetic Perception?," Proceedings of ACM Multimedia 2013, October 2013.
Authors: Wei-Ta Chu, Yu-Kuang Chen, and Kuan-Ta Chen

Abstract:
It is undoubted that an image's content determines how people assess the image's aesthetic level. Previous works have shown that image contrast, saliency features, and the composition of objects may jointly decide whether an image looks good or not. In addition to the "content" of an image, however, the way an image is "presented" to viewers may also affect how much it is appreciated. For example, one might expect a picture always looks better when it is displayed in a larger size. Is this "the-bigger-the-better" rule always true? If not, under what situations does it become invalid?

This paper is devoted to analyzing how an image's resolution (pixels) and physical dimension (inches) affect how much viewers appreciate the image. Based on large-scale aesthetic assessments of 100 images displayed in a variety of resolutions and physical dimensions, we show that an image's display size significantly affects its aesthetic rating in a complicated way; normally a picture looks better at a bigger display size, but it may look relatively worse depending on its content. We develop a set of regression models to predict a picture's absolute and relative aesthetic levels at a given display size based on its content and compositional features, and, simultaneously, we analyze the essential features that lead to the size-dependent property of image aesthetics. We hope that this work will shed some light on future research by revealing that both content and presentation should be considered in image aesthetics evaluation.
Current Research Results
"Multicast with Intra-Flow Network Coding in Multi-Rate Multi-Channel Wireless Mesh Networks," IEEE Trans. on Vehicular Technology, November 2013.
Authors: C.-J Lin and D.-N. Yang

Abstract:
It has been shown that intra-flow network coding can improve multicast throughput in single-channel, single-rate wireless mesh networks. Most existing studies focus on developing a practical intra-flow network-coding multicast protocol. With the maturity of wireless products supporting multi-rate, multi-channel communications, it is becoming increasingly important to improve the network coding gain by utilizing multiple bit-rates and non-overlapping channels, but new challenges also arise due to various tradeoffs. In a multi-rate, multi-channel, multi-hop network, overhearing opportunities and bandwidth utilization are highly correlated with the selection of the channel and rate, which, in turn, determines the achievable throughput of multicast with intra-flow network coding. Specifically, wireless nodes can transmit at higher bit-rates over orthogonal channels to increase the forwarding throughput, but this strategy reduces overhearing opportunities and the coding gain of multicast with intra-flow network coding. To achieve the best trade-off between the above two aspects, we formulate a new optimization model called multi-Rate multi-Channel Multicast with intra-flow Network Coding (RCMNC), which solves the joint channel assignment, rate selection, and flow allocation problems for multi-hop intra-flow network coding multicast. We then reformulate the primal problem as a Lagrangian dual problem and propose an algorithm to approximate the solution of the dual problem in polynomial time. Simulation results show that, by assigning a suitable rate and channel to each network interface, multicast throughput with intra-flow network coding under RCMNC is up to 3.3 times higher than that achieved by the existing approach in a multi-rate, multi-channel wireless mesh network.
Current Research Results
"Structural Diversity for Resisting Community Identification in Published Social Networks," IEEE Trans. on Knowledge and Data Engineering, October 2013.
Authors: C.-H. Tai, P. S. Yu, D.-N. Yang, and M.-S. Chen

Abstract:
Recently, privacy issues concerning individuals in social networks have become serious concerns. Vertex identification is one of the most important problems that has been addressed. In reality, however, each individual in a social network is associated not only with a vertex identity but also with a community identity, which can represent personal information sensitive to the public, such as political party affiliation. This paper first addresses this new privacy issue, referred to as community identification, by showing that the community identity of a victim can still be inferred even when the social network is protected by existing anonymity schemes. For this problem, we then propose the concept of structural diversity to provide anonymity of the community identities. k-Structural Diversity Anonymization (k-SDA) ensures that there are sufficient vertices with the same vertex degree in at least k communities in a social network. We propose an Integer Programming formulation to find optimal solutions to k-SDA and also devise scalable heuristics to solve large-scale instances of k-SDA from different perspectives. Performance studies on real data sets demonstrate the practical utility of the proposed privacy scheme and our anonymization approaches.
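The k-SDA requirement lends itself to a simple check: for an anonymized network to resist community identification, every vertex degree should appear in at least k communities, so that knowing a victim's degree narrows its community down to no fewer than k candidates. The sketch below encodes this simplified reading of the condition; it ignores the "sufficient vertices" aspect and is not the paper's exact formulation.

```python
from collections import defaultdict

def satisfies_k_sda(degree, community, k):
    """Check whether every vertex degree appears in at least k communities,
    so that knowing a victim's degree cannot pin down its community.
    `degree`: vertex -> degree, `community`: vertex -> community id."""
    communities_per_degree = defaultdict(set)
    for v, d in degree.items():
        communities_per_degree[d].add(community[v])
    return all(len(cs) >= k for cs in communities_per_degree.values())

# Toy network: degrees and community labels are made up for illustration.
degree    = {"a": 2, "b": 2, "c": 3, "d": 3, "e": 2}
community = {"a": 0, "b": 1, "c": 0, "d": 1, "e": 1}
print(satisfies_k_sda(degree, community, k=2))  # True: each degree spans 2 communities
```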
"Maximizing Acceptance Probability for Active Friending in On-Line Social Networks," ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (ACM KDD), August 2013.
Authors: D.-N. Yang, H.-J. Hung, W.-C. Lee, and W. Chen

Abstract:
Friending recommendation has successfully contributed to the explosive growth of on-line social networks. Most friending recommendation services today aim to support passive friending, where a user passively selects friending targets from the recommended candidates. In this paper, we advocate recommendation support for active friending, where a user actively specifies a friending target. To the best of our knowledge, a recommendation designed to provide guidance for a user to systematically approach his friending target has not been explored in existing on-line social networking services. To maximize the probability that the friending target will accept an invitation from the user, we formulate a new optimization problem, namely, Acceptance Probability Maximization (APM), and develop a polynomial-time algorithm, called Selective Invitation with Tree and In-Node Aggregation (SITINA), to find the optimal solution. We implement an active friending service with SITINA on Facebook to validate our idea. Our user study and experimental results show that SITINA efficiently outperforms manual selection and the baseline approach in solution quality.
"A Fifty-percent Rule to Minimize the Energy Consumption of PCM-based Storage Systems," IEEE International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA), August 2013.
Authors: Ming-Chang Yang, Martin Kuo, Che-Wei Tsao, and Yuan-Hao Chang

Abstract:
In recent years, phase-change memory (PCM) has drawn a lot of attention because of its byte-addressability and non-volatility. It has become a good alternative storage medium to reduce the performance gap between main memory and secondary storage, but its high energy consumption on writes is a challenging issue in the design of battery-powered mobile computing systems. By utilizing the byte-addressability and the asymmetric read-write energy/latency of PCM, we propose an energy-efficient update scheme with a fifty-percent rule for journaling file systems to reduce energy consumption. This scheme only writes the modified data, instead of the whole updated block, to PCM-based storage devices, while guaranteeing the sanity/integrity of the file system even if the system crashes or a power failure occurs during the process of data updates. A series of experiments based on an implementation on the Linux system was conducted to evaluate the capability of the proposed scheme, and the results are very encouraging.
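The abstract does not state the rule itself, but the name suggests a threshold test: write only the modified bytes when they cover less than half of a block, and fall back to writing the whole block otherwise. The sketch below encodes that reading purely as an assumption; the block size is illustrative.

```python
BLOCK_SIZE = 4096          # bytes per file-system block (illustrative)

def bytes_to_write(modified_ranges):
    """Decide how many bytes to write under a fifty-percent-style rule:
    if the modified bytes cover less than half of the block, write only them;
    otherwise write the whole updated block. Assumed reading of the rule,
    not necessarily the paper's exact policy."""
    modified = sum(end - start for start, end in modified_ranges)
    return modified if modified < BLOCK_SIZE // 2 else BLOCK_SIZE

print(bytes_to_write([(0, 100), (512, 640)]))   # 228 bytes -> write just the deltas
print(bytes_to_write([(0, 3000)]))              # 3000 bytes -> write the full 4 KiB block
```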
Current Research Results
"Coding-Aware Peer-to-Peer Data Repair in Multi-Rate Wireless Networks – A Game Theoretic Analysis," IEEE Journal on Selected Areas in Communications (JSAC), September 2013.
Authors: Hsiao-Chen Lu, Wanjiun Liao, Meng Chang Chen, and Musaed A. Alhussein

Abstract:
Recent research shows that in wireless wide area networks (WWANs), users who subscribe to multicast traffic from the WWAN can exchange network-coded packets with one another via their secondary radio interfaces, such as Wi-Fi, in order to efficiently recover packets lost over the WWAN. Different from existing works, which assume users are cooperative, in this work we model the users as selfish players in a network-coding-based peer-to-peer packet repairing game. To stimulate user cooperation, we introduce a payment-based incentive mechanism in the packet repairing game. The utility function of a user/player is formulated to reflect both the number of useful packets and the available resources. Through analysis of the packet repairing game, we show that the optimal strategy for a user can be derived using only its local information. Moreover, the impact of the pricing rules and the convergence conditions of the packet repairing game are also analyzed. We show theoretically as well as by simulation that, under proper conditions, the packet repairing game converges to the best case, in which each user acquires all of its missing packets. Via computer simulations, we also show that with the proposed selection criteria the packet repairing game is both effective and efficient; not only are the utilities of the players greatly improved, but the convergence time of the game and the utility gain of the players are also comparable to those of the ideal case, in which each user is always willing to forward packets to others.