2010 Technical Report

Code

TR-IIS-10-001

Subject / Author / Abstract

Protein-Protein Interface Prediction based on a Novel SVM Speedup
Tsu-shu Tseng, Chien-Yang Guo, Wen-Lian Hsu, and Fu Chang

Protein-protein interactions play a crucial role in many cellular processes. Prediction of amino acid residues that appear in interaction sites helps decipher protein functions. Since a significant number of complexes have large enough interfaces, we hypothesize that the complex formation follows the induced-fit mechanism rather than the lock-and-key mechanism. Therefore, one should be able to characterize interface regions by frequent appearances of unstructured or flexible amino acid residues in those regions. For this residue prediction problem, we designed a novel method called "tree decomposition support vector machine" (TDSVM) that can handle large samples. Previously, the sizes of protein chains used as training data were generally in the scope of hundreds, whereas TDSVM extends the number to thousands (4,064 in our case), which yields more than a million samples, represented as feature vectors. Using TDSVM to speed up the training of kernel-based support vector machines (SVMs), at a factor of nearly 300, we were able to perform numerous experiments efficiently to optimize the parameters and feature selection that would otherwise take months. As a result, we achieved prediction outcomes with substantially high scores in F1-measure and Matthews correlation coefficient (MCC) using only protein-sequence information.

View

Fulltext

Code

TR-IIS-10-002

Subject / Author / Abstract

Heterogeneous Subset Sampling
Meng-Tsung Tsai, Da-Wei Wang, Churn-Jung Liau, Tsan-sheng Hsu

In this paper, we consider the problem of heterogeneous subset sampling. Each element in a domain set has different probabilities of being included in a sample, which is a subset of the domain set. Drawing a sample from a domain set of size $n$ takes $O(n)$ time if a Naive algorithm is employed. We propose a Hybrid algorithm, which requires $O(n)$ preprocessing time and $O(n)$ extra space. It draws a sample in $O(n \sqrt{p^{*}})$ time on average where $p^{*}$ is $\min{(p_{\mu}, 1-p_{\mu})}$ and $p_{\mu}$ denotes the mean of inclusion probabilities. In addition to the theoretical analysis, we evaluate the performance of the Hybrid algorithm via experiments and give an application for particle-based simulations on the spread of a disease.

View

Fulltext

Code

TR-IIS-10-003

Subject / Author / Abstract

Recognition of Blurred License Plate Images
Pei-Lun Hsieh, Yu-Ming Liang, Hong-Yuan Mark Liao

We propose a systematic way to perform blurred license plate image recognition. Our method only uses one license plate image and there is no need to perform character segmentation. There are three main steps involved in the proposed system. First, we perform character position identification and corresponding character list estimation using single-character templates. Then, the position of special symbol on a license plate is estimated. Finally, we expand the templates from single-character templates to multiple-character templates for refining the recognition results. The experiment results show that our method is superb in recognizing characters of blurred license plate images.

View

Fulltext

Code

TR-IIS-10-004

Subject / Author / Abstract

Functional Specifications of the ezTrial Clinical Trial Information Management System
Hong-Yi Chen , Chung-Tsuo Lin, Chia-Hui Chang, Yi-Ru Cian , Yi-Fang Lee, and Chun-Nan Hsu

View

Fulltext

Code

TR-IIS-10-005

Subject / Author / Abstract

User Guide of the ezTrial (beta) Clinical Trial Information Management System
Yi-Ru Cian , Chung-Tsuo Lin , Hong-Yi Chen, Chia-Hui Chang, Yi-Fang Lee, and Chun-Nan Hsu

View

Fulltext

Code

TR-IIS-10-006

Subject / Author / Abstract

Homomorphic Encryption-based Secure SIFT for Privacy-Preserving Feature Extraction
Chao-Yung Hsu, Chun-Shien Lu, Soo-Chang Pei

Privacy has received much attention but is still largely ignored in the multimedia community. Consider a cloud computing scenario, where the server is resource-abundant and is capable of finishing the designated tasks, it is envisioned that secure media retrieval and search with privacy-preserving will be seriously treated. In view of the fact that scale-invariant feature transform (SIFT) has been widely adopted in various fields, this paper is the first to address the problem of secure SIFT feature extraction and representation in the encrypted domain. Since all the operations in SIFT must be moved to the encrypted domain, we propose a homomorphic encryption-based secure SIFT method for privacy-preserving feature extraction and representation based on Paillier cryptosystem. In particular, homomorphic comparison is a must for SIFT feature detection but is still a challenging issue for homomorphic encryption methods. To conquer this problem, we investigate a quantization-like secure comparison strategy in this paper. Experimental results demonstrate that the proposed homomorphic encryption-based SIFT performs comparably to original SIFT on image benchmarks, while preserving privacy additionally. We believe that this work is an important step toward privacy-preserving multimedia retrieval in an environment, where privacy is a major concern.

View

Fulltext

Code

TR-IIS-10-007

Subject / Author / Abstract

Compressing Trajectories Using Inter-Frame Coding
Cheng-Yu Lin, Hung-Chia Chen, Ying-Yu Chen, Wang-Chien Lee, and Ling-Jyh Chen

With the advances in wireless communications and GPS technology, there is increasing interest in the field of location-aware services. Because of the proliferation of GPS-enabled devices and applications, in this study, we address the scalability issue in trajectory data management. Specifically, we propose a scheme called Inter-Frame Coding (IFC) for lossless compression of trajectory data, and implement two classical database queries based on the scheme. Evaluations of the IFC scheme using real trajectory datasets show that it can achieve a compression ratio of 58%. Moreover, it can reduce the computational complexity of range queries by a factor of 0.45, while maintaining an acceptable execution time in k-nearest neighbor searches. The IFC scheme is simple, efficient, and lossless; thus, it has the potential to facilitate trajectory-based data storage, compression, and computation.

View

Fulltext

Code

TR-IIS-10-008

Subject / Author / Abstract

A Model and Simulation Environment for Symbiotic Automation and Assistive Devices
T. Y. Chen, P. H. Tsai, C. H. Chen, C. W. Yu, C.S. Shih and J. W. S. Liu

This paper describes the UCAADS simulation environment and the underlying UCAADS model that have been developed for the purpose of evaluating the correctness and performance of UCAADS and user-device interactions. The acronym UCAADS stands for user-centric automation and assistive devices/systems and services. Examples of UCAADS are medication dispenser, smart pantry, robotic housekeeping aids, and mobility assistants. UCAADS model combines two types of modeling elements: workflow model and GOMS model. The behavior specification of a symbiotic system of device and its user(s) as a whole consists of specifications of device operations, user actions and user-device interactions. They are defined in terms of workflows and are executable. The incorporation of GOMS model with workflows enables us to account for different behavior and skill levels of different users in the estimation of execution times of their actions. As case studies, we modeled and simulated parts of three UCAADS: smart medication dispenser for home use, smart storage pantry and multi-user medication station. These devices require their users to carry out mission-critical operations. Our simulation experimentations and the results demonstrate that the UCAADS model and USE are effective in helping us discover and fix design and implementation errors that allow incorrect user-device interactions, in addition to assessing the responsiveness of devices.

View

Fulltext

Code

TR-IIS-10-009

Subject / Author / Abstract

Programming from Galois Connections - Principles and Applications
Shin-Cheng Mu and José Nuno Oliveira

Problem statements often resort to superlatives such as in eg. "... the smallest such number", "...the best approximation", "... the longest such list" which lead to specifications made of two parts: one defining a broad class of solutions (the easy part) and the other requesting one particular such solution optimal in some sense (the hard part). This report introduces a binary relational combinator which mirrors this linguistic structure and exploits its potential for calculating programs by optimization. This applies in particular to specifications written in the form of Galois connections, in which one of the adjoints delivers the optimal solution. The framework encompasses re-factoring of results previously developed by by Bird and de Moor for greedy and dynamic programming, in a way which makes them less technically involved and therefore easier to understand and play with.

View

Fulltext

Institute of Information Science, Academia Sinica

Library

2010 Technical Report