Title / Authors / Abstract
Construction of Gene Clusters Resembling Genetic Causal Mechanisms for Common Complex Disease with an Application to Young-Onset Hypertension
Ke-Shiuan Lynn, Chen-Hua Lu, Han-Ying Yang, Wen-Lian Hsu and Wen-Harn Pan
Motivation: Lack of power and reproducibility are common shortcomings of genetic association studies of common complex diseases. Indeed, the heterogeneity of disease etiology demands that causal models consider the simultaneous involvement of multiple genes. Rothman’s sufficient-cause model, which is well known in epidemiology, provides a framework for such a concept. In the present work, we developed a three-stage algorithm to construct gene clusters resembling Rothman’s causal model for a complex disease, starting by identifying influential gene pairs and then grouping homogeneous pairs.
Result: The algorithm was trained and tested on 2,772 hypertensives and 6,515 normotensives extracted from four large Caucasian and Taiwanese databases. The constructed clusters, each featuring a major gene that interacts with many other genes and identifies a distinct group of patients, were reproduced in both ethnic populations and across three genotyping platforms. We present the 14 largest gene clusters, which were capable of identifying 19.3% of hypertensives across all the datasets and 41.8% if one dataset was excluded for lack of phenotype information. Although a few normotensives were also identified by the gene clusters, they usually carried fewer risky combinatory genotypes (insufficient causes) than their hypertensive counterparts. After establishing a cut-off percentage for risky combinatory genotypes in each gene cluster, the 14 gene clusters achieved a classification accuracy of 82.8% for all datasets and 98.9% if the information-short dataset was excluded. Furthermore, not only 9 of the 14 major genes but also many other contributing genes in the clusters are associated with hypertension-related functions. Our results provide insights into the polygenic aspects of hypertension etiology.
Availability: Supplementary Data Files and MATLAB files that generate Figs. 3-5 are available at http://ms.iis.sinica.edu.tw/genetic_causal_pies/index.htm.
Contact: firstname.lastname@example.org or email@example.com
Keywords: genetic causal pie, sufficient cause, data-mining, young-onset hypertension, complex disease
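The cut-off rule described in the abstract can be illustrated with a minimal sketch: a gene cluster flags a subject as hypertensive when the fraction of that cluster's risky combinatory genotypes the subject carries reaches the cluster's cut-off percentage. This is not the authors' implementation; all gene names, genotype encodings, and the 60% cut-off below are hypothetical.

```python
# Illustrative sketch (not the paper's implementation) of classification by
# cut-off percentage of risky combinatory genotypes. All names and numbers
# here are hypothetical placeholders.

def risky_fraction(subject_genotypes, cluster_risky_genotypes):
    """Fraction of a cluster's risky (gene-pair, genotype) combinations
    that the subject actually carries."""
    carried = sum(
        1 for pair, genotype in cluster_risky_genotypes
        if subject_genotypes.get(pair) == genotype
    )
    return carried / len(cluster_risky_genotypes)

def classify(subject_genotypes, clusters):
    """Predict hypertensive if any cluster's risky-genotype fraction
    reaches that cluster's cut-off percentage."""
    return any(
        risky_fraction(subject_genotypes, c["risky"]) >= c["cutoff"]
        for c in clusters
    )

# Hypothetical cluster: three risky gene-pair genotypes, 60% cut-off.
cluster = {
    "risky": [(("AGT", "ACE"), "AA/GG"),
              (("AGT", "ADD1"), "AG/TT"),
              (("AGT", "NOS3"), "AA/CT")],
    "cutoff": 0.6,
}
subject = {("AGT", "ACE"): "AA/GG", ("AGT", "ADD1"): "AG/TT"}
print(classify(subject, [cluster]))  # carries 2/3 >= 0.6 -> True
```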
Title / Authors / Abstract
A Dynamic Binary Translation System in a Client/Server Environment
Ding-Yong Hong, Chun-Chen Hsu, Chao-Rui Chang, Jan-Jan Wu, Pen-Chung Yew, Wei-Chung Hsu, Pangfeng Liu, Chien-Min Wang, Yeh-Ching Chung.
With rapid advances in mobile computing, multi-core processors and expanded memory resources are being made available in new mobile devices. This trend will enable a wider range of existing applications to be migrated to mobile devices, for example, transparently running desktop applications in IA-32 (x86) binaries on ARM-based mobile devices using dynamic binary translation (DBT). However, the overall performance could significantly affect the energy consumption of the mobile devices because it is directly linked to the number of instructions executed and the overall execution time of the translated code. Hence, even though the capability of today’s mobile devices will continue to grow, concerns over translation efficiency and energy consumption will put more constraints on a DBT for mobile devices, particularly for thin mobile clients, than on a DBT for servers. With increasing network accessibility and bandwidth in various environments, many network servers have become highly reachable to thin mobile clients. Those network servers are usually equipped with a substantial amount of resources. This opens an opportunity for DBT on thin clients to leverage such powerful servers. However, designing such a DBT for a client/server environment requires many critical considerations.
In this work, we looked at those design issues and developed a distributed DBT system based on the client/server model. The proposed DBT system consists of two dynamic binary translators: an aggressive dynamic binary translator/optimizer that serves the translation/optimization requests from thin clients runs on the server, while a thin DBT that performs light-weight binary translation and basic emulation functions runs on each thin client. With such a two-translator client/server approach, we successfully off-load the DBT overhead of the thin client to the server and achieve significant performance improvement over the non-client/server model. Experimental results show that the DBT of the client/server model achieves a 14% speedup over that of the non-client/server model for x86-32 to ARM emulation using SPEC CINT2006 benchmarks with test inputs, and is only 3.4X and 2.2X slower than native execution with test and reference inputs, respectively, as opposed to the 7.1X and 5.1X slow-downs on QEMU.
Title / Authors / Abstract
Improving Region Selection Through Early-Exit Detection
Chun-Chen Hsu, Pangfeng Liu, Jan-Jan Wu, Chien-Min Wang, Ding-Yong Hong, Wei-Chung Hsu
Many dynamic binary translation (DBT) systems and just-in-time compilers target traces, i.e., frequently taken execution paths, as code regions to be translated and optimized. The Next-Executing-Tail (NET) trace selection method used in HP Dynamo is an early example of such techniques, and many current trace optimization schemes are variations of NET. These NET-like trace optimizations work very well for most traces, but they all suffer from the same problem: the selected traces may contain a large number of early exits that can branch out in the middle of a trace. If early exits are taken frequently during program execution, the benefit of trace optimization could be lost to the overhead of costly compensation code in the trace epilogue. We refer to traces/regions with frequently taken early exits as delinquent traces/regions. Our empirical study shows that at least 9 of the 12 SPEC CPU2006 integer benchmarks have delinquent traces; i.e., if we use NET to select traces, each of these nine benchmarks will take more than 100 early exits per million executed instructions in their traces.
In this paper, we significantly improve the performance of NET by merging delinquent traces into larger code regions. We propose a light-weight region formation technique called Early-Exit Guided region selection (EEG), which iteratively detects delinquent regions and merges them into larger code regions. Hardware-assisted dynamic profiling is first used to identify hot code regions without incurring significant runtime overhead. Software counters are then instrumented at the exit points of the hot regions to detect early exits. When a counter exceeds a certain threshold, the code region that begins at the branch target of that early exit is merged into the main code region. We also employ a heuristic to decide whether merging the selected regions is beneficial: two regions are not merged if the spill-code cost of the merged code would be too high.
We implement our EEG algorithm in two LLVM-based parallel dynamic binary translators, which target the ARM and IA-32 instruction set architectures (ISAs), respectively, and both of which use multiple compilation threads to compile different code regions concurrently. We evaluate the performance of EEG with two benchmark suites: the single-threaded SPEC CPU2006 benchmarks with reference inputs, and the multi-threaded PARSEC benchmarks with native inputs. The experimental results show that, compared to NET, EEG achieves a performance improvement of up to 67% (13% on average) for the SPEC CPU2006 integer benchmarks, and up to 20% (10% on average) for the PARSEC multi-threaded benchmarks.
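The counter-and-merge mechanism described above can be sketched in a few lines. This is a simplified model under our own assumptions, not the paper's implementation: the threshold value, the spill-cost heuristic, and the region representation below are all hypothetical.

```python
# Minimal sketch (assumptions, not the paper's code) of EEG's merge decision:
# software counters on each early exit of a hot region, and a merge once a
# counter crosses a threshold, subject to a spill-cost heuristic.

EXIT_THRESHOLD = 100      # hypothetical early-exit count threshold
MAX_SPILL_COST = 50       # hypothetical bound on acceptable spill-code cost

class Region:
    def __init__(self, name, exits):
        self.name = name
        self.exits = exits            # {early_exit_branch_target: taken_count}
        self.merged = set()           # names of regions merged into this one

def spill_cost(region, other):
    # Placeholder for the register-pressure heuristic; a real DBT would
    # estimate the spill code generated when compiling the merged region.
    return len(region.merged) * 10

def maybe_merge(region, regions_by_entry):
    """Merge the region starting at a hot early-exit target into `region`
    when its counter exceeds the threshold and the cost heuristic allows."""
    for target, count in list(region.exits.items()):
        if count <= EXIT_THRESHOLD:
            continue
        victim = regions_by_entry.get(target)
        if victim is None or spill_cost(region, victim) > MAX_SPILL_COST:
            continue                  # heuristic says merging is not beneficial
        region.merged.add(victim.name)
        del region.exits[target]      # the exit is now internal to the region

# Hypothetical example: region A frequently early-exits to the region at 0x2000.
a = Region("A", {0x2000: 150, 0x3000: 5})
b = Region("B", {})
maybe_merge(a, {0x2000: b})
print(sorted(a.merged))   # ['B']
```

The cold exit at 0x3000 stays instrumented, so later iterations can still detect and merge it if it becomes delinquent, mirroring the iterative detection described in the abstract.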
Title / Authors / Abstract
Ubiquitous Smart Devices and Applications for Disaster Preparedness
W. P. Liao, Y. Z. Ou, E. T. H. Chu, C. S. Shih and J. W. S. Liu
Title / Authors / Abstract
Ranking and Selecting Features Using an Adaptive Multiple Feature Subset Method
Fu Chang and Chan-Cheng Liu
Title / Authors / Abstract
A Novel Approach for Efficient Big Data Broadcasting
Chi-Jen Wu, Chin-Fu Ku, Jan-Ming Ho and Ming-Syan Chen
Big Data computing is a new critical challenge for the ICT industry. Engineers and researchers are dealing with data sets of petabyte scale in the cloud computing paradigm, so the demand for building a service stack to distribute, manage and process massive data sets has risen drastically. In this paper, we investigate the Big Data Broadcasting problem, in which a single source node broadcasts a big chunk of data to a set of nodes with the objective of minimizing the maximum completion time. These nodes may be located in the same datacenter or across geo-distributed datacenters. This problem is one of the fundamental problems in distributed computing and is known to be NP-hard in heterogeneous environments. We model the Big Data Broadcasting problem as a LockStep Broadcast Tree (LSBT) problem.
The main idea of the LSBT model is to define a basic unit of upload bandwidth, r, such that a node with capacity c broadcasts data to a set of ⌊c/r⌋ children at rate r. Note that r is a parameter to be optimized as part of the LSBT problem. We further divide the broadcast data into m chunks, which can then be broadcast down the LSBT in a pipelined manner. In a homogeneous network environment in which each node has the same upload capacity c, we show that the optimal uplink rate r of LSBT is either c/2 or c/3, whichever gives the smaller maximum completion time. For heterogeneous environments, we present an O(n log² n) algorithm to select an optimal uplink rate r and to construct an optimal LSBT. Numerical results show that our approach achieves a smaller maximum completion time with lower computational complexity than other efficient solutions in the literature.
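The trade-off behind the c/2-versus-c/3 result can be illustrated with a rough back-of-the-envelope model. This is our own simplified sketch, not the paper's analysis: it assumes a complete d-ary tree with d = ⌊c/r⌋, concurrent sends to all children at rate r each, and a simple pipelining formula in which the last of m chunks reaches depth h after h + m − 1 chunk-transfer times.

```python
import math

# Rough illustrative model (our assumptions, not the paper's exact analysis)
# of LSBT completion time in a homogeneous network: each node has upload
# capacity c, picks a per-link rate r, and serves d = floor(c / r) children
# concurrently; data of size S is split into m chunks and pipelined down a
# complete d-ary tree.

def tree_height(n, d):
    """Smallest height h such that a complete d-ary tree of height h
    holds at least n nodes."""
    h, nodes, level = 0, 1, 1
    while nodes < n:
        h += 1
        level *= d
        nodes += level
    return h

def completion_time(n, c, r, S=1.0, m=64):
    d = int(c // r)
    if d < 1:
        return math.inf
    per_hop = (S / m) / r       # time to forward one chunk over one link
    h = tree_height(n, d)
    # Pipelining: the last chunk reaches the deepest level after h hops
    # plus the (m - 1) chunks queued ahead of it.
    return (h + m - 1) * per_hop

# The homogeneous optimum is c/2 or c/3 (the paper's result); a larger r
# means faster links but a bushier, hence shallower, tree is impossible,
# so the two effects trade off. Compare the two candidates:
n, c = 10_000, 1.0
t2 = completion_time(n, c, c / 2)   # binary tree: faster links, deeper tree
t3 = completion_time(n, c, c / 3)   # ternary tree: slower links, shallower tree
print(round(min(t2, t3), 3))        # -> 2.375 (r = c/2 wins here)
```

Under this toy model, which of c/2 or c/3 wins depends on n and m, matching the abstract's "whichever gives the smaller maximum completion time."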
Title / Authors / Abstract
A Framework for Fusion of Symbiotic Human Sensor and Physical Sensor Data
J. W. S. Liu, E. T.-H. Chu, and P. H. Tsai