Institute of Information Science, Academia Sinica
Title: TIGP (BIO) -- Handling the heterogeneity in genomic datasets
Speaker: Dr. Yingying Wei (魏穎穎) (Department of Statistics, The Chinese University of Hong Kong)
Time: 2017-08-16 (Wed) 10:00 – 11:30
Venue: Auditorium 101, New Building, Institute of Information Science
Host: TIGP Bioinformatics Program
Abstract:

High-throughput experimental data are accumulating exponentially in public databases. However, mining valid scientific discoveries from these abundant resources is hampered by technical artifacts and inherent biological heterogeneity. Ignoring heterogeneity leads not only to low statistical power but often also to misleading scientific conclusions. In this talk, I will present two examples as illustrations. In the first part, we propose a novel Bayesian hierarchical model to correct batch effects when sample groupings are unknown. We prove that the model is identifiable and provide conditions on study designs under which batch effects can be corrected. Applying the proposed model to a real breast cancer dataset combined from three batches measured on two platforms offers much better biological insights than existing methods. In the second part, I will discuss transcription factor (TF) networks. TF networks are dynamic across diverse biological conditions and heterogeneous across the genome within each biological condition. We propose a Bayesian nonparametric dynamic Poisson graphical model for valid inference on heterogeneous TF networks. We develop an efficient parallel Markov chain Monte Carlo algorithm for posterior computation and use it to study TF associations in ENCODE cell lines.