Academic Seminar

Bioinformatics for DNA-seq and RNA-seq experiments

Speaker: Prof. Li-San Wang (Institute for Biomedical Informatics, Genomics and Computational Biology Graduate Group, University of Pennsylvania Perelman School of Medicine)
Host: 蔡懷寬
Time: 2013-08-07 (Wed.) 10:30 ~ 12:00
Venue: Auditorium 106 at IIS new Building

Abstract

Although introduced only very recently, next-generation sequencing (NGS) technology has revolutionized every aspect of molecular biology. It also poses new challenges to computational biology and bioinformatics: how to analyze and store data at the petabyte level, how to develop algorithms that fully integrate with the sequencing protocol, and how to interpret the findings.

This talk gives an overview of two NGS analysis workflows developed by our lab. Both workflows are freely available from our lab website, http://wanglab.pcbi.upenn.edu/content/downloads.

DRAW (DNA Resequencing Analysis Workflow) implements a standard analysis pipeline for whole-genome and whole-exome sequencing (WGS/WES) experiments. A 350 Gbp paired-end WES flowcell can be uploaded and fully analyzed in two days with 110 cores on Amazon Elastic Compute Cloud (EC2). DRAW was used to analyze part of the WES data for a multi-institutional autism study (Neale et al., Nature 2012) and more than 500 exomes/genomes from human and C. elegans in our laboratories.

CoRAL (Classification of RNAs by Analysis of Length; Leung et al., NAR 2013) generates biologically interpretable features, including fragment length, cleavage specificity, and antisense transcription, from small RNA-seq experiments to distinguish between different ncRNA classes. We evaluated CoRAL using genome-wide small RNA sequencing (smRNA-seq) datasets from four human tissue types (brain, skin, liver, and serum). CoRAL classified six different types of RNA transcripts with ~80% accuracy in cross-validation experiments, and with 71-73% accuracy when trained on one tissue type and validated on another.

Institute of Information Science, Academia Sinica
