Journal of Information Science and Engineering, Vol.19 No.6, pp.923-942 (November 2003)


A Data Mining Method to Predict Transcriptional
Regulatory Sites Based on Differentially Expressed Genes
in Human Genome

Hsien-Da Huang1, Huei-Lin Chang4, Tsung-Shan Tsou3,
Baw-Jhiune Liu4 and Jorng-Tzong Horng1,2,*
1Department of Computer Science and Information Engineering
2Department of Life Science
3Institute of Statistics
National Central University
Chungli, 320 Taiwan
4Department of Computer Science and Engineering
Yuan-Ze University
Chungli, 320 Taiwan
*E-mail: horng@db.csie.ncu.edu.tw

Very large-scale gene expression data, namely UniGene and dbEST, can be used to find genes with significantly differential expression in specific tissues. The differentially expressed genes in a specific tissue are potentially regulated concurrently by a combination of transcription factors. This study attempts to mine putative binding sites by examining how combinations of known regulatory site homologs and over-represented repetitive elements are distributed in the promoter regions of the considered groups of differentially expressed genes. We propose a data mining approach to statistically discover the significantly tissue-specific combinations of known site homologs and over-represented repetitive sequences that are distributed in the promoter regions of differentially expressed gene groups. The mined association rules would facilitate the prediction of putative regulatory elements and the identification of genes potentially co-regulated by those elements.
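
As a rough illustration of the kind of association rule the abstract refers to, the following minimal sketch scores combinations of promoter elements by support and confidence with respect to a group of differentially expressed genes. It is not the authors' method: the function name `mine_rules`, the thresholds, the gene identifiers, and the element labels (`siteA`, `siteB`, `repeatX`) are all hypothetical, and the statistical significance testing mentioned in the abstract is omitted.

```python
from itertools import combinations


def mine_rules(promoters, target_genes, min_support=0.3,
               min_confidence=0.6, max_size=2):
    """Score rules of the form {elements in promoter} -> gene is in target group.

    promoters    : dict mapping gene id -> set of element labels found in its promoter
    target_genes : set of gene ids in the differentially expressed group
    Returns a list of (element_combination, support, confidence) tuples.
    All names and thresholds here are illustrative assumptions.
    """
    n_genes = len(promoters)
    all_elements = sorted(set().union(*promoters.values()))
    rules = []
    for size in range(1, max_size + 1):
        for combo in combinations(all_elements, size):
            combo_set = set(combo)
            # Genes whose promoter contains every element of the combination.
            having = [g for g, elems in promoters.items() if combo_set <= elems]
            if not having:
                continue
            both = sum(1 for g in having if g in target_genes)
            support = both / n_genes          # fraction of all genes matching the rule
            confidence = both / len(having)   # fraction of carriers that are in the group
            if support >= min_support and confidence >= min_confidence:
                rules.append((combo, support, confidence))
    return rules


# Toy, made-up input: five genes and the elements observed in their promoters.
promoters = {
    "g1": {"siteA", "siteB", "repeatX"},
    "g2": {"siteA", "siteB"},
    "g3": {"siteA", "repeatX"},
    "g4": {"siteB"},
    "g5": {"siteA", "siteB", "repeatX"},
}
target_genes = {"g1", "g2", "g5"}  # hypothetical tissue-specific group

rules = mine_rules(promoters, target_genes)
strict_rules = mine_rules(promoters, target_genes, min_confidence=0.9)
```

In this toy input, raising `min_confidence` keeps only combinations whose carriers all fall inside the target group, which is the sense in which a combination of elements would point to potentially co-regulated genes.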

Keywords: regulatory site, transcription factor, data mining, gene expression, UniGene, EST

Received November 1, 2002; accepted June 5, 2003.
Communicated by Jenn-Kang Hwang.