
Research Groups

labeling, called Multi-Q. This tool is designed as a generic platform that can accommodate various input data formats from different mass spectrometers and search engines. This work is in collaboration with the Institute of Chemistry. According to our collaborating chemists, our software is the most advanced tool of its kind available in the world. In addition, we are developing a tool for ICAT labeling quantitation. In the future, we will adapt our tools to two-plex or multiplexed quantitative analysis using other isotopic labeling strategies. Moreover, visualization tools to assist in the biological interpretation of the data will also be developed.

3. Structural bioinformatics

Protein structure prediction

We have developed a hybrid knowledge-based protein secondary structure prediction algorithm, called HYPROSP II, which combines an existing machine learning approach, PSIPRED, with a new peptide knowledge-based approach. The average prediction accuracy of HYPROSP is around 82%, which is better than that of either PSIPRED or the knowledge-based approach alone. As more protein structures are determined, the knowledge base is expected to grow and the prediction accuracy is also expected to increase. Related papers appeared in Nucleic Acids Research 2004 and Bioinformatics 2005. We have also adopted more biological domain knowledge and machine learning techniques to predict related structural features, such as local structure, β-turns, and transmembrane helices. Once protein secondary structures can be predicted with improved accuracy, we will aim to predict tertiary structures, with emphasis on the protein fold recognition problem.

Protein 3D structure prediction by fragment assembly

We propose to predict the protein backbone conformation based solely on sequence information. This objective will be achieved by using our previously established fragment library and by further developing new algorithms to dissect the sequence-structure relationship. We have successfully demonstrated that our fragment library provides a good basis set of building blocks for reconstructing and predicting whole protein structures. However, the exact nature of the relationship between a protein's sequence and its structure remains one of the open challenges in computational biology. Discovering this relationship is important and well worth the effort.

NMR backbone resonance assignment and NOE experiment

NMR spectroscopy is one of the popular experimental techniques for determining protein structure. An important stage of protein structure determination by NMR is protein backbone resonance assignment, which is a tedious and time-consuming manual task. We have developed an iterative relaxation technique for automatic backbone assignment that can tolerate a large amount of noise in the data. Our paper was accepted at RECOMB 2005 and was invited for publication in the Journal of Computational Biology. It is the very first paper from Taiwan accepted by RECOMB since its inception nine years ago. A related result based on a genetic algorithm has appeared in Nucleic Acids Research 2005. To extract geometric constraints for the structure calculations from the NMR spectra, we need to consider NOEs and coupling constants, which are transformed into distance and dihedral angle constraints. We shall develop an efficient algorithm for NOE data analysis and use the results of this analysis to improve backbone assignment. This research is in collaboration with IBMS.

4. Biomedical literature mining

Biological term and relation extraction

The Intelligent Agent Systems Lab (IASL) has developed a system for biological named entity recognition from biomedical literature. We use the Maximum Entropy (ME) model and Conditional Random Fields (CRF) as the underlying machine learning methods, and incorporate dictionary-based and rule-based methods as post-processing of ME to enhance performance. Once named entities can be recognized, we aim to recognize relations between them. We collaborate with biologists on the problems of recognizing protein-protein interaction relations and gene-disease relations. A related paper has appeared in BMC Bioinformatics 2006.

Genomic information retrieval

IASL participated in the TREC 2005 Genomics Track Ad-hoc Retrieval Contest and won 6th place out of 32 teams. The genomic information retrieval contest combines natural language queries and table search. Due to the variations of biological terms and the large number of unknown medical words, the retrieval task is particularly difficult. The lab has accumulated many years of experience in developing information extraction, retrieval, natural language processing, and question answering systems, and obtained an accuracy of 24.53% (the best team achieved 28%). This performance is very close to that of the top five teams: York University, IBM, University of Waterloo, UIUC, and the National Library of Medicine (NLM). In its first year of work, IASL employed only keyword expansion. In the future, the lab will adopt more biological knowledge to enhance system performance.

5. Systems biology

Network analysis of human protein interactions for tumorigenesis and infectious diseases using systems biology

Advances in molecular biology and in analytical and computational technologies are enabling us to systematically investigate complicated molecular processes through the protein interaction networks underlying biological phenotypes. In this study, we will construct the eukaryotic protein-protein interaction network from recent high-throughput interactome studies for various species. All the interactions will be converted into domain-domain interactions, and the conserved network motifs will then be extracted to infer the protein interactome related to human diseases. Using this model, we will build a powerful tool to discover unknown interacting protein pairs, each with a probability score. Using the conserved network model together with spatio-temporal information, we will decipher the interactions between pathogens and humans and the progression of carcinogenesis. The critical target proteins in those networks will be revealed by topological analysis of the protein network. The interaction network will provide potential candidates for developing new therapeutic strategies for human cancer and infectious diseases. The objectives of this study are to improve our understanding of the puzzles of the developmental stage, carcinogenesis, and infection mechanisms, and furthermore to introduce a new paradigm for the diagnosis and treatment of human disease that could revolutionize the medical services currently delivered.
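
As an illustration of the kind of topological analysis of a protein network mentioned above, the following minimal Python sketch ranks proteins by degree centrality, one simple topological measure. It is not the group's actual method: the function names and the toy network (proteins P1 to P5) are hypothetical placeholders chosen only for illustration.

```python
from collections import defaultdict


def degree_centrality(interactions):
    """Return {protein: normalized degree} for an undirected interaction list.

    `interactions` is an iterable of (protein_a, protein_b) pairs.
    The degree is divided by (n - 1), where n is the number of proteins.
    """
    neighbors = defaultdict(set)
    for a, b in interactions:
        if a == b:
            continue  # assumption: self-interactions are ignored in this sketch
        neighbors[a].add(b)
        neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {p: 0.0 for p in neighbors}
    return {p: len(nbrs) / (n - 1) for p, nbrs in neighbors.items()}


def top_hubs(interactions, k=3):
    """Return the k highest-degree proteins as (protein, score) pairs."""
    scores = degree_centrality(interactions)
    # Sort by descending score, then by name so that ties are deterministic.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


# Hypothetical toy network; the names are placeholders, not real proteins.
toy_network = [
    ("P1", "P2"),
    ("P1", "P3"),
    ("P1", "P4"),
    ("P2", "P3"),
    ("P4", "P5"),
]

hubs = top_hubs(toy_network, k=2)
print(hubs)
```

In this toy network P1 interacts with three of the four other proteins, so it ranks first. In practice, highly connected proteins are only one possible notion of a critical target; other measures, such as betweenness centrality, could be substituted without changing the overall structure of the sketch.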




