Research Groups

labeling, called Multi-Q. This tool is designed as a generic platform that can accommodate various input data formats from different mass spectrometers and search engines. This work is in collaboration with the Institute of Chemistry. According to the chemists, our software is the most advanced tool available in the world. In addition, we are developing a tool for ICAT labeling quantitation. In the future, we will adapt our tools to two-plex or multiplexed quantitative analysis using other isotopic labeling strategies. Moreover, visualization tools to assist in the biological interpretation of the data will also be developed.

3. Structural bioinformatics

Protein structure prediction

We have developed a hybrid knowledge-based protein secondary structure prediction algorithm, called HYPROSP II, which combines an existing machine learning approach, PSIPRED, with a new peptide knowledge-based approach. The average prediction accuracy of HYPROSP is around 82%, which is better than that of either PSIPRED or the knowledge-based approach alone. As more protein structures are determined, the knowledge base is expected to grow and the prediction accuracy is expected to increase with it. Related papers appeared in Nucleic Acids Research 2004 and Bioinformatics 2005. We have also adopted more biological domain knowledge and machine learning techniques to address related structure prediction problems, such as local structure, β-turn, and transmembrane helix prediction. Once protein secondary structures can be predicted with improved accuracy, we will aim to predict tertiary structures, with emphasis on the protein fold recognition problem.

Protein 3D structure prediction by fragment assembly

We propose to predict the protein backbone conformation based solely on the sequence information. The objective will be achieved using our previously established fragment library and by further developing new algorithms to dissect the sequence-structure relationship. We have successfully demonstrated that our fragment library provides a good basis set of building blocks for reconstructing and predicting whole protein structures. However, the exact nature of the relationship between a protein's sequence and its structure remains one of the open challenges in computational biology. Discovering this relationship is important and well worth the effort.

NMR backbone resonance assignment and NOE experiment

NMR spectroscopy is one of the popular experiments for determining protein structure. An important stage of protein structure determination by NMR is protein backbone resonance assignment, which is tedious and time-consuming manual work. We have developed an iterative relaxation technique for automatic backbone assignment that can tolerate a huge amount of noise in the data. Our paper was accepted at RECOMB 2005 and was invited for publication in the Journal of Computational Biology. It is the very first paper from Taiwan accepted by RECOMB since its inception nine years ago. A related result based on a genetic algorithm has appeared in Nucleic Acids Research 2005. To extract geometric constraints for the structure calculations from the NMR spectra, we need to consider NOEs and coupling constants, which are transformed into distance and dihedral angle constraints. We shall develop an efficient algorithm for NOE data analysis and use its results to improve backbone assignment. This research is in collaboration with IBMS.

4. Biomedical literature mining

Biological term and relation extraction

The Intelligent Agent Systems Lab (IASL) has developed a system for biological named entity recognition from biomedical literature. We use the Maximum Entropy (ME) model and Conditional Random Fields (CRF) as the underlying machine learning methods, and incorporate dictionary-based and rule-based methods as post-processing of ME to enhance the performance. Once named entities can be recognized, we then aim to recognize relations between named entities. We collaborate with biologists on the problems of recognizing protein-protein interaction relations and gene-disease relations. A related paper has appeared in BMC Bioinformatics 2006.

Genomic information retrieval

IASL participated in the TREC 2005 Genomics Track Ad-hoc Retrieval Contest and won 6th place out of 32 teams. The genomic information retrieval contest combines natural language queries and table search. Due to the variations of biological terms and the large number of unknown medical words, the retrieval task is particularly difficult. The lab has accumulated many years of experience in developing information extraction, retrieval, natural language processing and question answering systems, and obtained an accuracy of 24.53% (the best team obtained 28%). The performance is very close to that of the top five teams: York University, IBM, University of Waterloo, UIUC and National Library of Medicine (NLM). In the first year's work, IASL employed only keyword expansion. In the future the lab will adopt more biological knowledge to enhance system performance.

5. Systems biology

Network analysis of human protein interactions for tumorigenesis and infectious diseases using systems biology

Advances in molecular biology and in analytical and computational technologies are enabling us to systematically investigate complicated molecular processes through the protein interaction networks underlying biological phenotypes. In this study, we will construct the eukaryotic protein-protein interaction network from recent high-throughput interactome studies for various species. All the interactions will be converted into domain-domain interactions, and the conserved network motifs will then be extracted to infer the protein interactome related to human diseases. Using this model, we will build a powerful tool to discover unknown interacting protein pairs with a probability score. Using the conserved network model with spatio-temporal information, the interactions between pathogens and humans and the progression of carcinogenesis will be deciphered. The critical target proteins in those networks will be revealed by topological analysis of the protein network. The interaction network will provide potential candidates for developing new therapeutic strategies for human cancer and infectious diseases. The objectives of this study are to improve our understanding of developmental stages, carcinogenesis and infection mechanisms, and furthermore to introduce a new paradigm for the diagnosis and treatment of human disease that could revolutionize current medical services.
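As a rough illustration of what topological analysis of a protein interaction network can look like, the sketch below ranks proteins by degree centrality, one of the simplest topological measures. The text does not say which measure the study uses, so the choice of degree centrality is an assumption, and the function names and the toy network (proteins P1 to P5) are hypothetical placeholders, not data from the study.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return {protein: degree / (n - 1)} for an undirected interaction list."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions in this sketch
        neighbors[a].add(b)
        neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


def top_hubs(edges, k=3):
    """Return the k highest-scoring proteins, ties broken by name."""
    scores = degree_centrality(edges)
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:k]


# Hypothetical toy network; the protein names are placeholders, not real data.
toy_edges = [
    ("P1", "P2"),
    ("P1", "P3"),
    ("P1", "P4"),
    ("P2", "P3"),
    ("P4", "P5"),
]

hubs = top_hubs(toy_edges, k=2)
print(hubs)
```

In this toy network P1 interacts with three of the four other proteins and is ranked first. A real analysis would likely combine several measures (for example betweenness or motif participation) rather than rely on degree alone.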