NMR data from different
experiments often contain errors, which makes automated
backbone resonance assignment a challenging
problem.
In this paper, we present a method called GANA
that uses a genetic algorithm to automatically
perform backbone resonance assignment with a
high degree of precision and recall. Precision
is the number of correctly assigned residues
divided by the number of assigned residues,
and recall is the number of correctly assigned
residues divided by the number of residues with
known human-curated answers.
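
As an illustration of these two definitions, the following minimal Python sketch computes both rates for a single protein. The function name `precision_recall`, the dictionary layout (residue index mapped to a spin-system label) and the choice to score only residues that have a reference answer are assumptions made for this example; they are not taken from GANA itself.

```python
from typing import Dict, Optional, Tuple


def precision_recall(
    predicted: Dict[int, Optional[str]],
    reference: Dict[int, str],
) -> Tuple[float, float]:
    """Return (precision, recall) for one protein.

    predicted: residue index -> assigned spin-system label, or None if unassigned.
    reference: residue index -> human-curated spin-system label.
    Assumption: only residues present in `reference` are scored.
    """
    # Residues that received an assignment and have a known answer.
    assigned = {
        residue: label
        for residue, label in predicted.items()
        if label is not None and residue in reference
    }
    # Assignments that agree with the curated answer.
    correct = sum(1 for residue, label in assigned.items()
                  if reference[residue] == label)

    precision = correct / len(assigned) if assigned else 0.0
    recall = correct / len(reference) if reference else 0.0
    return precision, recall


# Toy data: four residues with known answers, three assigned, two correct.
reference = {1: "s1", 2: "s2", 3: "s3", 4: "s4"}
predicted = {1: "s1", 2: "s2", 3: "s9", 4: None}
precision, recall = precision_recall(predicted, reference)
```

In this toy case precision is 2/3 (two of three assigned residues are correct) and recall is 2/4 (two of four residues with known answers are recovered), which shows how leaving a residue unassigned lowers recall without lowering precision.
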
GANA takes spin systems as input data and uses
two data structures, candidate lists and adjacency
lists, to assign the spin systems to each amino
acid of a target protein. Using GANA, almost
all spin systems can be mapped correctly onto
a target protein, even if the data are noisy.
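
The abstract does not specify how the candidate lists and adjacency lists are laid out, so the sketch below is only one plausible reading, written in Python. It assumes that a candidate list records which spin systems may occupy each residue position and that an adjacency list records which spin systems may directly follow a given spin system. The names `is_consistent`, `candidate_lists` and `adjacency_lists` are hypothetical, and the genetic algorithm itself is not shown.

```python
from typing import Dict, List, Optional, Sequence


def is_consistent(
    assignment: Sequence[Optional[str]],
    candidate_lists: Dict[int, List[str]],
    adjacency_lists: Dict[str, List[str]],
) -> bool:
    """Check a tentative assignment against both (assumed) data structures.

    assignment[i] is the spin-system label placed at residue position i,
    or None if the position is left unassigned.
    """
    used = [label for label in assignment if label is not None]
    # Each spin system may be placed at most once.
    if len(used) != len(set(used)):
        return False

    for position, label in enumerate(assignment):
        if label is None:
            continue
        # The spin system must be a candidate for this residue position.
        if label not in candidate_lists.get(position, []):
            return False
        # If the next position is assigned, the pair must be linkable.
        if position + 1 < len(assignment):
            following = assignment[position + 1]
            if following is not None and following not in adjacency_lists.get(label, []):
                return False
    return True


# Toy data: three residue positions and three spin systems.
candidate_lists = {0: ["a", "b"], 1: ["b", "c"], 2: ["c"]}
adjacency_lists = {"a": ["b"], "b": ["c"]}

ok = is_consistent(["a", "b", "c"], candidate_lists, adjacency_lists)
bad = is_consistent(["a", "c", "c"], candidate_lists, adjacency_lists)
```

Under these assumptions, a search procedure such as a genetic algorithm could use a check like this, or a graded score derived from it, to compare tentative mappings of spin systems onto the target protein.
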
We use the BioMagResBank (BMRB) dataset (901
proteins) to test the performance of GANA. To
evaluate the robustness of GANA, we generate
four additional datasets from the BMRB dataset
to simulate data errors of false positives,
false negatives and linking errors. We also
use a combination of these three error types
to examine the fault tolerance of our method.
The average precision rates of GANA on BMRB
and the four simulated test cases are 99.61,
99.55, 99.34, 99.35 and 98.60%, respectively.
The average recall rates of GANA on BMRB and
the four simulated test cases are 99.26, 99.19,
98.85, 98.87 and 97.78%, respectively. We also
test GANA on two real wet-lab datasets, hbSBD
and hbLBD. The precision and recall rates of
GANA on hbSBD are 95.12 and 92.86%, respectively,
and those on hbLBD are 100 and 97.40%, respectively.
Demo Site URL: http://bioinformatics.iis.sinica.edu.tw/GANA/