TR-IIS-07-010


ON THE ACCURACY OF TRANSMEMBRANE HELIX PREDICTION METHODS USING AN UPDATED BENCHMARK

Allan Lo, Hua-Sheng Chiu, Ting-Yi Sung, Wen-Lian Hsu

 

Abstract

The prediction of transmembrane (TM) helices and topology is an important field of bioinformatics owing to the difficulty of obtaining high-resolution structures of membrane proteins. Many methods have been developed, and several evaluations have compared the performance of individual methods using benchmarks from various sources. We present an analysis of a popular evaluation benchmark by Kernytsky and Rost, which was created using data sets that are more than six years old. Our analysis shows that the benchmark contains data that disagree substantially with the current annotations in SwissProt Release 54.1. The benchmark also suffers from annotations of low reliability, sequence redundancy, and the presence of signal peptides. We update the benchmark and cleanse it of these issues, and then evaluate eleven widely used methods, including SVMtop, a hierarchical classification method based on support vector machines (SVM). The results show that SVMtop ranks among the top-performing methods for helix prediction, correctly predicting the location of helices for more than 80% of the updated benchmark. Given the discrepancies and noise in the original benchmark, it should be used with discretion when assessing the performance of TM helix prediction methods. The analysis also implies an urgent need for a new benchmark that allows accurate and objective comparison. The updated benchmark is available for public use at http://bio-cluster.iis.sinica.edu.tw/~bioapp/SVMtop/dataset.htm.


Keywords
membrane protein, transmembrane helix, topology prediction, support vector machines, structure prediction