Chuen-Der Huang+, Sheng-Fu Liang, Chin-Teng Lin and Ruei-Cheng Wu
+Department of Electrical Engineering
Hsiuping Institute of Technology
Taichung, 412 Taiwan
Department of Electrical and Control Engineering
National Chiao Tung University
Hsinchu, 300 Taiwan
E-mail: ctlin@mail.nctu.edu.tw
In machine learning, both the choice of network and the selection of features are
important factors that should be considered carefully, since both influence the result,
for better or worse. In bioinformatics, the number of features may be too large for
machine learning to be practical. In this study, we introduce feature selection to
bioinformatics problems. We use neural networks in which each input node is associated
with a gate. At the beginning of the
training, all gates are almost closed, so at this point no features are allowed to enter
the network. During the training phase, gates are opened or closed as required. After the
selection training phase is complete, gates corresponding to helpful features are
completely open, while gates corresponding to useless features are closed more tightly.
Some gates may be partially open, depending on the importance of the corresponding
features. The network can thus not only select features in an online manner during
learning, but also perform some feature extraction. We combine
feature selection with our novel hierarchical machine learning architecture and apply it
to multi-class protein fold classification. At the first level, the network classifies the
data into four major folds: all alpha, all beta, alpha + beta and alpha/beta. At the next
level, another set of networks further classifies the data into twenty-seven
folds. This approach has several benefits. The gating network drastically reduces the
number of features: at the first level, using just 50 features selected by the gating
network, we obtain a test accuracy comparable to that of neural classifiers using 125
features. The process also gives better insight into the folding process. For example, by
tracking the evolution of the different gates, we can find which characteristics
(features) of the data are more important for the folding process. Selecting fewer
features also reduces the computation time. Finally, the hierarchical architecture helps
achieve better performance.
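
The gating idea described above can be illustrated with a minimal sketch. It is not the
authors' network: it assumes each gate is a sigmoid of a trainable parameter that
multiplies its input feature, and it uses a single logistic output unit on synthetic data
in which only feature 0 carries class information. The names sigmoid, make_toy_data and
train_gated_classifier, and all hyperparameter values, are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Clip the argument so math.exp never overflows.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def make_toy_data(n=200, d=4, seed=0):
    # Synthetic data (an assumption for illustration): only feature 0 decides the label.
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    y = [1.0 if row[0] > 0 else 0.0 for row in X]
    return X, y


def train_gated_classifier(X, y, steps=1000, lr=0.5, gate_init=-3.0):
    # Each input i is scaled by a gate sigmoid(g[i]) before reaching the weight w[i].
    # gate_init = -3 gives gates of about 0.047, i.e. almost closed at the start.
    n, d = len(X), len(X[0])
    g = [gate_init] * d
    w = [0.1] * d
    b = 0.0
    initial_gates = [sigmoid(v) for v in g]
    for _ in range(steps):
        s = [sigmoid(v) for v in g]
        corr = [0.0] * d  # mean of (prediction error * feature value), per feature
        err_sum = 0.0
        for row, target in zip(X, y):
            logit = b + sum(w[i] * s[i] * row[i] for i in range(d))
            err = sigmoid(logit) - target  # cross-entropy gradient w.r.t. the logit
            err_sum += err
            for i in range(d):
                corr[i] += err * row[i] / n
        for i in range(d):
            grad_w = corr[i] * s[i]
            grad_g = corr[i] * w[i] * s[i] * (1.0 - s[i])
            w[i] -= lr * grad_w
            g[i] -= lr * grad_g
        b -= lr * err_sum / n
    return initial_gates, [sigmoid(v) for v in g]
```

Calling train_gated_classifier(*make_toy_data()) returns the gate values before and after
training. In this toy setting the gate on the informative feature opens further than the
gates on the noise features, so the trained gate values can be read as a rough ranking of
feature importance. The sketch has no term that pushes useless gates further shut, and the
gate function and training schedule of the actual network may differ.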
Received June 3, 2003; revised March 24 & June 1, 2004; accepted July 15, 2004.
Communicated by Chuen-Tsai Sun.
*This work was supported in part by the Brain Research Center, University System of Taiwan, under Grant
92B-711.