Centre for Artificial Intelligence of UNL

Publication details

Main information
Title: Mining Protein Structure Data
Date: December 2007
Key: SaBK07
Abstract: This paper describes the application of machine learning algorithms to the discovery of knowledge in a protein structure database. The problem addressed is the determination of the solvent exposure of each amino acid residue, using different levels of exposed surface to define exposure. First, we introduce the baseline classifier, which achieves good prediction results despite taking into account only the amino acid type. Then we explain how we gathered and processed the data and built our classifier to improve on the baseline prediction. Finally, we test and compare several classifiers (e.g. Neural Networks, C5.0, CART and CHAID) and parameters (level of information per amino acid, SCOP class of protein, sliding window around the current amino acid) that might influence the prediction accuracy. We conclude by showing that our models present a modest but statistically significant improvement over the baseline classifier's accuracy.
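The ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the baseline predicts, for each amino acid type, the exposure class seen most often for that type in the training data, and that a sliding window is the run of residues centred on the current one. The function names, the 0.25 threshold, and the toy sequences and labels are all made up for illustration and do not come from the paper.

```python
from collections import Counter, defaultdict


def exposure_label(relative_area, threshold=0.25):
    """Label a residue from its relative exposed surface.

    The threshold is a hypothetical value; the paper uses several levels.
    """
    return "exposed" if relative_area >= threshold else "buried"


def fit_baseline(residues, labels):
    """Map each amino acid type to its most frequent exposure label."""
    counts = defaultdict(Counter)
    for amino_acid, label in zip(residues, labels):
        counts[amino_acid][label] += 1
    return {aa: c.most_common(1)[0][0] for aa, c in counts.items()}


def predict_baseline(model, residues, default="buried"):
    """Predict a label per residue; unseen residue types get the default."""
    return [model.get(aa, default) for aa in residues]


def sliding_windows(sequence, half_width, pad="-"):
    """Return, for each position, the residues within half_width of it."""
    padded = pad * half_width + sequence + pad * half_width
    size = 2 * half_width + 1
    return [padded[i:i + size] for i in range(len(sequence))]


# Toy, made-up training data: one-letter residue codes and exposure labels.
train_seq = "LKLKAAGL"
train_lab = ["buried", "exposed", "buried", "exposed",
             "buried", "buried", "exposed", "exposed"]

model = fit_baseline(train_seq, train_lab)
predictions = predict_baseline(model, "KLAG")
windows = sliding_windows("MKLA", half_width=1)
```

A classifier meant to improve on the baseline would take each window (rather than the single residue) as its input features, which is one of the parameters the abstract says was varied.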
Type: In proceedings
Authors: José Carlos Almeida Santos, Pedro Barahona, Ludwig Krippahl
Editors: José Neves, Manuel Filipe Santos, José Machado
Book title: New Trends in Artificial Intelligence, Proceedings of EPIA'07, 13th Portuguese Conference on Artificial Intelligence
Publisher: Universidade do Minho
Pages: 527-540
ISBN: 9789899561809
Export formats
José Carlos Almeida Santos and Pedro Barahona and Ludwig Krippahl, Mining Protein Structure Data, in: José Neves and Manuel Filipe Santos and José Machado (eds), New Trends in Artificial Intelligence, Proceedings of EPIA'07, 13th Portuguese Conference on Artificial Intelligence, Universidade do Minho, ISBN 9789899561809, pp. 527-540, December 2007.
<a href="/people/members/view.php?code=f25a84f61f1d34c505d65cacd90eef64" class="author">José Carlos Almeida Santos</a>, <a href="/people/members/view.php?code=7e27bc13fad97e99cd21ea6914d55659" class="author">Pedro Barahona</a> and <a href="/people/members/view.php?code=195d68ea5904b58472fd8c8aedcae233" class="author">Ludwig Krippahl</a>, <b>Mining Protein Structure Data</b>, in: José Neves, Manuel Filipe Santos and José Machado (eds), <u>New Trends in Artificial Intelligence, Proceedings of EPIA'07, 13th Portuguese Conference on Artificial Intelligence</u>, Universidade do Minho, ISBN 9789899561809, pp. 527-540, December 2007.
@inproceedings{SaBK07,
  author    = {Jos{\'e} Carlos Almeida Santos and Pedro Barahona and Ludwig Krippahl},
  editor    = {Jos{\'e} Neves and Manuel Filipe Santos and Jos{\'e} Machado},
  title     = {Mining Protein Structure Data},
  booktitle = {New Trends in Artificial Intelligence, Proceedings of EPIA'07, 13th Portuguese Conference on Artificial Intelligence},
  publisher = {Universidade do Minho},
  pages     = {527--540},
  isbn      = {9789899561809},
  abstract  = {This paper describes the application of machine learning algorithms to the discovery of knowledge in a protein structure database. The problem addressed is the determination of the solvent exposure of each amino acid residue, using different levels of exposed surface to define exposure. First, we introduce the baseline classifier, which achieves good prediction results despite taking into account only the amino acid type. Then we explain how we gathered and processed the data and built our classifier to improve on the baseline prediction. Finally, we test and compare several classifiers (e.g. Neural Networks, C5.0, CART and CHAID) and parameters (level of information per amino acid, SCOP class of protein, sliding window around the current amino acid) that might influence the prediction accuracy. We conclude by showing that our models present a modest but statistically significant improvement over the baseline classifier's accuracy.},
  month     = {December},
  year      = {2007}
}
Publication URLs
/publications/view.php?code=d999b1d65240c68f70d3cf33caa42793
/publications/view.php?code=SaBK07

Centre for Artificial Intelligence of UNL
Departamento de Informática, FCT/UNL
Quinta da Torre 2829-516 CAPARICA - Portugal
Tel. (+351) 21 294 8536 FAX (+351) 21 294 8541
