Publication details
Main information
Title:
Mining Protein Structure Data
Publication date:
December 2007
Citation:
SaBK07
Abstract:
This paper describes the application of machine learning algorithms to the discovery of knowledge in a protein structure database. The problem addressed is the determination of the solvent exposure of each amino acid residue, using different levels of exposed surface to define exposure. First, we introduce the baseline classifier, which achieves good prediction results despite taking into account only the amino acid type. Then we explain how we gathered and processed the data and built our classifier to improve on the baseline prediction. Finally, we test and compare several classifiers (e.g. Neural Networks, C5.0, CART and CHAID) and parameters (level of information per amino acid, SCOP class of protein, sliding window around the current amino acid) that might influence the prediction accuracy. We conclude by showing that our models achieve a modest but statistically significant improvement over the baseline classifier’s accuracy.
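The two ideas named in the abstract, a baseline that looks only at the amino acid type and a sliding window around the current residue, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the "buried"/"exposed" labels, the padding character, and the toy data are all assumptions made for illustration, and the baseline is assumed here to be a majority-class vote per amino acid type.

```python
from collections import Counter, defaultdict


def fit_baseline(residues, labels):
    """Map each amino acid type to its most frequent exposure label.

    Assumption: the baseline is a per-residue-type majority vote.
    """
    counts = defaultdict(Counter)
    for aa, label in zip(residues, labels):
        counts[aa][label] += 1
    return {aa: c.most_common(1)[0][0] for aa, c in counts.items()}


def predict_baseline(model, residues, default="buried"):
    """Predict a label per residue; unseen types fall back to `default`."""
    return [model.get(aa, default) for aa in residues]


def sliding_windows(sequence, radius, pad="-"):
    """Return, for each position, the residue plus `radius` neighbours on
    each side. Positions past the chain ends are filled with `pad`."""
    padded = pad * radius + sequence + pad * radius
    return [padded[i:i + 2 * radius + 1] for i in range(len(sequence))]


# Toy, made-up training data (not taken from the paper).
train_residues = list("LKLKAL")
train_labels = ["buried", "exposed", "buried", "exposed", "buried", "exposed"]

model = fit_baseline(train_residues, train_labels)
predictions = predict_baseline(model, list("KLAX"))
windows = sliding_windows("ACD", radius=1)
```

In this sketch the windows would serve as input features for a richer classifier, while the baseline ignores the neighbours entirely.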
In proceedings
Authors:
José Carlos Almeida Santos, Pedro Barahona, Ludwig Krippahl
Editors:
José Neves, Manuel Filipe Santos, José Machado
Book title:
New Trends in Artificial Intelligence, Proceedings of EPIA'07, 13th Portuguese Conference on Artificial Intelligence
Publisher:
Universidade do Minho
Pages:
527-540
ISBN:
9789899561809
Export formats
Plain text:
José Carlos Almeida Santos, Pedro Barahona and Ludwig Krippahl, Mining Protein Structure Data, in: José Neves, Manuel Filipe Santos and José Machado (eds), New Trends in Artificial Intelligence, Proceedings of EPIA'07, 13th Portuguese Conference on Artificial Intelligence, Universidade do Minho, ISBN 9789899561809, pp. 527-540, December 2007.
HTML:
<a href="/people/members/view.php?code=f25a84f61f1d34c505d65cacd90eef64" class="author">José Carlos Almeida Santos</a>, <a href="/people/members/view.php?code=7e27bc13fad97e99cd21ea6914d55659" class="author">Pedro Barahona</a> and <a href="/people/members/view.php?code=195d68ea5904b58472fd8c8aedcae233" class="author">Ludwig Krippahl</a>, <b>Mining Protein Structure Data</b>, in: José Neves, Manuel Filipe Santos and José Machado (eds), <u>New Trends in Artificial Intelligence, Proceedings of EPIA'07, 13th Portuguese Conference on Artificial Intelligence</u>, Universidade do Minho, ISBN 9789899561809, pp. 527-540, December 2007.
BibTeX:
@inproceedings{SaBK07,
  author = {Jos{\'e} Carlos Almeida Santos and Pedro Barahona and Ludwig Krippahl},
  editor = {Jos{\'e} Neves and Manuel Filipe Santos and Jos{\'e} Machado},
  title = {Mining Protein Structure Data},
  booktitle = {New Trends in Artificial Intelligence, Proceedings of EPIA'07, 13th Portuguese Conference on Artificial Intelligence},
  publisher = {Universidade do Minho},
  pages = {527--540},
  isbn = {9789899561809},
  abstract = {This paper describes the application of machine learning algorithms to the discovery of knowledge in a protein structure database. The problem addressed is the determination of the solvent exposure of each amino acid residue, using different levels of exposed surface to define exposure. First, we introduce the baseline classifier, which achieves good prediction results despite taking into account only the amino acid type. Then we explain how we gathered and processed the data and built our classifier to improve on the baseline prediction. Finally, we test and compare several classifiers (e.g. Neural Networks, C5.0, CART and CHAID) and parameters (level of information per amino acid, SCOP class of protein, sliding window around the current amino acid) that might influence the prediction accuracy. We conclude by showing that our models achieve a modest but statistically significant improvement over the baseline classifier’s accuracy.},
  month = {December},
  year = {2007}
}
Publication's urls
Full url:
/publications/view.php?code=d999b1d65240c68f70d3cf33caa42793
Friendly url:
/publications/view.php?code=SaBK07
Departamento de Informática, FCT/UNL
Quinta da Torre 2829-516 CAPARICA - Portugal
Tel. (+351) 21 294 8536 FAX (+351) 21 294 8541