Sturm, Marc
[Author];
Quinten, Sascha
[Author];
Huber, Christian G.
[Author];
Kohlbacher, Oliver
[Author]
A machine learning approach for prediction of DNA and peptide HPLC retention times
Footnote:
This data source also contains holdings records that do not lead to a full text.
Description:
High-performance liquid chromatography (HPLC) has become one of the most efficient methods for separating biomolecules. It is an important tool for DNA purification after synthesis as well as for DNA quantification. In both cases, the separability of different oligonucleotides is essential. Predicting oligonucleotide retention times before the experiment can reveal superimposed oligonucleotides and thereby help to avoid futile experiments. In 2002, Gilar et al. proposed a simple mathematical model for predicting DNA retention times, which works reliably only at high temperatures (at least 70 °C). To cover a wider temperature range, we incorporated DNA secondary structure information in addition to base composition and length. We used support vector regression (SVR) for model generation and retention time prediction. A similar problem arises in shotgun proteomics, where HPLC coupled to a mass spectrometer (MS) is used to analyze complex peptide mixtures (thousands of peptides). Predicted peptide retention times can be used to validate tandem-MS peptide identifications made by search engines such as SEQUEST. Several methods, including multiple linear regression and artificial neural networks, have recently been proposed, but SVR has not been used so far.
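
The description names base composition and length as two of the inputs to the regression model. As a minimal sketch of how such inputs could be encoded, the Python snippet below turns an oligonucleotide sequence into a numeric feature vector. The function name `composition_features`, the feature ordering, and the example sequence are assumptions made for illustration and are not taken from the paper. Secondary structure information is deliberately left out, because the description does not say how it was computed.

```python
from collections import Counter

# Assumed alphabet and feature order; the paper does not specify an encoding.
BASES = "ACGT"

def composition_features(sequence):
    """Return [count_A, count_C, count_G, count_T, length] for a DNA sequence."""
    seq = sequence.upper()
    unknown = set(seq) - set(BASES)
    if unknown:
        raise ValueError("unexpected symbols: %s" % sorted(unknown))
    counts = Counter(seq)
    # One count per base, followed by the total sequence length.
    return [float(counts[base]) for base in BASES] + [float(len(seq))]

# Hypothetical example sequence, not data from the study.
print(composition_features("GATTACA"))  # prints [3.0, 1.0, 1.0, 2.0, 7.0]
```

Vectors of this kind, paired with measured retention times, could then be passed to any support vector regression implementation, for example `sklearn.svm.SVR` in scikit-learn. Whether the authors used that library is not stated.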