• Media type: Doctoral Thesis; E-Book; Electronic Thesis
  • Title: A combinatorial approach to RNA sequence-structure alignments ; Ein kombinatorischer Ansatz für RNA Sequenz-Struktur-Alignments
  • Contributor: Bauer, Markus [Author]
  • imprint: Freie Universität Berlin: Refubium (FU Berlin), 2008
  • Extent: 117 S.
  • Language: English
  • DOI: https://doi.org/10.17169/refubium-9019
  • Origination:
  • Footnote: Diese Datenquelle enthält auch Bestandsnachweise, die nicht zu einem Volltext führen.
  • Description: Until a couple of years ago the scientific mainstream held that genetic information, stored as DNA strands, is transcribed to RNA, and RNA sequences are in turn translated to proteins, the actual functional units in the cell. RNA was generally believed to be a helper molecule in the cell until the beginning of the new millennium. This view changed. We see the potential of RNA as one of the key cellular players. In this thesis we present a novel framework for computing sequence-structure alignments of RNA sequences. Our contribution is twofold: first, we give a graph-theoretic model for the computation of multiple sequence-structure alignments. We phrase the model as an integer linear program (ILP) and show how we can relax the ILP such that we are able to compute optimal or near-optimal solutions for the original problem. In a subsequent step, we augment the initial model with stacking energies. Stacking base pairs greatly contribute to the energetic stability of the overall structure and should therefore be additionally rewarded. We extend the original ILP such that stacking energies are incorporated. Second, we give extensive computational results on real data from the RFAM database. We compare the performance of truly multiple sum-of-pairs sequence-structure alignments to heuristic sequence-structure alignments. We show that the objective function value of the sum-of-pairs model is generally higher compared to the heuristically inferred alignments. At the same time, we sketch the computational limits for the sum-of-pairs multiple sequence-structure model. The computational costs for computing exact multiple sequence-structure alignments are generally very high. To validate our approach on a larger test set, we run two implementations that take two sequences as their input. LaRA and SLaRA---based on the initial and the stack model---compute all pairwise sequence-structure alignments and use the external program TCOFFEE to infer a consistency-based multiple sequence-structure alignment. Additionally, we run the ...
  • Access State: Open Access