• Media type: E-Book
  • Title: Developing a workflow management system for fragment-based virtual screening
  • Contributor: Bray, Simon A. [Author]; Backofen, Rolf [Academic supervisor]; Backofen, Rolf [Reviewer]; Wolf, Steffen [Reviewer]
  • Corporation: Albert-Ludwigs-Universität Freiburg, Fakultät für Angewandte Wissenschaften
  • Imprint: Freiburg: Universität, 2023
  • Extent: Online-Ressource
  • Language: English
  • DOI: 10.6094/UNIFR/240206
  • Keywords: Proteins ; Computational chemistry ; Computer-aided method ; Computer simulation ; Bioinformatics ; Biophysics ; (local)doctoralThesis
  • University thesis: Dissertation, Universität Freiburg, 2023
  • Description: Abstract: Drug development is a long, complex and expensive process. In particular, the first step of obtaining an initial list of drug candidates is challenging. Experimental screening, for example using protein-ligand binding assays, is fundamentally limited, and as a result, the concept of virtual screening comes into play. Virtual screening involves the use of in silico experiments such as statistical analyses, protein-ligand docking, and free energy calculations based on molecular dynamics (MD) simulation, in order to predict whether a particular compound is likely to bind to a particular target protein. Often, an initial list of candidates is generated by a fragment-based approach, where a fragment is a small organic compound that can serve as a substructure for a putative drug candidate. Fragments can be found in either an experimental or theoretical manner, and can then be combined, or amended by the addition of other functional groups, in order to produce a list of candidate molecules.

    There is then a need to determine the likelihood that these candidates bind to the target protein. Several computer-based methods can be of service in this task; these methods are not mutually exclusive, but are typically used sequentially as well as in parallel. However, they require different amounts and types of computational resources, and careful planning is therefore required to manage resources, organise the software tools as complete workflows, and then deploy them. To organise and perform the analysis, the scientist can use a workflow management system. Such systems allow multiple tools to be concatenated into a single pipeline, which can then be executed via the command line or a graphical interface. This is more convenient than the tedious execution of individual tools one after the other and helps avoid manual errors. For highly complex analyses that require several different software tools with stepwise repetition, such as MD simulations for hundreds of ligands against a single target protein, the use of a workflow management system is the only viable option. Another challenge in virtual screening is reproducibility. In a reproducible scientific work, other scientists must be able to critically evaluate the work by performing the same experiments or simulations themselves and thus verifying the results. The issue of reproducibility has received much attention recently, including in the field of computational chemistry and virtual screening. The use of a workflow management system helps to increase the reproducibility of a study, because the details of all tools run, including parameters and software versions, are recorded, making the analyses repeatable for other scientists who want to verify the work.

    The focus of this work was to develop a platform for fragment-based virtual screening based on the Galaxy workflow management system. This platform can be used either through a graphical web-based interface or through the command line; the latter is a useful alternative for complex simulations or analyses that may require additional scripting. In order to make the use of the command line easier, significant contributions were made to Planemo and BioBlend, two Python libraries that allow direct access to Galaxy via the application programming interface (API). In order to demonstrate the utility of the platform, two projects were carried out using the developed tools and workflows.

    First, a study was performed on the T4 lysozyme mutant L99A (T4L-L99A) in complex with benzene using the dcTMD technique as a model system for fragment-protein binding. T4L-L99A is a commonly used model system for free energy calculations, and is especially useful as a model for fragment binding, for two reasons: the small size of the pocket and the benzene ligand is typical of the compounds and pockets generally used in fragment-based screening studies, and benzene binds rather weakly. Like many MD methods, dcTMD requires the execution of a large number of steps in sequence, and requires the creation of an ensemble of simulations, both features that benefit from the use of a workflow management system. The analysis was able to uncover multiple unbinding pathways, an essential feature of the dcTMD method, and to characterise the thermodynamics and kinetics of several of these. The final results were comparable to experimental benchmarks.

    Second, a virtual screening was performed with the aim of identifying effective inhibitors of the major protease of the SARS-CoV virus; 53,000 compounds were generated based on 22 non-covalent crystallographic fragments, and their binding ability was analysed sequentially by protein-ligand docking, MMGBSA calculations and dcTMD simulations. Several million docking poses were generated, and scored by experimental validation against the crystallographic fragment structures. Over 200 compounds were then assessed by MMGBSA, followed by further filtering and execution of a dcTMD workflow for 50 compounds. One fragment, which enforces a conformational change on the protein binding site, was found to confer particularly strong binding ability on derived compounds, and it was shown that particular interactions correlated especially strongly with both MMGBSA and dcTMD scores.
  • Access State: Open Access