Description:
Based on a given time series, the data-driven Langevin equation (dLE) estimates the drift and the diffusion field of the dynamics, which are then employed to reproduce the essential statistical and dynamical features of the original time series. Because the propagation of the dLE requires only local information, the input data need neither be Boltzmann weighted nor form a continuous trajectory. Similar to a Markov state model, the dLE approach therefore holds the promise of predicting the long-time dynamics of a biomolecular system from relatively short trajectories, which can be run in parallel. The practical applicability of the approach is shown to be limited mainly by the initial sampling of the system’s conformational space obtained from the short trajectories. Using extensive molecular dynamics simulations of the unfolding and refolding of a short peptide helix, it is shown that the dLE approach is able to describe microsecond conformational dynamics from a few hundred nanosecond trajectories. In particular, the dLE quantitatively reproduces the free energy landscape and the associated conformational dynamics along the chosen five-dimensional reaction coordinate.
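
The idea of estimating local drift and diffusion fields from short trajectories and then propagating a Langevin equation can be illustrated with a minimal sketch. The code below is not the estimator of the work described above: it assumes a one-dimensional coordinate, overdamped (first-order) dynamics, and simple binned averages of the increments, and it uses synthetic Ornstein-Uhlenbeck data in place of molecular dynamics output. All function names (`estimate_fields`, `propagate`) and parameters (`n_bins`, `min_count`) are hypothetical.

```python
import numpy as np

def estimate_fields(trajs, dt, n_bins=20, min_count=200):
    """Binned estimates of drift f(x) and diffusion D(x).

    Only pairs (x_n, x_{n+1} - x_n) taken within each trajectory are used,
    so the input may be many short, unconnected trajectories.
    """
    x0 = np.concatenate([t[:-1] for t in trajs])
    dx = np.concatenate([np.diff(t) for t in trajs])
    edges = np.linspace(x0.min(), x0.max(), n_bins + 1)
    idx = np.clip(np.digitize(x0, edges) - 1, 0, n_bins - 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    keep, drift, diff = [], [], []
    for b in range(n_bins):
        inc = dx[idx == b]
        if inc.size >= min_count:            # skip poorly sampled bins
            keep.append(centers[b])
            drift.append(inc.mean() / dt)            # first moment
            diff.append(inc.var() / (2.0 * dt))      # second moment
    return np.array(keep), np.array(drift), np.array(diff)

def propagate(x_start, n_steps, dt, centers, drift, diff, rng):
    """Euler-Maruyama propagation using the interpolated local fields."""
    x = np.empty(n_steps)
    x[0] = x_start
    for n in range(n_steps - 1):
        f = np.interp(x[n], centers, drift)  # clamped outside sampled range
        d = np.interp(x[n], centers, diff)
        x[n + 1] = x[n] + f * dt + np.sqrt(2.0 * d * dt) * rng.standard_normal()
    return x

# Synthetic input (assumption): 200 short Ornstein-Uhlenbeck trajectories
# with drift -k*x and constant diffusion D_true, started at scattered points.
rng = np.random.default_rng(0)
k, D_true, dt = 1.0, 0.5, 0.01
n_traj, n_len = 200, 1000
x = np.empty((n_traj, n_len))
x[:, 0] = rng.uniform(-2.0, 2.0, n_traj)
for n in range(n_len - 1):
    noise = rng.standard_normal(n_traj)
    x[:, n + 1] = x[:, n] - k * x[:, n] * dt + np.sqrt(2.0 * D_true * dt) * noise
trajs = list(x)

centers, drift, diff = estimate_fields(trajs, dt)
slope = np.polyfit(centers, drift, 1)[0]     # should be close to -k
model = propagate(0.0, 2000, dt, centers, drift, diff, rng)
```

Because the estimate at a point depends only on increments observed near that point, regions that the short trajectories never visit have no field estimate at all; here such bins are simply dropped and the fields are clamped at the edges, which mirrors the sampling limitation noted above.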