• Media type: Doctoral Thesis; Electronic Thesis; E-Book
  • Title: Learning Representations for Generative Modeling of Human Dynamics
  • Contributor: Aksan, Emre [Author]; ORCID: 0000-0002-9836-9011
  • Published: ETH Zurich, 2022
  • Language: English
  • DOI: https://doi.org/10.3929/ethz-b-000578473; Handle: 20.500.11850/578473
  • Keywords: Data processing; 3D motion analysis; Neural networks; 3D human reconstruction; Generative models; Temporal modeling; Computer science
  • Footnote: This data source also contains holdings records that do not lead to a full text.
  • Description: Humans possess a comprehensive set of interaction capabilities at various levels of abstraction, including physical activities, verbal and non-verbal cues, and abstract communication skills, which allow them to interact with the physical world, express themselves, and communicate with others. In the quest to digitize humans, we must answer two questions: how to represent humans, and how to establish human-like interactions on digital media. A critical issue is that human activities exhibit complex, rich dynamic behavior that is non-linear, time-varying, and context-dependent, properties that are typically infeasible to define rigorously. In this thesis, we are primarily interested in modeling complex processes such as how humans look, move, and communicate, and in generating novel samples similar to those produced by humans. To do so, we propose the deep generative modeling framework, which can learn the underlying data-generation process directly from observations. Over the course of this thesis, we showcase generative modeling strategies at various levels of abstraction and demonstrate how they can be used to model humans and synthesize plausible, realistic interactions. Specifically, we present three problems that differ in modality and complexity yet are related in their modeling strategies. First, we introduce the task of modeling free-form human actions such as drawings and handwritten text; here, our work focuses on personalization and generalization by learning latent representations of writing style or drawing content. Second, we present the 3D human motion modeling task, where we aim to learn spatio-temporal representations that capture motion dynamics for both accurate short-term and plausible long-term motion predictions. Finally, we focus on learning an expressive representation space for the synthesis and animation of photo-realistic face avatars. Our proposed model is able to create a personalized 3D avatar from rich training data and animate it via ...
  • Access State: Open Access
  • Rights information: In Copyright - Non-commercial Use Permitted