• Media type: Doctoral Thesis; Electronic Thesis; E-Book
  • Title: Analyzing the user's state in HCI: from crisp emotions to conversational dispositions
  • Contributor: Scherer, Stefan [Author]
  • Published: Universität Ulm, 2016-03-15T06:23:00Z
  • Language: English
  • DOI: https://doi.org/10.18725/OPARU-1766
  • Keywords: Human-computer interaction ; Affective Computing ; Man-machine systems ; Pattern recognition ; DDC 004 / Data processing & computer science ; Social signal processing ; Machine learning
  • Origination:
  • Footnote: This data source also contains holdings records that do not lead to a full text.
  • Description: Human-computer interaction (HCI) usually takes place on a crude question-answer level. Human-human interaction, however, is multifaceted, consisting of manifold interactive feedback loops between interlocutors and comprising social components, moods, feelings, personal goals, nonverbal and paralinguistic conversation channels, and the like. To establish functioning communication between the interlocutors, it is crucial to interpret these communication elements correctly and efficiently. This thesis seeks to contribute to the transfer of these purely human capabilities to the HCI domain. To recognize the user's latent state in natural HCI, it is crucial to acquire multimodal data comprising typical HCI situations. The PIT corpus recordings, which are part of this thesis, led to the development and analysis of a novel hierarchical annotation scheme that enables the annotator to assign categories in layers with different time scales. Apart from directly observable behavior, the categories comprise subjective user state labels specifically tailored for HCI. The analysis of this annotation paradigm revealed several statistically significant correlations between the low-level observations and the subjective user state annotations. This finding in turn supports a bottom-up approach that combines basic building blocks to infer the user's state. Following this strategy, several of the lower-level categories, such as laughter and voice quality, were analyzed in automatic classification and detection experiments. Besides standard approaches for multimodal sequence analysis, newly developed classifiers incorporating uncertain information were investigated. These classifiers process features identified as relevant in preceding basic experiments, as well as features developed by the author in the course of this thesis.