• Media type: Electronic Thesis; E-Book; Doctoral Thesis
  • Title: Deep Learning-based 3D Hand Pose and Shape Estimation from a Single Depth Image: Methods, Datasets and Application
  • Contributor: Malik, Muhammad Jameel Nawaz [Author]
  • Imprint: KLUEDO - Publication Server of University of Kaiserslautern-Landau (RPTU), 2020
  • Language: English
  • Keywords: hand shape ; convolutional neural networks ; hand pose ; depth image
  • Origination:
  • Footnote: This data source also contains holdings records that do not lead to a full text.
  • Description: 3D hand pose and shape estimation from a single depth image is a challenging computer vision and graphics problem with many applications, such as human-computer interaction and animation of a personalized hand shape in augmented reality (AR). The problem is challenging due to several factors, for instance high degrees of freedom, viewpoint variations and varying hand shapes. Hybrid approaches based on deep learning followed by model fitting preserve the structure of the hand. However, a pre-calibrated hand model limits the generalization of these approaches. To address this limitation, we propose a novel hybrid algorithm for simultaneous estimation of the 3D hand pose and the bone lengths of a hand model, which allows training on datasets that contain varying hand shapes. Direct joint regression methods, on the other hand, achieve high accuracy but do not incorporate the structure of the hand in the learning process. Therefore, we introduce a novel structure-aware algorithm which learns to estimate the 3D hand pose jointly with new structural constraints. These constraints include finger lengths, distances of joints along the kinematic chain and inter-finger distances. Learning these constraints helps to maintain a structural relation between the estimated joint keypoints. Previous methods addressed only the problem of 3D hand pose estimation. We open a new research topic and propose the first deep network which jointly estimates 3D hand shape and pose from a single depth image. Manually annotating real data for shape is laborious and sub-optimal. Hence, we created a million-scale synthetic dataset with accurate joint annotations and mesh files of depth maps. However, the performance of this deep network is restricted by the limited representation capacity of the hand model. Therefore, we propose a novel regression-based approach in which the dense 3D hand mesh is recovered from the sparse 3D hand pose, and weak supervision is provided by a depth image synthesizer.
The above-mentioned approaches regressed 3D hand meshes from 2D depth ...
  • Access State: Open Access
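
The structural constraints named in the description (finger lengths, distances of joints along the kinematic chain, and inter-finger distances) can be illustrated with a minimal sketch. It is not the thesis implementation. The 21-joint layout, the function names, the toy pose, and the choice of measuring inter-finger distance between the tips of neighbouring fingers are all assumptions made for illustration only.

```python
import math

# Hypothetical 21-joint layout (an assumption, not taken from the thesis):
# joint 0 is the wrist, followed by 4 joints per finger, ordered base to tip.
WRIST = 0
FINGERS = {
    "thumb": [1, 2, 3, 4],
    "index": [5, 6, 7, 8],
    "middle": [9, 10, 11, 12],
    "ring": [13, 14, 15, 16],
    "little": [17, 18, 19, 20],
}


def chain_distances(joints):
    """Distances between consecutive joints along each wrist-to-tip chain."""
    result = {}
    for name, indices in FINGERS.items():
        chain = [WRIST] + indices
        result[name] = [
            math.dist(joints[a], joints[b]) for a, b in zip(chain, chain[1:])
        ]
    return result


def finger_lengths(joints):
    """Finger length as the sum of bone lengths from the finger base to its tip."""
    # The first entry of each chain is the wrist-to-base segment, so skip it.
    return {name: sum(d[1:]) for name, d in chain_distances(joints).items()}


def inter_finger_distances(joints):
    """Tip-to-tip distances between neighbouring fingers (assumed definition)."""
    names = list(FINGERS)
    return {
        (a, b): math.dist(joints[FINGERS[a][-1]], joints[FINGERS[b][-1]])
        for a, b in zip(names, names[1:])
    }


def toy_pose():
    """A flat, made-up hand: fingers 20 units apart, bones 10 units long."""
    joints = [(0.0, 0.0, 0.0)]  # wrist at the origin
    for f in range(5):
        for j in range(4):
            joints.append((20.0 * f, 30.0 + 10.0 * j, 0.0))
    return joints


if __name__ == "__main__":
    pose = toy_pose()
    print(finger_lengths(pose))
    print(inter_finger_distances(pose))
```

In a structure-aware training setup, quantities like these could be computed for both predicted and ground-truth joints and compared in an auxiliary loss term; how the thesis actually formulates that loss is not stated in this record.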