• Media type: Dissertation; E-book; Electronic thesis
  • Title: Mixture models, theory with application to species richness, clustering, classification and all that jazz
  • Contributors: Kulagina, Yulia [Author]
  • Published: ETH Zurich, 2022
  • Language: English
  • DOI: https://doi.org/20.500.11850/581661; https://doi.org/10.3929/ethz-b-000581661
  • Keywords: Mathematics
  • Origin:
  • Notes: This data source also contains holdings records that do not lead to a full text.
  • Description: Mixture models occur in numerous settings, including random and fixed effects models, clustering, deconvolution, empirical Bayes problems and many others. They are often used to model data originating from a heterogeneous population consisting of several homogeneous subpopulations, so the problem of finding a good estimator for the number of components in the mixture arises naturally. Estimating the order of a finite mixture model is a hard statistical task, and multiple techniques have been suggested for solving it. In this thesis we concentrate on several such methods that have not gained much popularity but are nonetheless interesting from the theoretical viewpoint and deserve the attention of practitioners. These methods fall into three groups: tools built upon the determinant of the Hankel matrix of moments of the mixing distribution, minimum distance estimators, and likelihood ratio tests. A valuable feature of all of these approaches is that they come with theoretical guarantees of consistency. We address the theoretical pillars underlying each of the statistical techniques and present the results of a comparative numerical study conducted under various scenarios. In addition to the above-mentioned methods, we also include the results of a neural-network-based approach. According to the results, none of the methods proves to be a "magic pill". The results uncover limitations of the techniques and provide practical hints for choosing the best-suited tool under specific conditions. After discussing the relevant theory and analysing some simulation results, we introduce software that allows for convenient and flexible implementation of the discussed methods for simulated data as well as for real datasets, whenever the data at hand is univariate. We also demonstrate the performance of the studied techniques on real-world data.
We further discuss the feasibility of extensions of some of these methods to the multidimensional setting and present some ...
  • Access status: Open access