• Media type: Other publication; Dissertation; Electronic thesis; E-book
  • Title: Column subset selection with applications to neuroimaging data
  • Contributors: Strauch, Martin [Author]
  • Published: KOPS - The Institutional Repository of the University of Konstanz, 2014
  • Language: English
  • Keywords: column-based matrix factorisation ; neuroimaging data ; column subset selection
  • Origin:
  • Notes: This data source also contains holdings records that do not lead to a full text.
  • Description: Column (subset) selection problems require selecting a subset of the columns of a matrix in an unsupervised fashion such that a matrix norm error criterion is minimised. Motivations for column selection are 1) data interpretation through identifying relevant columns, and 2) speedup by performing computationally expensive operations on a small column subset. This work introduces structural and algorithmic improvements regarding both aspects and demonstrates applications to neuroimaging data. Column selection for data interpretation: NNCX and Convex_cone. For CX factorisation (Drineas et al., SIAM J. Matrix Analysis and Applications, 2008), a $c$-subset of the columns of the matrix $A$ ($m \times n$) is selected into the $m \times c$ matrix $C$ and combined linearly with coefficients in $X$ ($c \times n$), such that the CX norm error $\left\| A - CX \right\|^2_{Fr}$, measured in the Frobenius norm, is minimised. For non-negative CX (NNCX), the coefficients in $X$ are constrained to be non-negative, which has advantages with respect to data interpretation (Hyvönen et al., ACM SIGKDD, 2008). The goal is to find good column selection strategies for NNCX and to analyse the interpretability aspect of CX/NNCX column selection. To this end, a generative model for NNCX is introduced in which each column of $A$ is either one of $s$ generating pure signal columns or a linear combination (with non-negative coefficients) of several pure signal columns. An algorithm, Convex_cone, is proposed as a heuristic for selecting the extreme columns of $A$. These extreme columns correspond to the generating columns, and they span a convex cone that contains the data points of $A$. The extreme columns are interpretable in the sense that they make it possible to understand how $A$ has been constructed, and they also serve to reduce the NNCX norm error $\left\| A - CX^{0+} \right\|^2_{Fr}$ (non-negativity indicated by $^{0+}$). 
Empirical evaluation is performed against state-of-the-art algorithms for column selection. With respect to recovering ...
  • Access status: Open access
  • Rights and usage notes: Protected by copyright
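
The CX and NNCX norm errors defined in the description can be illustrated with a small numerical sketch. It is not the Convex_cone algorithm from the thesis: it only evaluates both errors for a column subset chosen by hand, on a toy matrix built in the spirit of the generative model (pure signal columns plus non-negative mixtures). The function names `cx_error` and `nncx_error`, the toy dimensions, and the use of projected gradient descent as the non-negative solver are all assumptions made for illustration.

```python
import numpy as np

def cx_error(A, cols):
    """Squared Frobenius error ||A - CX||^2 with an unconstrained least-squares X."""
    C = A[:, cols]
    X, *_ = np.linalg.lstsq(C, A, rcond=None)
    return float(np.linalg.norm(A - C @ X, "fro") ** 2)

def nncx_error(A, cols, n_iter=5000):
    """Squared Frobenius error ||A - CX||^2 with X >= 0.

    Uses plain projected gradient descent as a simple stand-in solver
    (an assumption of this sketch, not the method used in the thesis).
    """
    C = A[:, cols]
    G = C.T @ C                        # Gram matrix of the selected columns
    B = C.T @ A
    step = 1.0 / np.linalg.norm(G, 2)  # 1 / Lipschitz constant of the gradient
    X = np.zeros((C.shape[1], A.shape[1]))
    for _ in range(n_iter):
        # Gradient step on 0.5 * ||A - CX||^2, then projection onto X >= 0
        X = np.maximum(0.0, X - step * (G @ X - B))
    return float(np.linalg.norm(A - C @ X, "fro") ** 2)

# Toy data: s pure signal columns followed by non-negative mixtures of them.
rng = np.random.default_rng(0)
m, s, n_mixed = 20, 3, 10
S = rng.random((m, s))         # pure signal columns
W = rng.random((s, n_mixed))   # non-negative mixing coefficients
A = np.hstack([S, S @ W])      # columns 0..2 are pure, the rest are mixtures

pure = [0, 1, 2]    # the generating (extreme) columns
mixed = [3, 4, 5]   # three mixture columns

print("pure  columns: CX", cx_error(A, pure), "NNCX", nncx_error(A, pure))
print("mixed columns: CX", cx_error(A, mixed), "NNCX", nncx_error(A, mixed))
```

In this toy setting both subsets span the same three-dimensional column space, so the unconstrained CX error is close to zero for either choice. Under the non-negativity constraint only the pure columns reproduce $A$, because the pure columns lie outside the convex cone spanned by their mixtures. This is one way to see why selecting extreme columns matters for NNCX in particular.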