• Media type: E-Book; Thesis
  • Title: Invariant Features For Time-Series Classification
  • Contributor: Grabocka, Josif [Author]; Baydogan, Mustafa [Other]; Schmidt-Thieme, Lars [Degree supervisor]
  • Published: 2016
  • Extent: 1 online resource
  • Language: English
  • Keywords: Hochschulschrift (university thesis)
  • University thesis: Dissertation, Universität Hildesheim, 2016
  • Description: Time series are among the most widespread types of data, occurring in a myriad of application domains ranging from physiological sensors to astronomical light intensities. Time-series classification is one of the most prominent challenges in this area: given a recorded set of expert-labeled time series, the goal is to automatically predict the labels of future series without the need for an expert. The patterns in time series are often shifted in time, have different scales, contain arbitrarily repeating sub-patterns, and exhibit local distortions and noise. In other cases, the differences among classes are attributed to small local segments rather than to the global structure. For these reasons, the values corresponding to a particular time stamp carry different semantics on different time series; we refer to this phenomenon as intra-class variation. The lion's share of this thesis presents new methods that accurately classify time-series instances by handling such variations. The key to resolving the bottleneck of intra-class variation is to avoid using the raw time-series values as direct features. Instead, the approach of this thesis is to extract a set of features that, on the one hand, represent the variations in the data and, on the other hand, boost classification accuracy. In other words, the thesis proposes a list of methods that address diverse aspects of intra-class variation. The first proposed approach generates new training instances by transforming the support vectors of an SVM. The second approach decomposes time series through a segment-wise convolutional factorization: a set of patterns and weights is learned whose product approximates each sub-sequence of the time series. The main contribution of the thesis, however, is the third approach, called shapelet learning (see the first sketch below), which utilizes the training labels during the learning process, i.e. the process is supervised. Since the features are learned using the training labels, they tend to perform strongly at predicting the test labels. In addition, we present a fast alternative method for shapelet discovery. The strategy is to prune segment candidates in two steps (see the second sketch below): first, candidates are pruned based on their similarity to previously considered candidates; second, the remaining diverse candidates are selected only if the features they produce improve the classification results. The last two chapters of the thesis describe two methods that extract features from datasets with special characteristics. More concretely, we propose a classification method suited to series with missing values, as well as a method that extracts features from time series with repetitive patterns.
  • Access State: Open Access
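
The abstract above centers on shapelet-based features. The following is a minimal illustrative sketch, not the thesis implementation: it computes the generic minimum-distance shapelet feature that learned shapelets would feed into a classifier, assuming a plain NumPy setting; the function names and the toy sine/cosine data are invented for this example.

    import numpy as np

    def shapelet_feature(series, shapelet):
        """Minimum mean squared distance of `shapelet` to every subsequence of `series`."""
        n, l = len(series), len(shapelet)
        # slide the shapelet over the series and keep the best-matching offset
        dists = [np.mean((series[j:j + l] - shapelet) ** 2) for j in range(n - l + 1)]
        return min(dists)

    def transform(dataset, shapelets):
        """Represent each series by its vector of shapelet-distance features."""
        return np.array([[shapelet_feature(s, sh) for sh in shapelets] for s in dataset])

    # toy usage with made-up data: the first series contains the candidate exactly,
    # so its feature is 0; the second series only matches approximately
    series_a = np.sin(np.linspace(0, 6, 60))
    series_b = np.cos(np.linspace(0, 6, 60))
    candidate = series_a[10:20]
    print(transform([series_a, series_b], [candidate]))

Because the minimum is taken over all offsets, the resulting feature is invariant to where in the series the pattern occurs, which is exactly the kind of intra-class variation the abstract describes.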
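The fast shapelet-discovery method in the abstract prunes candidates in two steps. Below is a minimal sketch of that idea only: the similarity threshold, the logistic-regression surrogate classifier, and the cross-validation loop are assumptions made for this example, not details taken from the thesis. It reuses transform() from the sketch above.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    def select_candidates(candidates, train_series, train_labels, sim_threshold=0.05, cv=3):
        """Two-step selection: (1) drop candidates similar to ones already kept,
        (2) keep a diverse candidate only if its feature improves accuracy."""
        kept, best_score = [], 0.0
        for cand in candidates:
            # Step 1: similarity pruning against already-kept candidates
            similar = any(len(k) == len(cand) and
                          np.mean((np.asarray(k) - np.asarray(cand)) ** 2) < sim_threshold
                          for k in kept)
            if similar:
                continue
            # Step 2: accept only if the enlarged feature set helps classification
            feats = transform(train_series, kept + [cand])  # transform() from the sketch above
            score = cross_val_score(LogisticRegression(max_iter=1000),
                                    feats, train_labels, cv=cv).mean()
            if score > best_score:
                kept.append(cand)
                best_score = score
        return kept

The design intent mirrors the abstract: the similarity check keeps the candidate pool diverse and cheap to evaluate, while the accuracy check ensures that only candidates whose features actually improve classification are retained.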