• Media type: E-Article
  • Title: An approach for outlier and novelty detection for text data based on classifier confidence
  • Contributor: Pižurica, Nikola; Tomović, Savo
  • Imprint: IOS Press, 2020
  • Published in: AI Communications
  • Language: Not determined
  • DOI: 10.3233/aic-200649
  • ISSN: 1875-8452; 0921-7126
  • Keywords: Artificial Intelligence
  • Description: In this paper we present an approach for novelty detection in text data. The approach can also be viewed as semi-supervised anomaly detection, because it operates on a training dataset that contains labelled instances of the known classes only. A classification model is learned during the training phase, under the assumption that the training dataset contains at least two known classes. In the testing phase, instances are classified as normal or anomalous based on the classifier's confidence: if the classifier cannot assign any of the known class labels to a given instance with sufficiently high confidence (probability), the instance is declared a novelty (anomaly). We propose two procedures to objectively measure the classifier confidence. Experimental results show that the proposed approach is comparable to methods known in the literature.
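
The confidence-based decision rule described in the abstract can be illustrated with a minimal sketch. The abstract does not say which classifier is used or how the two proposed confidence procedures work, so everything here is an assumption made for illustration: a small multinomial Naive Bayes model, a fixed threshold of 0.8 on the highest class posterior, and a toy corpus. The names `train_nb`, `posteriors`, and `predict_or_novelty` are hypothetical and do not come from the paper.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Fit a multinomial Naive Bayes model on whitespace-tokenised text."""
    word_counts = defaultdict(Counter)  # per-class token counts
    class_counts = Counter(labels)      # documents per class
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, class_counts, vocab


def posteriors(model, text):
    """Return normalised class probabilities for one document."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for tok in text.lower().split():
            if tok in vocab:  # assumption: out-of-vocabulary tokens are skipped
                # add-one (Laplace) smoothing
                score += math.log(
                    (word_counts[label][tok] + 1) / (total_words + len(vocab))
                )
        log_scores[label] = score
    # Normalise in log space to avoid underflow.
    top = max(log_scores.values())
    exp_scores = {k: math.exp(v - top) for k, v in log_scores.items()}
    norm = sum(exp_scores.values())
    return {k: v / norm for k, v in exp_scores.items()}


def predict_or_novelty(model, text, threshold=0.8):
    """Return (label, confidence); label is "novelty" below the threshold.

    The fixed threshold is an illustrative assumption, not one of the
    two confidence procedures proposed in the paper.
    """
    probs = posteriors(model, text)
    label, conf = max(probs.items(), key=lambda kv: kv[1])
    return (label if conf >= threshold else "novelty"), conf


# Toy training set with two known classes (invented data).
docs = [
    "goal match team score",
    "team player goal win",
    "stock market price trade",
    "market bank price invest",
]
labels = ["sport", "sport", "finance", "finance"]
model = train_nb(docs, labels)

known = predict_or_novelty(model, "team goal win")
unknown = predict_or_novelty(model, "quantum physics experiment")
```

In this sketch a document that shares no vocabulary with the training data falls back to the class priors, so its highest posterior stays below the threshold and it is flagged as a novelty. A practical system would need a calibrated confidence measure and a threshold chosen on held-out data.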