• Media type: E-book; thesis
  • Title: Efficient duplicate detection and the impact of transitivity
  • Other titles: Translation of the main title: Effiziente Dublettenerkennung und der Einfluss von Transitivität
  • Contributors: Draisbach, Uwe [author]; Naumann, Felix [academic supervisor]; Deßloch, Stefan [academic supervisor]; Conrad, Stefan [academic supervisor]
  • Corporate body: Universität Potsdam
  • Published: Potsdam, [2022?]
  • Extent: 1 online resource (x, 150 pages, 5650 KB); illustrations, diagrams
  • Language: English
  • DOI: 10.25932/publishup-57214
  • Identifier:
  • Keywords: Thesis
  • Origin:
  • Thesis: Dissertation, Universität Potsdam, 2022
  • Notes:
  • Description: Duplicate detection describes the process of finding multiple representations of the same real-world entity in the absence of a unique identifier, and has many application areas, such as customer relationship management, genealogy and social sciences, or online shopping. The increasing amount of data in recent years has made the problem even more challenging, but it has also led to a renaissance in duplicate detection research. This thesis examines the effects and opportunities of transitive relationships on the duplicate detection process. Transitivity implies that if record pairs ⟨ri,rj⟩ and ⟨rj,rk⟩ are classified as duplicates, then record pair ⟨ri,rk⟩ must also be a duplicate. However, this reasoning might contradict the pairwise classification, which is usually based on the similarity of objects. An essential property of similarity, in contrast to equivalence, is that similarity is not necessarily transitive. First, we experimentally evaluate the effect of an increasing data volume on ...
  • Access status: Open access
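
The transitivity rule stated in the description can be illustrated with a minimal sketch. It is not taken from the thesis: the function names (`transitive_clusters`, `implied_only_pairs`) and the sample record identifiers are hypothetical, and union-find is simply one common way to compute a transitive closure. The sketch groups records by the transitive closure of the pairwise duplicate decisions, then lists the pairs that transitivity implies but that the pairwise classifier did not itself mark as duplicates.

```python
from itertools import combinations


def find(parent, x):
    # Follow parent links to the cluster representative, halving the path as we go.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def transitive_clusters(records, duplicate_pairs):
    # Hypothetical helper: merge records connected by pairwise duplicate decisions.
    parent = {r: r for r in records}
    for a, b in duplicate_pairs:
        root_a, root_b = find(parent, a), find(parent, b)
        if root_a != root_b:
            parent[root_b] = root_a
    clusters = {}
    for r in records:
        clusters.setdefault(find(parent, r), []).append(r)
    return list(clusters.values())


def implied_only_pairs(clusters, duplicate_pairs):
    # Pairs that share a cluster only because of transitivity,
    # i.e. the pairwise classification did not declare them duplicates.
    classified = {frozenset(p) for p in duplicate_pairs}
    return [
        (a, b)
        for cluster in clusters
        for a, b in combinations(cluster, 2)
        if frozenset((a, b)) not in classified
    ]


# Illustrative input (assumed): <r1,r2> and <r2,r3> were classified as duplicates.
records = ["r1", "r2", "r3", "r4"]
pairs = [("r1", "r2"), ("r2", "r3")]

clusters = transitive_clusters(records, pairs)
conflicts = implied_only_pairs(clusters, pairs)
print(clusters)
print(conflicts)
```

In this example the pair (r1, r3) ends up in the same cluster even though no pairwise decision declared it a duplicate, which is the kind of potential contradiction between transitivity and similarity-based classification that the description mentions.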