Content Enrichment of Digital Libraries: Methods, Technologies and Implementations

Media type: Text; Doctoral Thesis; Electronic Thesis; E-Book

Title: Content Enrichment of Digital Libraries: Methods, Technologies and Implementations

Contributor: Hajra, Arben [Author]

Published: MACAU: Open Access Repository of Kiel University, 2020

Language: English

Keywords: VIAF ; SKOS ; thesaurus ; semantic web ; linked data ; word embedding ; deep learning ; persistent identifiers ; PID ; author disambiguation ; wikidata ; digital libraries ; information retrieval ; thesis ; recommender systems ; authority files ; machine learning ; LOD

Footnote: This data source also contains holdings records that do not lead to a full text.

Description: Parallel to the establishment of the concept of a "digital library", there have been rapid developments in the fields of semantic technologies, information retrieval and artificial intelligence. The idea is to use these three fields to crosslink bibliographic data, i.e., library content, and to enrich it "intelligently" with additional, especially non-library, information. Linking the contents of a library makes it possible to offer users access to semantically similar contents of different digital libraries. For instance, a list of semantically similar publications from completely different subject areas and from different digital libraries can be made accessible. In addition, the user can see a broader profile of authors, enriched with information such as biographical details, name alternatives, images, job titles, institute affiliations, etc. This information comes from a wide variety of sources, most of which are not library sources. To make such scenarios a reality, this dissertation follows two approaches. The first approach crosslinks digital library content in order to offer semantically similar publications based on additional information for a publication. Hence, this approach uses publication-related metadata as a basis. Aligned terms between linked open data repositories and thesauri serve as an important starting point, taking into account narrower, broader and related concepts through semantic data models such as SKOS. Information retrieval methods are applied to identify publications with high semantic similarity. For this purpose, vector space model and "word embedding" approaches are applied and compared. The analyses are performed in digital libraries with different thematic focuses (e.g. economy and agriculture). Using machine learning techniques, metadata is enriched, e.g. with synonyms for content keywords, in order to further improve similarity calculations.
To ensure quality, the proposed approaches will be analyzed ...
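
Illustration: the first approach described above (expanding keyword metadata through SKOS-style broader, narrower and related concepts, then comparing publications in a vector space) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the dissertation's implementation: the `SKOS_RELATIONS` table, the `expand` and `cosine` helpers, and the 0.5 weight for linked concepts are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

# Hypothetical SKOS-style relations (illustrative only, not taken from a real thesaurus).
SKOS_RELATIONS = {
    "inflation": {"broader": ["monetary economics"], "narrower": [], "related": ["price level"]},
    "crop yield": {"broader": ["agricultural production"], "narrower": [], "related": ["harvest"]},
}

def expand(keywords, weight=0.5):
    """Build a weighted term vector: each keyword counts 1.0,
    each broader/narrower/related concept adds `weight` (an assumed value)."""
    vec = Counter()
    for kw in keywords:
        vec[kw] += 1.0
        relations = SKOS_RELATIONS.get(kw, {})
        for rel in ("broader", "narrower", "related"):
            for concept in relations.get(rel, []):
                vec[concept] += weight
    return vec

def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Two publications with no literal keyword overlap.
pub_a = ["inflation"]
pub_b = ["price level"]

plain_score = cosine(Counter(pub_a), Counter(pub_b))
expanded_score = cosine(expand(pub_a), expand(pub_b))
print(plain_score, expanded_score)
```

In this toy setup the two publications share no keyword, so the plain vector comparison yields zero, while the expanded vectors overlap on the related concept and yield a positive score. A word-embedding variant would replace the sparse term vectors with dense ones but keep the same cosine comparison.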

Access State: Open Access

Rights information: In Copyright
