Retrieval, Crawling and Fusion of Entity-centric Data on the Web

Media type: E-Article; Text

Title: Retrieval, Crawling and Fusion of Entity-centric Data on the Web

Contributor: Dietze, Stefan [Author]; Calì, Andrea [Author]; Gorgan, Dorian [Author]; Ugarte, Martín [Author]

Published: Heidelberg : Springer Verlag, 2017

Issue: accepted Version

Language: English

DOI: https://doi.org/10.15488/1258; https://doi.org/10.1007/978-3-319-53640-8_1

ISSN: 0302-9743

Keywords: Conference proceedings ; Semantics ; Web crawler ; Semantic Web ; Schema.org ; Dataset recommendation ; Knowledge based systems ; Knowledge graphs ; Data fusion ; Entity retrieval ; Arches ; Markup

Footnote: This data source also contains holdings records that do not lead to a full text.

Description: While the Web of (entity-centric) data has seen tremendous growth over the past years, take-up and re-use are still limited. Data vary heavily with respect to their scale, quality, coverage or dynamics, which poses challenges for tasks such as entity retrieval or search. This chapter provides an overview of approaches to deal with the increasing heterogeneity of Web data. On the one hand, recommendation, linking, profiling and retrieval can provide efficient means to enable discovery and search of entity-centric data, specifically when dealing with traditional knowledge graphs and linked data. On the other hand, embedded markup such as Microdata and RDFa has emerged as a novel, Web-scale source of entity-centric knowledge. As markup has seen increasing adoption over the last few years, driven by initiatives such as schema.org, it constitutes an increasingly important source of entity-centric data on the Web, being in the same order of magnitude as the Web itself with regard to dynamics and scale. To this end, markup data lends itself as a data source for aiding tasks such as knowledge base augmentation, where data fusion techniques are required to address the inherent characteristics of markup data, such as its redundancy, heterogeneity and lack of links. Future directions are concerned with the exploitation of the complementary nature of markup data and traditional knowledge graphs. The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-53640-8_1.

Access State: Open Access
