Title:
Event-based stream classification framework – a supervised clustering approach for social media applications
Contributors:
Reuter, Timo
[Author]
Published:
Universitätsbibliothek, 2015
Language:
English
Origin:
Notes:
This data source also contains holdings records that do not lead to a full text.
Description:
Reuter T. Event-based stream classification framework – a supervised clustering approach for social media applications. Bielefeld: Universitätsbibliothek; 2015. ; Events play a very prominent role in our lives. Therefore, many social media documents describe or are related to some event. However, it is difficult for a human to gather relevant information from these documents when they lack any structure. Organizing social media documents with respect to events thus seems to be a promising approach to better manage the ever-increasing amount of content that is shared using social media applications. It is a challenge to automate this process so that incoming documents can be assigned to their corresponding event without any user intervention. In this dissertation we present an event-based stream classification framework that is able to classify a never-ending stream of social media data into a growing and evolving set of events. In doing so, we successfully assign a social media item newly uploaded to some social media site to its corresponding event (if it already exists) or create a new event to which future data items can be assigned. We refer to this problem as the event detection problem and propose to use machine learning techniques to tackle it. We successfully address several key challenges that arise in this context: i) handling the data in a stream-based setting, i.e., addressing the challenges arising from the need to process a never-ending stream of data, ii) scaling to the data sizes and rates usually encountered in social media applications, and iii) tackling the new event detection problem, i.e., the problem of determining whether an incoming data item belongs to a new or to an already known event. We address these challenges through a classification approach that allows us to process the data in a single pass.
Furthermore, we include a suitable candidate event retrieval step which retrieves a set of event candidates that the incoming data point is likely to belong ...
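The single-pass flow described in the abstract (retrieve candidate events, then either assign the incoming item to the best candidate or open a new event) can be illustrated with a minimal sketch. This is not the dissertation's implementation: the class and function names, the tag-set representation of items, the inverted index used for candidate retrieval, and the fixed Jaccard threshold are all assumptions made for illustration. In particular, the fixed threshold stands in for the learned classifier that the abstract refers to.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """A hypothetical event, represented only by the union of its items' tags."""
    event_id: int
    tags: set = field(default_factory=set)


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


class StreamEventAssigner:
    """Assigns each incoming item to an event in one pass over the stream.

    Assumption: a fixed similarity threshold replaces the trained
    classifier mentioned in the abstract.
    """

    def __init__(self, threshold=0.3, max_candidates=5):
        self.threshold = threshold
        self.max_candidates = max_candidates
        self.events = []   # position in the list equals event_id
        self.index = {}    # tag -> ids of events containing that tag

    def _candidates(self, tags):
        # Candidate retrieval: rank events by the number of shared tags.
        counts = {}
        for tag in tags:
            for eid in self.index.get(tag, ()):
                counts[eid] = counts.get(eid, 0) + 1
        ranked = sorted(counts, key=lambda eid: (-counts[eid], eid))
        return [self.events[eid] for eid in ranked[: self.max_candidates]]

    def assign(self, tags):
        """Return the id of the event the item is assigned to."""
        tags = set(tags)
        best, best_score = None, 0.0
        for event in self._candidates(tags):
            score = jaccard(tags, event.tags)
            if score > best_score:
                best, best_score = event, score
        # New event detection: no candidate is similar enough.
        if best is None or best_score < self.threshold:
            best = Event(event_id=len(self.events))
            self.events.append(best)
        # Update the chosen event and the index so later items can find it.
        best.tags |= tags
        for tag in tags:
            self.index.setdefault(tag, set()).add(best.event_id)
        return best.event_id
```

Each item is seen exactly once, and only the retrieved candidates are scored, so the per-item cost depends on the candidate limit rather than on the total number of events.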