Exploration and adaptation of large language models for specialized domains

Media type: Other publication; Electronic thesis; Dissertation; E-book

Title: Exploration and adaptation of large language models for specialized domains

Contributors: van Aken, Betty [Author]

Published: Hannover : Institutionelles Repositorium der Leibniz Universität Hannover, 2023

Edition: Published version

Language: English

DOI: https://doi.org/10.15488/15781; https://doi.org/10.18653/v1/2021.eacl-main.75; https://doi.org/10.18653/v1/W18-5105; https://doi.org/10.1145/3357384.3358028; https://doi.org/10.1145/3366424.3383542; https://doi.org/10.18653/v1/2021.nlpmc-1.5; https://doi.org/10.18653/v1/2022.clinicalnlp-1.7

Keywords: explainability ; text classification ; natural language processing ; language models ; domain adaptation ; large language models

Origin:

Notes: This data source also contains holdings records that do not lead to a full text.

Description: Large language models have transformed the field of natural language processing (NLP). Their improved performance on various NLP benchmarks makes them a promising tool, including for applications in specialized domains. Such domains are characterized by highly trained professionals with particular domain expertise. Since these experts are rare, improving the efficiency of their work with automated systems is especially desirable. However, domain-specific text resources pose various challenges for NLP systems. These challenges include distinct language, noisy and scarce data, and a high level of variation. Further, specialized domains present an increased need for transparent systems since these systems are often applied in high-stakes settings. In this dissertation, we examine whether large language models (LLMs) can overcome some of these challenges and propose methods to effectively adapt them to domain-specific requirements. We first investigate the inner workings and abilities of LLMs and show how they can fill the gaps left by previous NLP algorithms for specialized domains. To this end, we explore the sources of errors produced by earlier systems to identify which of them can be addressed by using LLMs. Following this, we take a closer look at how information is processed within Transformer-based LLMs to better understand their capabilities. We find that their layers encode different dimensions of the input text. Here, the contextual vector representations and the general language knowledge learned during pre-training are especially beneficial for solving the complex, multi-step tasks common in specialized domains. Following this exploration, we propose solutions for further adapting LLMs to the requirements of domain-specific tasks. We focus on the clinical domain, which incorporates many typical challenges found in specialized domains. We show how to improve generalization by integrating different domain-specific resources into our models.
We further analyze the behavior of the produced models and propose ...

Access status: Open access

Rights and usage notes: Attribution (CC BY)
