Media type:
Report;
E-book
Title:
Scribosermo: fast speech-to-text models for German and other languages
Contributors:
Bermuth, Daniel
[Author];
Poeppel, Alexander
[Author];
Reif, Wolfgang
[Author]
Published:
Augsburg University Publication Server (OPUS), 2022-11-10
Language:
English
DOI:
https://doi.org/10.48550/arXiv.2110.07982
Notes:
This data source also contains holdings records that do not lead to a full text.
Description:
Recent speech-to-text models often require a large amount of hardware resources and are mostly trained on English. This paper presents speech-to-text models for German, as well as for Spanish and French, with special features: (a) They are small and run in real time on microcontrollers like a Raspberry Pi. (b) Using a pretrained English model, they can be trained on consumer-grade hardware with a relatively small dataset. (c) The models are competitive with other solutions and outperform them in German. In this respect, the models combine the advantages of other approaches, each of which offers only a subset of the presented features. Furthermore, the paper provides a new library for handling datasets, which focuses on easy extension with additional datasets, and shows an optimized approach to transfer learning for new languages using a pretrained model from another language with a similar alphabet.