Partanen, Niko
[Author];
Rießler, Michael
[Author];
Pirinen, Tommi A.
[Author];
Kaalep, Heiki-Jaan
[Author];
Tyers, Francis M.
[Author];
Association for Computational Linguistics
[Author]
Title:
An OCR system for the Unified Northern Alphabet
Published:
Association for Computational Linguistics, 2019
Language:
English
ISBN:
978-1-948087-92-6
Notes:
This data source also contains holdings records that do not lead to a full text.
Description:
Partanen N, Rießler M. An OCR system for the Unified Northern Alphabet. In: Pirinen TA, Kaalep H-J, Tyers FM, Association for Computational Linguistics, eds. The fifth International Workshop on Computational Linguistics for Uralic Languages. Tartu: Association for Computational Linguistics; 2019: 77-89.
This paper presents experiments conducted to build a functional OCR model for the Unified Northern Alphabet. This writing system was used between 1931 and 1937 for 16 (Uralic and non-Uralic) minority languages spoken in the Soviet Union. The character accuracy of the developed model exceeds 98%, and the model clearly shows cross-linguistic applicability. The tests described here therefore also include general guidelines for the amount of training data needed to bootstrap an OCR system under similar conditions.