• Media type: Text; Electronic Conference Proceeding; E-Article
  • Title: LeMe-PT: A Medical Package Leaflet Corpus for Portuguese
  • Contributor: Simões, Alberto [Author]; Gamallo, Pablo [Author]
  • Imprint: Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2021
  • Language: English
  • DOI: https://doi.org/10.4230/OASIcs.SLATE.2021.10
  • Keywords: information extraction ; word embeddings ; drug corpora
  • Footnote: This data source also contains holdings records that do not lead to a full text.
  • Description: The current trend in natural language processing is the use of machine learning. This is being done in every field, from summarization to machine translation. For these techniques to be applied, resources are needed, namely quality corpora. While there are large quantities of corpora for the Portuguese language, there is a lack of technical and focused corpora. Therefore, in this article we present a new corpus, built from drug package leaflets. We describe its structure and contents, and discuss possible exploration directions.
  • Access State: Open Access