The present work introduces a new Latin treebank that follows the Universal Dependencies (UD) annotation standard. The treebank is obtained from the automated conversion of the Late Latin Charter Treebank 2 (LLCT2), originally in the Prague Dependency Treebank (PDT) style. As this treebank consists of Early Medieval legal documents, its language variety differs considerably from both the Classical and Medieval learned varieties prevalent in the other currently available UD Latin treebanks. Consequently, besides significant phenomena from the perspective of diachronic linguistics, this treebank also poses several challenging technical issues for the current and future syntactic annotation of Latin in the UD framework. Some of the most relevant cases are discussed in depth, with comparisons between the original PDT and the resulting UD annotations. Additionally, an overview of the UD-style structure of the treebank is given, and some diachronic aspects of the transition from Latin to Romance languages are highlighted.

Cecchini, F. M., Korkiakangas, T., Passarotti, M., A New Latin Treebank for Universal Dependencies: Charters between Ancient Latin and Romance Languages, in Proceedings of the Twelfth International Conference on Language Resources and Evaluation (LREC 2020), (Marseille, 11-16 May 2020), European Language Resources Association (ELRA), Paris 2020: 933-942. [10.5281/zenodo.3862154] [http://hdl.handle.net/10807/154885]

A New Latin Treebank for Universal Dependencies: Charters between Ancient Latin and Romance Languages

Cecchini, Flavio Massimiliano; Passarotti, Marco
2020

English
Proceedings of the Twelfth International Conference on Language Resources and Evaluation (LREC 2020)
Twelfth International Conference on Language Resources and Evaluation (LREC 2020)
Marseille
11 May 2020
16 May 2020
979-10-95546-34-4
European Language Resources Association (ELRA)

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10807/154885