IRIS UniCatt

Overview of the EvaLatin 2022 Evaluation Campaign

Sprugnoli, Rachele; Passarotti, Marco Carlo; Cecchini, Flavio Massimiliano; Fantoli, Margherita; Moretti, Giovanni

2022

Abstract

This paper describes the organization and the results of the second edition of EvaLatin, the campaign for the evaluation of Natural Language Processing tools for Latin. The three shared tasks proposed in EvaLatin 2022, i.e., Lemmatization, Part-of-Speech Tagging and Features Identification, aim to foster research in the field of language technologies for Classical languages. The shared dataset consists of texts mainly taken from the LASLA corpus. More specifically, the training set includes only prose texts of the Classical period, whereas the test set is organized in three sub-tasks: a Classical sub-task on a prose text by an author not included in the training data, a Cross-genre sub-task on poetic and scientific texts, and a Cross-time sub-task on a text of the 15th century. The results obtained by the participants for each task and sub-task are presented and discussed.

Publication year: 2022

Content language: English

Title of the proceedings volume: Proceedings of the Second Workshop on Language Technologies for Historical and Ancient Languages

Event name: Second Workshop on Language Technologies for Historical and Ancient Languages

Event location: Marseille

Event start date: 25 June 2022

Event end date: 25 June 2022

ISBN of the volume: 979-10-95546-78-8

Publisher: European Language Resources Association (ELRA)

DOI of the contribution: https://dx.doi.org/10.5281/zenodo.6655906

Alternative URL: http://www.lrec-conf.org/proceedings/lrec2022/workshops/LT4HALA/pdf/2022.lt4hala2022-1.29.pdf

Citation: Sprugnoli, R., Passarotti, M. C., Cecchini, F. M., Fantoli, M., Moretti, G., Overview of the EvaLatin 2022 Evaluation Campaign, in Proceedings of the Second Workshop on Language Technologies for Historical and Ancient Languages, (Marseille, 25 June 2022), European Language Resources Association (ELRA), Marseille 2022: 183-188. [10.5281/zenodo.6655906] [http://hdl.handle.net/10807/210605]

Publication type: Proceedings of a conference, congress, study days, etc., Workshop (in volume)

Use this identifier to cite or link to this document: https://hdl.handle.net/10807/210605
