
Overview of the EvaLatin 2020 Evaluation Campaign

Sprugnoli, Rachele; Passarotti, Marco Carlo; Cecchini, Flavio Massimiliano; Pellegrini, Matteo

2020

Abstract

This paper describes the first edition of EvaLatin, a campaign entirely devoted to the evaluation of NLP tools for Latin. The two shared tasks proposed in EvaLatin 2020, i.e. Lemmatization and Part-of-Speech tagging, are aimed at fostering research in the field of language technologies for Classical languages. The shared dataset consists of texts taken from the Perseus Digital Library, processed with UDPipe models and then manually corrected by Latin experts. The training set includes only prose texts by Classical authors. The test set includes prose texts by the same authors represented in the training set, together with poetry and texts from the Medieval period. This allows us to propose Cross-genre and Cross-time subtasks for each task, in order to evaluate the portability of NLP tools for Latin across different genres and time periods. The results obtained by the participants for each task and subtask are presented and discussed.

Publication year: 2020
Content language: English
Title of the proceedings volume: Proceedings of LT4HALA 2020 Workshop - 1st Workshop on Language Technologies for Historical and Ancient Languages, satellite event to the Twelfth International Conference on Language Resources and Evaluation (LREC 2020)
Event name: 1st Workshop on Language Technologies for Historical and Ancient Languages (LT4HALA)
Event location: Marseille
Event start date: 12 May 2020
Event end date: 12 May 2020
ISBN of the volume: 979-10-95546-53-5
Publisher: European Language Resources Association (ELRA)
Alternative URL: https://lrec2020.lrec-conf.org/media/proceedings/Workshops/Books/LT4HALAbook.pdf
Citation: Sprugnoli, R., Passarotti, M. C., Cecchini, F. M., Pellegrini, M., Overview of the EvaLatin 2020 Evaluation Campaign, in Proceedings of LT4HALA 2020 Workshop - 1st Workshop on Language Technologies for Historical and Ancient Languages, satellite event to the Twelfth International Conference on Language Resources and Evaluation (LREC 2020), (Marseille, 12 May 2020), European Language Resources Association (ELRA), Paris 2020: 105-110 [http://hdl.handle.net/10807/151970]
Publication type: Proceedings of conferences, congresses, study days, etc., and workshops (in volume)


Use this identifier to cite or link to this document: https://hdl.handle.net/10807/151970
