This paper explores the use of LLMs, specifically Mistral-Nemo, for the semi-automatic population of Ancient Greek WordNet synsets. Three approaches are investigated: zero-shot prompting, few-shot prompting, and fine-tuning. The results are compared against an English baseline. The zero-shot approach yields the highest accuracy, while fine-tuning produces the highest number of potential synonyms. Our analysis also reveals that polysemy and part of speech (PoS) play a role in the model's performance: the highest scores are registered for polysemous words and for verbs and nouns. The results are encouraging for the application of such approaches in a human-in-the-loop scenario, since human validation still proves crucial in ensuring the accuracy of the results.

Marchesi, B., Clementelli, A., Maurizio Mammarella, A., Zampetta, S., Biagetti, E., Brigada Villa, L., Mastellari, V., Ginevra, R., Roberta Combei, C., Zanchi, C., Towards the Semi-Automated Population of the Ancient Greek WordNet, in Proceedings of the Eleventh Italian Conference on Computational Linguistics (CLiC-it 2025), (Cagliari (Italia), 24-26 September 2025), CEUR Workshop Proceedings (CEUR-WS.org), Cagliari (Italia) 2025: 647-658 [https://hdl.handle.net/10807/328641]

Towards the Semi-Automated Population of the Ancient Greek WordNet

Ginevra, Riccardo
2025

English
Proceedings of the Eleventh Italian Conference on Computational Linguistics (CLiC-it 2025)
Eleventh Italian Conference on Computational Linguistics (CLiC-it 2025)
Cagliari (Italy)
24 September 2025
26 September 2025
979-12-243-0587-3
CEUR Workshop Proceedings (CEUR-WS.org)
Files in this item:
2025. Atti Clic-it 11.pdf (open access)
File type: Publisher's version (PDF)
License: Creative Commons
Size: 3.14 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10807/328641