Mambrini, F., Passarotti, M., Harmonizing Different Lemmatization Strategies for Building a Knowledge Base of Linguistic Resources for Latin, in Proceedings of the 13th Linguistic Annotation Workshop (LAW XIII). August 1, 2019. Florence, Italy, (Firenze, 01-01 August 2019), Association for Computational Linguistics, Firenze 2019: 71-80 [http://hdl.handle.net/10807/140575]

Harmonizing Different Lemmatization Strategies for Building a Knowledge Base of Linguistic Resources for Latin

Mambrini, Francesco; Passarotti, Marco
2019

Abstract

The interoperability between lemmatized corpora of Latin and other resources that use the lemma as an indexing key is hampered by the multiple lemmatization strategies that different projects adopt. In this paper we discuss how we tackle the challenges raised by harmonizing different lemmatization criteria in a project that aims to connect linguistic resources for Latin using the Linked Data paradigm. The paper introduces the architecture supporting an open-ended, lemma-based Knowledge Base, built to make textual and lexical resources for Latin interoperable. In particular, the paper describes the inclusion into the Knowledge Base of its lexical basis, of a word formation lexicon, and of a lemmatized and syntactically annotated corpus.
2019
English
Proceedings of the 13th Linguistic Annotation Workshop (LAW XIII). August 1, 2019. Florence, Italy
13th Linguistic Annotation Workshop (LAW XIII)
Firenze
1 August 2019
978-1-950737-38-3
Association for Computational Linguistics

Use this identifier to cite or link to this document: https://hdl.handle.net/10807/140575