Delfino, E., Leotta, R. G., Mambrini, F., Passarotti, M. C., Moretti, G., CorefLat. Coreference Resolution for Latin as Linked Open Data, in SemDH 2025: Second International Workshop of Semantic Digital Humanities, co-located with ESWC 2025 (Portorož, Slovenia, 2 June 2025), CEUR Workshop Proceedings, CEUR-WS.org, 2025 [https://hdl.handle.net/10807/322483]
CorefLat. Coreference Resolution for Latin as Linked Open Data
Leotta, Roberta Grazia (co-first author)
Mambrini, Francesco (co-first author)
Passarotti, Marco Carlo (co-first author)
Moretti, Giovanni (co-first author)
2025
Abstract
This paper presents the publication as Linked Open Data of a set of coreference and anaphora annotations (called CorefLat) performed on a set of Latin texts. Annotations are made on texts already available as Linked Open Data as part of the LiLa Knowledge Base of interoperable linguistic resources for Latin. By adopting a lemma-centered architecture and established guidelines for annotation inspired by those of the GUM corpus, CorefLat systematically identifies and tags entities and mentions, creating relational links. The annotated corpus covers multiple periods and genres, including Augustine’s Confessiones, Plautus’ Curculio, Caesar’s De Bello Gallico, and Seneca’s Medea, ensuring a balanced dataset for broader linguistic analysis. The publication of CorefLat as Linked Open Data relies on an OWL ontology that extends the POWLA framework, thus enabling interoperability with diverse linguistic resources within LiLa. We detail how coreference relations, including phenomena such as anaphora, cataphora, split antecedents, and multiword units, are encoded through specialized classes and object properties.

| File | Size | Format |
|---|---|---|
| paper_15.pdf (open access; file type: publisher's version (PDF); license: Creative Commons) | 1.69 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.



