Sprugnoli, R., Caselli, T., Tonelli, S., Moretti, G., The Content Types Dataset: a New Resource to Explore Semantic and Functional Characteristics of Texts, in Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers (Valencia, Spain, 03-07 April 2017), Association for Computational Linguistics, Valencia (Spain) 2017, 2: 260-266. [10.18653/v1/e17-2042] [http://hdl.handle.net/10807/132848]
The Content Types Dataset: a New Resource to Explore Semantic and Functional Characteristics of Texts
Sprugnoli, Rachele (first author)
2017
Abstract
This paper presents a new resource, called Content Types Dataset, to promote the analysis of texts as a composition of units with specific semantic and functional roles. By developing this dataset, we also introduce a new NLP task for the automatic classification of Content Types. The annotation scheme and the dataset are described together with two sets of classification experiments.

File | Size | Format |
---|---|---|
E17-2042.pdf (open access; file type: publisher's version (PDF); license: Creative Commons) | 115.74 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.