This paper presents L-KD, a tool that relies on available linguistic and knowledge resources to perform keyphrase clustering and labelling. The aim of L-KD is to help find and trace themes in English and Italian text data, represented by groups of keyphrases and their associated domains. We evaluate the top-ranked domains on the 20 Newsgroups dataset and show that 8 out of 10 domains match the manually assigned labels. This confirms the good accuracy of the approach, which does not require supervision.

Moretti, G., Sprugnoli, R., Tonelli, S., KD Strikes Back: from Keyphrases to Labelled Domains Using External Knowledge Sources, in Proceedings of Third Italian Conference on Computational Linguistics (CLiC-it 2016) & Fifth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2016), (Napoli, Italia, 05-07 December 2016), aAccademia University Press, Torino 2016:1749 216-221. [10.4000/books.aaccademia.1814] [http://hdl.handle.net/10807/132966]

KD Strikes Back: from Keyphrases to Labelled Domains Using External Knowledge Sources

Sprugnoli, Rachele
2016

English
Proceedings of Third Italian Conference on Computational Linguistics (CLiC-it 2016) & Fifth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2016)
Third Italian Conference on Computational Linguistics (CLiC-it 2016)
Napoli, Italia
5 December 2016
7 December 2016
9788899982089
aAccademia University Press


Use this identifier to cite or link to this document: https://hdl.handle.net/10807/132966