In this paper we present an unsupervised, graph-based approach to Word Sense Discrimination. Given a set of sentences, a word co-occurrence graph is derived and a distance based on the Jaccard index is defined on it; this distance is then used to cluster the neighbour nodes of ambiguous terms, using the concept of “gangplanks”: edges that separate denser regions (“islands”) in the graph. The proposed approach has been evaluated on a real data set, showing promising performance in Word Sense Discrimination.

The aim of this paper is to describe an unsupervised, graph-based clustering approach for identifying and discriminating the different senses that a term can take within a text. Starting from a co-occurrence graph, we define a distance between its nodes and apply an algorithm based on “gangplanks”, i.e. edges that separate dense regions (“islands”) within the graph. We discuss the results obtained on a data set of tweets.
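
The first two steps named in the abstract (deriving a word co-occurrence graph from sentences and defining a Jaccard-based distance on it) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact definition of the distance, so the sketch assumes an unweighted graph, sentence-level co-occurrence, whitespace tokenisation, and a Jaccard distance computed on closed neighbourhoods. The function names `cooccurrence_graph` and `jaccard_distance` and the toy sentences are hypothetical. The gangplank clustering step is not sketched, because its criterion is not specified here.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(sentences):
    """Map each word to the set of words it shares at least one sentence with.

    Assumption: whitespace tokenisation, lowercasing, no stop-word removal.
    """
    neighbours = defaultdict(set)
    for sentence in sentences:
        words = set(sentence.lower().split())
        for u, v in combinations(words, 2):
            neighbours[u].add(v)
            neighbours[v].add(u)
    return neighbours


def jaccard_distance(neighbours, u, v):
    """Return 1 - |N[u] & N[v]| / |N[u] | N[v]|.

    Assumption: N[x] is the closed neighbourhood (x plus its neighbours),
    so two adjacent nodes always share at least themselves.
    """
    nu = neighbours[u] | {u}
    nv = neighbours[v] | {v}
    return 1.0 - len(nu & nv) / len(nu | nv)


# Toy input: "bank" occurs in a financial and in a river context.
sentences = [
    "the bank approved the loan",
    "the bank raised the interest rate",
    "we walked along the river bank",
    "the river flooded the valley",
]
graph = cooccurrence_graph(sentences)

# Neighbours of "bank" from the same context are close ...
d_same = jaccard_distance(graph, "loan", "approved")
# ... while neighbours from different contexts are far apart.
d_diff = jaccard_distance(graph, "loan", "valley")
print(d_same, d_diff)
```

In this toy example "loan" and "approved" have identical closed neighbourhoods, while "loan" and "valley" share only the function word "the". A realistic pipeline would presumably filter such high-frequency words before building the graph; that choice is also an assumption of this sketch.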

Cecchini, F. M., Fersini, E., Word Sense Discrimination: A Gangplank Algorithm, Communication, in Proceedings of the second Italian conference on Computational Linguistics CLiC-it 2015, (Fondazione Bruno Kessler, Trento, 03-04 December 2015), aAccademia University Press, Torino 2015: 77-81 [http://hdl.handle.net/10807/122108]

Word Sense Discrimination: A Gangplank Algorithm

Cecchini, Flavio Massimiliano; Fersini, E.
2015

2015
English
Proceedings of the second Italian conference on Computational Linguistics CLiC-it 2015
CLiC-it 2015
Fondazione Bruno Kessler, Trento
Communication
3 Dec 2015
4 Dec 2015
978-88-99200-62-6
aAccademia University Press

Use this identifier to cite or link to this document: https://hdl.handle.net/10807/122108