Clustering methods have typically found their application when dealing with continuous data. However, in many modern applications data consist of multiple categorical variables with no natural ordering. In the heuristic framework the problem of clustering these data is tackled by introducing suitable distances. In this work, we develop a model-based approach for clustering categorical data with nominal scale. Our approach is based on a mixture of distributions defined via the Hamming distance between categorical vectors. Maximum likelihood inference is delivered through an expectation-maximization algorithm. A simulation study is carried out to illustrate the proposed approach.
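As a minimal illustration of the distance the abstract refers to, the sketch below counts the positions at which two categorical vectors of equal length differ. It is not the authors' mixture model or their expectation-maximization procedure; the function name `hamming_distance` and the example records are hypothetical and introduced here only for illustration.

```python
from typing import Hashable, Sequence


def hamming_distance(x: Sequence[Hashable], y: Sequence[Hashable]) -> int:
    """Count the positions at which two equal-length categorical vectors differ."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    # Each mismatching position contributes 1; category labels are never ordered.
    return sum(a != b for a, b in zip(x, y))


# Hypothetical records, each with three nominal variables.
record_a = ("red", "small", "round")
record_b = ("red", "large", "round")
record_c = ("blue", "large", "square")

print(hamming_distance(record_a, record_b))  # 1
print(hamming_distance(record_a, record_c))  # 3
```

Because the comparison only checks equality, the distance does not depend on how the categories are labelled or ordered, which is what makes it suitable for nominal data.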

Filippi-Mazzola, E., Argiento, R., Paci, L., Clustering categorical data via Hamming distance, Contributed paper, in Book of short papers SIS 2021, (Pisa, 21-25 June 2021), Pearson, N/A 2021: 752-757 [http://hdl.handle.net/10807/203462]

Clustering categorical data via Hamming distance

Argiento, Raffaele; Paci, Lucia
2021

English
Book of short papers SIS 2021
SIS 2021
Pisa
Contributed paper
21 June 2021
25 June 2021
9788891927361
Pearson

Use this identifier to cite or link to this document: https://hdl.handle.net/10807/203462