Two families of parsimonious mixture models are introduced for model-based clustering. They are based on two multivariate distributions, the shifted exponential normal and the tail-inflated normal, recently introduced in the literature as heavy-tailed generalizations of the multivariate normal. Parsimony is attained by the eigen-decomposition of the component scale matrices, as well as by the imposition of a constraint on the tailedness parameters. Identifiability conditions are also provided. Two variants of the expectation-maximization algorithm are presented for maximum likelihood parameter estimation. Parameter recovery and clustering performance are investigated via a simulation study. Comparisons with the unconstrained mixture models are obtained as a by-product. A further simulated analysis is conducted to assess how sensitive our models and some well-established parsimonious competitors are to their own generative scheme. Lastly, our models and the competing ones are evaluated in terms of fitting and clustering on three real datasets.
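
The abstract does not spell out the eigen-decomposition it refers to. As a rough illustration only, the sketch below assumes the volume-shape-orientation form commonly used for parsimonious mixtures, Sigma = lambda * Gamma * Delta * Gamma^T, where lambda is a scalar volume, Gamma an orthogonal matrix of eigenvectors, and Delta a diagonal matrix with unit determinant. The function name `decompose_scale` and the example matrix are hypothetical and are not taken from the paper.

```python
import numpy as np

def decompose_scale(sigma):
    """Split a symmetric positive-definite matrix into volume, orientation, shape.

    Assumed form (not confirmed by the abstract): sigma = lam * gamma @ delta @ gamma.T
    """
    p = sigma.shape[0]
    # eigh returns ascending eigenvalues and orthonormal eigenvectors
    eigvals, gamma = np.linalg.eigh(sigma)
    # reorder so the largest eigenvalue comes first
    order = np.argsort(eigvals)[::-1]
    eigvals, gamma = eigvals[order], gamma[:, order]
    lam = np.prod(eigvals) ** (1.0 / p)   # volume: det(sigma)^(1/p)
    delta = np.diag(eigvals / lam)        # shape: diagonal with determinant 1
    return lam, gamma, delta

# Illustrative 2x2 scale matrix (made up for this example)
sigma = np.array([[4.0, 1.2],
                  [1.2, 1.0]])
lam, gamma, delta = decompose_scale(sigma)
rebuilt = lam * gamma @ delta @ gamma.T   # should reproduce sigma
```

Under this assumed form, a parsimonious family is obtained by forcing some of the three pieces (volume, shape, orientation) to be equal across mixture components, which reduces the number of free scale parameters.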

Tomarchio, S. D., Bagnato, L., Punzo, A., Model-based clustering via new parsimonious mixtures of heavy-tailed distributions, <<ASTA ADVANCES IN STATISTICAL ANALYSIS>>, 2022; 2022 (1): 1-33. [doi:10.1007/s10182-021-00430-8] [http://hdl.handle.net/10807/194601]

Model-based clustering via new parsimonious mixtures of heavy-tailed distributions

Bagnato, Luca
2022

English

Use this identifier to cite or link to this document: https://hdl.handle.net/10807/194601