IRIS UniCatt

Penalized Model-Based Clustering with Group-Dependent Shrinkage Estimation

Casa, A.; Cappozzo, Andrea; Fop, M.

2023

Abstract

Gaussian mixture models (GMMs) are the most widely used approach to model-based clustering of continuous features. Unfortunately, with the increasing availability of high-dimensional datasets, their direct applicability is put at stake: GMMs suffer from the curse of dimensionality, as the number of parameters grows quadratically with the number of variables. To this end, a methodological link between Gaussian mixtures and Gaussian graphical models has recently been established, providing a framework for penalized model-based clustering in the presence of large precision matrices. Nevertheless, current methodologies do not account for the fact that groups may be under- or over-connected, and thus implicitly assume similar levels of sparsity across clusters. We overcome this limitation by defining data-driven, component-specific penalty factors that automatically account for different degrees of connection within groups. A real-data experiment on handwritten digit recognition showcases the validity of our proposal.
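
To make the quadratic growth mentioned in the abstract concrete, the following minimal sketch counts the free parameters of a GMM with unconstrained covariance matrices: K - 1 mixing proportions, K·p mean entries, and K·p(p + 1)/2 covariance entries. The function name `gmm_free_parameters` and the choice of three components are illustrative assumptions, not taken from the paper.

```python
def gmm_free_parameters(n_components: int, n_features: int) -> int:
    """Count free parameters of a GMM with full, unconstrained covariances.

    Hypothetical helper for illustration only; not part of the paper.
    """
    weights = n_components - 1                  # mixing proportions sum to one
    means = n_components * n_features           # one mean vector per component
    # Each symmetric p x p covariance matrix has p(p + 1)/2 distinct entries.
    covariances = n_components * n_features * (n_features + 1) // 2
    return weights + means + covariances


if __name__ == "__main__":
    # Three components is an arbitrary choice for the illustration.
    for p in (10, 100, 1000):
        print(p, gmm_free_parameters(n_components=3, n_features=p))
```

The covariance term dominates as p grows, which is why penalized estimation of sparse precision matrices becomes attractive in high dimensions.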

Publication year: 2023
Content language: English
Title of the proceedings volume: Building Bridges between Soft and Statistical Methodologies for Data Science
Event name: International Conference on Soft Methods in Probability and Statistics (SMPS)
Event location: Valladolid
Contribution type: Communication
Event start date: 14 September 2022
Event end date: 16 September 2022
ISBN of the publication: 978-3-031-15508-6
Publisher: Springer
DOI of the contribution: https://dx.doi.org/10.1007/978-3-031-15509-3_10
Citation: Casa, A., Cappozzo, A., Fop, M., Penalized Model-Based Clustering with Group-Dependent Shrinkage Estimation, Communication, in Building Bridges between Soft and Statistical Methodologies for Data Science, (Valladolid, 14-16 September 2022), Springer, Valladolid 2023:1433 73-78. 10.1007/978-3-031-15509-3_10 [https://hdl.handle.net/10807/309186]
Appears in types: Paper, Selected paper, Contributed paper, Working paper, Poster, Poster paper, Communication, Report (in volume)

Files in this record:

There are no files associated with this record.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10807/309186
