IRIS PubliCatt

Riso, L., Zoia, M., Nava, C. R., Feature selection based on the best-path algorithm in high dimensional graphical models, «INFORMATION SCIENCES», 2023; (649): 1-24 [https://hdl.handle.net/10807/252468]

Feature selection based on the best-path algorithm in high dimensional graphical models

Riso, Luigi (first author); Zoia, Maria (second author); Nava, Consuelo Rubina (last author)

2023

Abstract

This paper proposes a new algorithm for automatic feature selection in High Dimensional Graphical Models. The algorithm, called the Best-Path Algorithm (BPA), is a filter method that performs feature selection based on mutual information. In recent years, filter methods have been successfully employed to reduce the size of the input dataset while retaining the feature information relevant to modelling and classification problems. However, the extant filter algorithms are mostly heuristic or require high computational effort. The BPA overcomes these drawbacks by taking advantage of the links between variables brought to the fore by Edwards's algorithm. Once the High Dimensional Graphical Model, depicting the probabilistic structure of the variables, is determined, the BPA selects the best subset of features by analyzing its path-steps. The path-step that includes the variables with the most predictive power for the target variable is then determined by computing the entropy correlation coefficient. This index, being based on the notion of (symmetric) Kullback-Leibler divergence, is closely connected to the mutual information that the path-step variables share with the variable of interest. Applying the BPA to simulated and real-world benchmark datasets highlights its potential and its greater effectiveness compared to alternative extant methods.
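To make the idea of scoring features by the information they share with a target more concrete, the following is a minimal sketch and not the BPA itself. It ignores the graphical model and its path-steps entirely, and it uses a standard normalized mutual information, 2 I(X;Y) / (H(X) + H(Y)), as a stand-in for the entropy correlation coefficient, whose exact definition in the paper may differ. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Empirical Shannon entropy (in nats) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log(c / n) for c in Counter(values).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), estimated from empirical frequencies."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def normalized_mi(x, y):
    """Assumed stand-in score: 2 I(X;Y) / (H(X) + H(Y)), which lies in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both variables are constant, so nothing is shared
    return 2.0 * mutual_information(x, y) / (hx + hy)


def rank_features(features, target):
    """Sort feature names by their score against the target, highest first."""
    scores = {name: normalized_mi(col, target) for name, col in features.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data (hypothetical): one feature copies the target, one is unrelated.
target = [0, 0, 1, 1, 0, 0, 1, 1]
features = {
    "copy": [0, 0, 1, 1, 0, 0, 1, 1],
    "noise": [0, 1, 0, 1, 0, 1, 0, 1],
}
ranking = rank_features(features, target)
print(ranking)
```

In this toy sample the feature that duplicates the target scores highest and the unrelated one scores close to zero. A filter method in the sense of the abstract would apply a score of this kind to candidate variable subsets rather than to single columns.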

Publication year: 2023
Content language: English
Journal name: INFORMATION SCIENCES
Alternative URL: https://www.sciencedirect.com/science/article/abs/pii/S0020025523011866
Appears in types: Journal article, Case note

Use this identifier to cite or link to this document: https://hdl.handle.net/10807/252468
