IRIS UniCatt

Concept drift estimation with graphical models

Riso, Luigi (first author); Guerzoni, Marco (second author)

2022

Abstract

This paper addresses concept drift in machine learning in the context of high-dimensional problems. In contrast to previous concept drift detection methods, the proposed method does not depend on the machine learning model used for a specific target variable; rather, it assesses concept drift as an independent characteristic of the evolution of a dataset. This allows data to be tested for the presence of drift independently of the specific problem at hand, which is extremely useful when the same dataset is used for several classification tasks simultaneously, as is often the case in a business environment. Moreover, unlike previous approaches, this method does not require re-testing each new model, a strategy that could prove computationally expensive. The fundamental aim of this work is to use graphical models to elicit the visible structure of the data and represent it as a network. Specifically, we investigate how a graphical model evolves by looking at the creation of new links, and the disappearance of existing ones, across time periods. We perform this task in four steps. First, we compute the adjacency matrix of a graph in each period. Second, we apply a function that maps each possible state of the adjacency matrix over time into a transition matrix. Third, we use the information in the transition matrix to produce a metric that estimates the presence of drift in the data. Finally, we evaluate the method on three real-world datasets and a synthetic one. (c) 2022 Elsevier Inc. All rights reserved.
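The first three steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the graph is estimated or how the metric is defined, so the sketch assumes a simple correlation-threshold graph and uses the share of changed links as a drift score. The function names `adjacency_from_correlation`, `edge_transition_counts`, and `drift_score`, as well as the `threshold` parameter, are hypothetical.

```python
import numpy as np


def adjacency_from_correlation(X, threshold=0.5):
    """Assumed stand-in for graph estimation: link two variables
    when the absolute value of their correlation exceeds `threshold`."""
    corr = np.corrcoef(X, rowvar=False)      # variables are columns
    adj = (np.abs(corr) > threshold).astype(int)
    np.fill_diagonal(adj, 0)                 # no self-links
    return adj


def edge_transition_counts(adj_prev, adj_next):
    """Count link-state transitions between two periods.

    Returns a 2x2 matrix whose entry [i, j] is the number of variable
    pairs in state i (0 = no link, 1 = link) in the earlier period and
    state j in the later period.
    """
    upper = np.triu_indices_from(adj_prev, k=1)   # each pair once
    prev_state, next_state = adj_prev[upper], adj_next[upper]
    counts = np.zeros((2, 2), dtype=int)
    np.add.at(counts, (prev_state, next_state), 1)
    return counts


def drift_score(counts):
    """Assumed metric: share of pairs whose link appeared or vanished."""
    total = counts.sum()
    if total == 0:
        return 0.0
    return (counts[0, 1] + counts[1, 0]) / total


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two synthetic periods over the same five variables.
    period_1 = rng.normal(size=(200, 5))
    period_2 = rng.normal(size=(200, 5))
    period_2[:, 1] += 2.0 * period_2[:, 0]   # inject a new dependency

    a1 = adjacency_from_correlation(period_1)
    a2 = adjacency_from_correlation(period_2)
    counts = edge_transition_counts(a1, a2)
    print(counts)
    print(drift_score(counts))
```

Because the score is computed from the adjacency matrices alone, it does not involve any target variable or fitted predictive model, which mirrors the model-independence the abstract emphasizes.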

Publication year: 2022
Content language: English
Journal name: INFORMATION SCIENCES
DOI: https://dx.doi.org/10.1016/j.ins.2022.05.056
Citation: Riso, L., Guerzoni, M., Concept drift estimation with graphical models, <<INFORMATION SCIENCES>>, 2022; 606 (606): 786-804. [doi:10.1016/j.ins.2022.05.056] [https://hdl.handle.net/10807/228811]
Appears in types: Journal article, case note

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10807/228811
