Nai Ruscone, M., Boari, G., Use of Relevant Principal Components to Define a Simplified Multivariate Test Procedure of Optimal Clustering, in Cladag 2013. 9th Meeting of the Classification and Data Analysis Group. Book of Abstracts, (Modena, 2013-09-18), Cleup, Modena 2013: 1-4 [http://hdl.handle.net/10807/53363]
Use of Relevant Principal Components to Define a Simplified Multivariate Test Procedure of Optimal Clustering
Nai Ruscone, Marta; Boari, Giuseppe
2013
Abstract
Clustering is the problem of partitioning data into a finite number, k, of homogeneous and separate groups, called clusters. A good choice of k is essential for obtaining meaningful clusters. The intraclass correlation coefficient r is frequently used to measure the degree of intragroup resemblance (for example, of characteristics such as blood pressure, weight and height). The theory concerning r is well established for single-variable analysis (Scheffé, 1959; Rao, 1973). In this paper, the choice of k is addressed by means of a multiple test procedure that defines the optimal cluster solution under the assumption that the variables involved are normally distributed. Relevant principal components are used to define a simplified multivariate procedure for testing null intraclass correlation, and the proposed new statistical stopping rule is evaluated.
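As an illustration of the single-variable building block mentioned in the abstract, the following minimal sketch computes the classical one-way ANOVA estimate of the intraclass correlation for one variable under a given cluster assignment, together with the F ratio that is commonly used to test the hypothesis of null intraclass correlation under normality. It is not the multivariate procedure proposed by the authors. The function name `anova_icc` and the toy data are assumptions introduced only for this example.

```python
from collections import defaultdict


def anova_icc(values, labels):
    """One-way ANOVA estimate of the intraclass correlation (hypothetical helper).

    values: numeric observations of a single variable
    labels: cluster label of each observation
    Returns (icc, f_stat), where f_stat = MS_between / MS_within.
    """
    groups = defaultdict(list)
    for v, g in zip(values, labels):
        groups[g].append(v)

    k = len(groups)        # number of clusters
    n_total = len(values)  # total number of observations
    if k < 2 or n_total <= k:
        raise ValueError("need at least 2 clusters and more observations than clusters")

    grand_mean = sum(values) / n_total
    means = {g: sum(obs) / len(obs) for g, obs in groups.items()}

    # Between-cluster and within-cluster sums of squares
    ss_between = sum(len(obs) * (means[g] - grand_mean) ** 2 for g, obs in groups.items())
    ss_within = sum((v - means[g]) ** 2 for g, obs in groups.items() for v in obs)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)

    # Adjusted average cluster size, which handles unequal cluster sizes
    n0 = (n_total - sum(len(obs) ** 2 for obs in groups.values()) / n_total) / (k - 1)

    denom = ms_between + (n0 - 1) * ms_within
    icc = (ms_between - ms_within) / denom if denom > 0 else float("nan")
    f_stat = ms_between / ms_within if ms_within > 0 else float("inf")
    return icc, f_stat


# Toy data: the same six observations under two candidate partitions
x = [1, 2, 3, 11, 12, 13]
icc_good, f_good = anova_icc(x, [0, 0, 0, 1, 1, 1])  # clusters match the two value ranges
icc_bad, f_bad = anova_icc(x, [0, 1, 0, 1, 0, 1])    # clusters mix the two value ranges
print(icc_good, f_good)
print(icc_bad, f_bad)
```

Under normality and a null intraclass correlation, the F ratio follows an F distribution with k - 1 and N - k degrees of freedom, so a large value supports the presence of cluster structure for that variable. How such univariate tests are combined across relevant principal components, and how the stopping rule is defined, is specific to the paper and is not reproduced here.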