Salvati, N., Fabrizi, E., Ranalli, M. G., Chambers, R. L., Small area estimation with linked data, Journal of the Royal Statistical Society: Series B (Statistical Methodology), 2021; 83 (1): 78-107. [doi:10.1111/rssb.12401] [http://hdl.handle.net/10807/171001]
Small area estimation with linked data
Fabrizi, Enrico
2021
Abstract
Data linkage can be used to combine values of the variable of interest from a national survey with values of auxiliary variables obtained from another source, such as a population register, for use in small area estimation. However, linkage errors can induce bias when fitting regression models; moreover, they can create non-representative outliers in the linked data in addition to the presence of potential representative outliers. In this paper, we adopt a secondary analyst’s point of view, assuming that limited information is available on the linkage process, and develop small area estimators based on linear mixed models and M-quantile models to accommodate linked data containing a mix of both types of outliers. We illustrate the properties of these small area estimators, as well as estimators of their mean squared error, by means of model-based and design-based simulation experiments. We further illustrate the proposed methodology by applying it to linked data from the European Survey on Income and Living Conditions and the Italian integrated archive of economic and demographic micro data in order to obtain estimates of the average equivalised income for labour market areas in central Italy.