Arbia, G., Ghiringhelli, C., Mira, A., "Estimation of spatial econometric linear models with large datasets: How big can spatial Big Data be?", Regional Science and Urban Economics, 2019. [doi:10.1016/j.regsciurbeco.2019.01.006] [http://hdl.handle.net/10807/132731]
Estimation of spatial econometric linear models with large datasets: How big can spatial Big Data be?
Arbia, Giuseppe; Ghiringhelli, Chiara
2019
Abstract
Spatial econometrics is currently experiencing the Big Data revolution, both in terms of the volume of data and the velocity with which they accumulate. Regional data, traditionally employed in spatial econometric modeling, can be very large, with information increasingly available at very fine levels of resolution such as census tracts, local markets, town blocks, regular grids, or other small partitions of the territory. In spatial microeconometric models that refer to granular observations on individual economic agents, the number of available observations can be far larger still. This paper reports the results of a systematic simulation study on the limits of current methodologies for estimating spatial models with large datasets. In our study we simulate a Spatial Lag Model (SLM), estimate it using Maximum Likelihood (ML), Two Stage Least Squares (2SLS), and a Bayesian estimator (B), and test their performance for different sample sizes and different levels of sparsity of the weight matrix. We consider three performance indicators: computing time, required storage, and accuracy of the estimators. The results show that, using standard computing capabilities, the analysis becomes prohibitive and unreliable when the sample size exceeds 70,000, even for low levels of sparsity. This result suggests that new approaches should be introduced to analyze the big datasets that are quickly becoming the new standard in spatial econometrics.
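As a concrete illustration of the experimental design the abstract describes, the sketch below simulates an SLM, y = ρWy + Xβ + ε, with a sparse row-standardized weight matrix W and estimates (ρ, β) by 2SLS with Kelejian–Prucha-type instruments [X, WX, W²X]. This is a minimal reconstruction, not the authors' code: the sample size n, the number of neighbors per unit k (the sparsity level), the neighbor structure, and all parameter values are illustrative assumptions.

```python
# Minimal sketch of the simulation design (assumed, not the authors' code):
# simulate a Spatial Lag Model y = rho*W*y + X*beta + eps with a sparse
# row-standardized W, then estimate (rho, beta) by Two Stage Least Squares.
import time
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

rng = np.random.default_rng(0)
n, k = 5_000, 10                        # assumed sample size and sparsity level
rho, beta = 0.5, np.array([1.0, -2.0])  # assumed true parameters

# Sparse weight matrix: k neighbors per row among nearby indices (mimicking
# spatial locality), never the unit itself, each with weight 1/k, so every
# row sums to one (row-standardized by construction).
rows = np.repeat(np.arange(n), k)
cols = (rows + rng.integers(1, 51, size=n * k)) % n
W = sp.csr_matrix((np.full(n * k, 1.0 / k), (rows, cols)), shape=(n, n))

# Simulate the SLM via its reduced form: solve (I - rho*W) y = X*beta + eps.
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = spsolve((sp.identity(n) - rho * W).tocsc(), X @ beta + rng.normal(size=n))

# 2SLS: regress y on Z = [Wy, X] with instruments H = [X, Wx, W^2*x].
t0 = time.perf_counter()
Z = np.column_stack([W @ y, X])
H = np.column_stack([X, W @ X[:, 1], W @ (W @ X[:, 1])])
Zhat = H @ np.linalg.solve(H.T @ H, H.T @ Z)     # first-stage projection
delta = np.linalg.solve(Zhat.T @ Z, Zhat.T @ y)  # (rho_hat, beta_hat)
print(f"rho_hat={delta[0]:.3f}  beta_hat={delta[1:].round(3)}  "
      f"2SLS time={time.perf_counter() - t0:.3f}s")
```

Re-running this sketch for growing n and k mimics the computing-time indicator studied in the paper; the ML estimator additionally requires evaluating log|I − ρW| at each likelihood evaluation, which is typically the computational bottleneck at large n.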
| File | Type | License | Size | Format | Availability |
|---|---|---|---|---|---|
| 118. arbia ghiringhelli mira RSUE.pdf | Publisher's version (PDF) | Not specified | 4.45 MB | Unknown | Not available |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.