IRIS UniCatt

Optimal Design of Experiments and Model-based survey sampling in Big-Data

Deldossi, Laura (first author); Tommasi, C. (second author)

2019

Abstract

Big Data are generally huge quantities of digital information accrued automatically and/or merged from several sources, and they rarely result from properly planned population surveys. A Big Dataset is herein conceived as a collection of information concerning a finite population. Since the analysis of an entire Big Dataset can require enormous computational effort, we suggest selecting a sample of observations and using this sampling information to achieve the inferential goal. Instead of the design-based survey sampling approach (which relates to the estimation of summary finite population measures, such as means, totals and proportions), we consider the model-based sampling approach, which involves inference about the parameters of a super-population model. This model is assumed to have generated the finite population values, i.e. the Big Dataset. Given a super-population model, we can apply the theory of optimal design to draw a sample from the Big Dataset that contains most of the information about the unknown parameters of interest. In addition, since a Big Dataset might provide poor information despite its size, we use the definition of the efficiency of a design to suggest a device for measuring the quality of the Big Data.
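The abstract does not state which super-population model, optimality criterion or selection algorithm the authors use. As a rough illustration only, the sketch below assumes a simple linear regression model, the D-optimality criterion and a greedy selection rule; the function names (`greedy_d_subsample`, `d_efficiency`) and the simulated data are hypothetical. It selects a subsample that approximately maximises the determinant of the information matrix, then reports the D-efficiency of a simple random sample relative to that subsample.

```python
import numpy as np

def information_matrix(X):
    """Per-observation information matrix X'X / n for a linear model."""
    return X.T @ X / X.shape[0]

def greedy_d_subsample(X, n, ridge=1e-3):
    """Greedily pick n distinct rows of X to enlarge det(X_s' X_s).

    Assumption: a greedy rule stands in for a proper optimal-design
    algorithm; it is not guaranteed to reach the D-optimal subsample.
    """
    N, p = X.shape
    inv = np.eye(p) / ridge              # inverse of the starting matrix ridge * I
    available = np.ones(N, dtype=bool)
    chosen = []
    for _ in range(n):
        # Matrix determinant lemma: adding row x multiplies det by 1 + x' M^{-1} x
        gain = np.sum((X @ inv) * X, axis=1)
        gain[~available] = -np.inf       # each unit can be sampled only once
        i = int(np.argmax(gain))
        chosen.append(i)
        available[i] = False
        # Sherman-Morrison update of the inverse after adding row i
        v = inv @ X[i]
        inv -= np.outer(v, v) / (1.0 + X[i] @ v)
    return np.array(chosen)

def d_efficiency(X_a, X_b):
    """D-efficiency of design a relative to design b: (det M_a / det M_b)^(1/p)."""
    p = X_a.shape[1]
    _, logdet_a = np.linalg.slogdet(information_matrix(X_a))
    _, logdet_b = np.linalg.slogdet(information_matrix(X_b))
    return float(np.exp((logdet_a - logdet_b) / p))

# Simulated "Big Dataset" of covariates (hypothetical data)
rng = np.random.default_rng(0)
N, n = 5000, 50
x = rng.normal(size=N)
X = np.column_stack([np.ones(N), x])     # assumed model: E[y] = b0 + b1 * x

idx_opt = greedy_d_subsample(X, n)                 # design-guided subsample
idx_srs = rng.choice(N, size=n, replace=False)     # simple random sample
eff = d_efficiency(X[idx_srs], X[idx_opt])
print(f"D-efficiency of the random sample relative to the greedy subsample: {eff:.3f}")
```

A value well below 1 indicates that a random sample of the same size carries less information about the regression parameters than the design-guided subsample. Under these assumptions, the same ratio could be computed for the full dataset against a theoretical optimal design to gauge how informative the data are for the chosen model.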

Publication year: 2019
Content language: English
Title of the proceedings volume: Programme and Abstracts, 19th Annual ENBIS Conference, Budapest, 2-4 September 2019
Event name: 19th Annual ENBIS Conference
Event location: Budapest (Hungary)
Event start date: 2 September 2019
Event end date: 4 September 2019
ISBN of the volume: 9789634891468
Editors: Jens Bischoff, Agnes Backhausz, Rossella Berni, Sonja Kuhnt, Lluís Marco-Almagro, Antonio Pievatolo, Marco P. Seabra dos Reis and Murat Caner Testik
Citation: Deldossi, L., Tommasi, C., Optimal Design of Experiments and Model-based survey sampling in Big-Data, in Programme and Abstracts, 19th Annual ENBIS Conference, Budapest, 2-4 September 2019 (Budapest, Hungary, 2-4 September 2019), edited by Jens Bischoff, Agnes Backhausz, Rossella Berni, Sonja Kuhnt, Lluís Marco-Almagro, Antonio Pievatolo, Marco P. Seabra dos Reis and Murat Caner Testik, Budapest, Hungary, 2019, p. 37 [http://hdl.handle.net/10807/147206]
Appears in the categories: Proceedings of conferences, congresses, study days, etc., and workshops (in volume)

Files in this record:

There are no files associated with this record.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10807/147206
