Data heterogeneity and irregularity are key characteristics of big data applications that often overwhelm existing software and hardware infrastructures. In such a context, the flexibility and elasticity provided by the cloud computing paradigm offer a natural approach to cost-effectively adapting the allocated resources to the application's current needs. Yet the same characteristics pose extra challenges for predicting the performance of cloud-based big data applications, a central step in proper management and planning. This paper explores two modeling approaches for performance prediction of cloud-based big data applications. We evaluate a queuing-based analytical model and a novel fast ad-hoc simulator in various scenarios based on different applications and infrastructure setups. Our results show that our approaches can predict average application execution times with a relative error of 26% in the very worst case and about 12% on average. Moreover, our simulator provides performance estimates 70 times faster than state-of-the-art simulation tools.
Ardagna, D., Barbierato, E., Evangelinou, A., Gianniti, E., Gribaudo, M., Pinto, T. B. M., Guimarães, A., Couto Da Silva, A. P., Almeida, J. M., Performance Prediction of Cloud-Based Big Data Applications, <<THE JOURNAL OF SUPERCOMPUTING>>, 2021; (2): 192-199. [doi:10.1145/3184407.3184420] [http://hdl.handle.net/10807/202851]
Performance Prediction of Cloud-Based Big Data Applications
Barbierato, Enrico
Writing – Original Draft Preparation
2018