A Data Lake is a repository that stores heterogeneous data, both structured and unstructured, in a flexible organization that lets users dynamically reorganize and integrate the information they need for a given query or analysis. The success of an implementation depends on many factors, notably the distributed storage, the kind of media deployed, the data access protocols, and the network used. Design flaws, however, may become evident only in a later phase of system development, causing significant delays in complex projects. This article applies queuing network modeling to detect significant issues, such as bottlenecks and performance degradation, under different workload scenarios.
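As a rough illustration of how a queuing network model can expose a bottleneck, the sketch below applies the standard operational laws (the utilization law and the bottleneck bound) to a small open network. It is not the model used in the article: the station names, service demands, and arrival rate are hypothetical values chosen only for illustration, and each station is assumed to behave as a single-server queue with exponential service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Station:
    name: str
    demand: float  # total service demand per request, in seconds (hypothetical)


def analyze(stations, arrival_rate):
    """Apply operational laws to an open queuing network.

    Returns the bottleneck station name, the saturation arrival rate,
    per-station utilizations, and the mean end-to-end response time.
    """
    # The station with the largest service demand saturates first.
    bottleneck = max(stations, key=lambda s: s.demand)
    max_rate = 1.0 / bottleneck.demand  # bottleneck bound on throughput
    if arrival_rate >= max_rate:
        raise ValueError("arrival rate saturates the bottleneck station")

    # Utilization law: U_k = lambda * D_k
    utilization = {s.name: arrival_rate * s.demand for s in stations}

    # Residence time at a single-server station: R_k = D_k / (1 - U_k)
    response_time = sum(s.demand / (1.0 - utilization[s.name]) for s in stations)
    return bottleneck.name, max_rate, utilization, response_time


# Hypothetical data lake components and demands (assumptions, not from the article).
STATIONS = [
    Station("network", 0.02),
    Station("storage", 0.05),
    Station("compute", 0.01),
]

bottleneck, max_rate, utilization, response_time = analyze(STATIONS, arrival_rate=10.0)
print(f"bottleneck={bottleneck}, saturation rate={max_rate:.1f} req/s")
print(f"mean response time={response_time:.4f} s")
```

Re-running `analyze` with increasing arrival rates shows how the response time grows as the bottleneck station approaches saturation, which is the kind of performance degradation a workload scenario study looks for.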

Barbierato, E., Gribaudo, M., Serazzi, G., Tanca, L., Performance Evaluation of a Data Lake Architecture via Modeling Techniques, in European Workshop on Performance Engineering, International Conference on Analytical and Stochastic Modeling Techniques and Applications (Tokyo, 9-10 December 2021), Springer Science and Business Media Deutschland GmbH, Berlin 2021, Lecture Notes in Computer Science, vol. 13104, pp. 115-130. [10.1007/978-3-030-91825-5_7] [http://hdl.handle.net/10807/202854]

Performance Evaluation of a Data Lake Architecture via Modeling Techniques

Barbierato, E. (first author; Writing – Original Draft Preparation)
2021

2021
English
European Workshop on Performance Engineering International Conference on Analytical and Stochastic Modeling Techniques and Applications
17th European Performance Engineering Workshop, EPEW 2021, and the 26th International Conference on Analytical and Stochastic Modelling Techniques and Applications, ASMTA 2021
Tokyo
9 Dec 2021
10 Dec 2021
978-3-030-91825-5
Springer Science and Business Media Deutschland GmbH

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10807/202854