dc.contributor
Universitat Politècnica de Catalunya. Doctorat en Arquitectura de Computadors
dc.contributor
Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
dc.contributor
Barcelona Supercomputing Center
dc.contributor
Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
dc.contributor.author
García Calatrava, Carlos
dc.contributor.author
Becerra Fontal, Yolanda
dc.contributor.author
Cucchietti, Fernando
dc.date.issued
2022-08-18
dc.identifier
Garcia, C.; Becerra, Y.; Cucchietti, F. A holistic scalability strategy for time series databases following cascading polyglot persistence. "Big data and cognitive computing", 18 August 2022, vol. 6, no. 3, article 86, p. 1-30.
dc.identifier
https://hdl.handle.net/2117/372829
dc.identifier
10.3390/bdcc6030086
dc.description.abstract
Time series databases aim to handle large amounts of data quickly, both when ingesting new data into the system and when retrieving it later on. However, depending on the scenario in which these databases operate, reducing the amount of requested resources becomes a further requirement. With this goal in mind, NagareDB and its Cascading Polyglot Persistence approach were born. They were intended not just to provide a fast time series solution, but also to strike a good cost-efficiency balance. However, although they provided outstanding results, they lacked a natural way of scaling out in a cluster fashion. Consequently, monolithic approaches could extract the maximum value from the solution, but distributed ones had to rely on general scalability approaches. In this research, we proposed a holistic approach specially tailored for databases following Cascading Polyglot Persistence to further maximize their inherent resource-saving goals. The proposed approach reduced the cluster size by 33% in a setup with just three ingestion nodes, and by up to 50% in a setup with 10 ingestion nodes. Moreover, the evaluation shows that our scaling method is able to provide efficient cluster growth, offering scalability speedups greater than 85% in comparison to a theoretically 100% perfect scaling, while also ensuring data safety via data replication.
dc.description.abstract
This research was partly supported by the Grant Agreement No. 857191, by the Spanish Ministry of Science and Innovation (contract PID2019-107255GB) and by the Generalitat de Catalunya (contract 2017-SGR-1414).
dc.description.abstract
Peer Reviewed
dc.description.abstract
Postprint (published version)
dc.format
application/pdf
dc.publisher
Multidisciplinary Digital Publishing Institute (MDPI)
dc.relation
https://www.mdpi.com/2504-2289/6/3/86
dc.relation
info:eu-repo/grantAgreement/EC/H2020/857191/EU/Distributed Digital Twins for industrial SMEs: a big-data platform/IoTwins
dc.rights
http://creativecommons.org/licenses/by/4.0/
dc.rights
Attribution 4.0 International
dc.subject
UPC subject areas::Computer science::Information systems::Information storage and retrieval
dc.subject
Time-series analysis -- Data processing
dc.subject
Time series database
dc.subject
Cascading polyglot persistence
dc.subject
Resource-saving approach
dc.subject
Near real time
dc.subject
Time series -- Computer science
dc.title
A holistic scalability strategy for time series databases following cascading polyglot persistence