SETA: A suite-independent analytical framework



To access the full text documents, please follow this link: http://hdl.handle.net/2117/106397

Title:	SETA: A suite-independent analytical framework
Author:	González Alonso, Pedro Javier
Other authors:	Universitat Politècnica de Catalunya. Departament d'Enginyeria de Serveis i Sistemes d'Informació; Abelló Gamazo, Alberto; Romero Moral, Óscar
Abstract:	Nowadays, business analytics users need agile processes that span from selecting relevant data in raw data sources to generating data structures ready to serve as input for OLAP, Data Mining and/or other analytical tools. However, the wide range of analytical needs and the increasing need for adaptive business strategies discourage the use of existing ‘All-In-One’ suites (i.e., end-to-end solutions from a single vendor). Instead, an agile, suite-independent approach is advisable, both to boost users’ independence from a specific vendor and to gain the analytical capabilities enabled by combining several suites and tools according to the users’ needs. In this thesis we present and develop ‘SETA’, a suite-independent agile analytical framework, by proposing a novel approach that combines rich metadata definition and automation components. As proof of validity, we instantiate the developed framework in a real-world project for the WHO Chagas Programme. This thesis makes two main contributions. First, an approach to store and integrate a set of heterogeneous data sources in a flexible data store that sits at an intermediate point between classical Data Warehouse (DW) approaches and recent Data Lake strategies. We argue that classical DW systems are too rigid to accommodate agile analytical pipelines, whereas Data Lakes and Big Data technologies are not suitable for many of today’s organizations. Thus, a novel approach combining the two is presented. Second, a rich definitional system to represent 1) the data components at the Source, Global Schema and Domain levels, 2) the data mappings between these levels and 3) the final user analytical requirements. This definitional system provides a flexible view of the data schema at different levels and enables the automated generation of the target data schemas and of the ETL processes that feed them.
Subject(s):	-UPC subject areas::Computer science -Data warehousing -Ontology -Data Lake -metamodel -non SQL -document stores -semantic awareness -multidimensional modeling -OLAP -ETL -OWL -Data manager
Document type:	Research/Master Thesis
Published by:	Universitat Politècnica de Catalunya