High-performance computing techniques for efficient training and inference of AI models

Author

Gumina, Mattia

Other authors

Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors

Herrero Zaragoza, Josep Ramon

Publication date

2025-07-03



Abstract

This thesis investigates the computational efficiency and scalability of biologically inspired neural network architectures, namely Liquid Time-Constant (LTC) and Closed-form Continuous-time (CfC) models, under varying hardware and training configurations. While traditional sequence models such as RNNs, LSTMs, and Transformers have demonstrated strong performance on temporal tasks, they often suffer from high computational costs and limited scalability. LTC and CfC networks have emerged as promising alternatives, owing to their dynamic temporal modeling capabilities and lower parameter counts, but their real-world training efficiency remains underexplored. To evaluate the training behavior and parallelization potential of these models, experiments were conducted on three datasets of increasing complexity: a synthetic sine-cosine signal, the Human Activity Recognition (HAR) dataset, and the Metro Interstate Traffic dataset. Both LTC and CfC models were trained under various conditions, including single-CPU, single-GPU, and distributed multi-GPU environments using PyTorch's Distributed Data Parallel (DDP) framework. Metrics such as training time, accuracy, loss, GPU utilization, and memory consumption were systematically collected and analyzed. The results show that CfC models consistently train faster than LTC models across all scenarios while achieving comparable or higher accuracy, particularly on the larger datasets. Moreover, distributed training significantly reduces training time for both models, with optimal gains observed at two GPUs; beyond that point, communication overhead begins to limit scalability. These findings highlight the practical trade-offs between architectural complexity and parallel efficiency, and support the use of biologically inspired models in high-performance AI pipelines.
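The training setup described in the abstract can be illustrated with a minimal sketch of PyTorch's Distributed Data Parallel (DDP) API. This is not the thesis code: the model (a plain `nn.RNN` standing in for an LTC/CfC cell), the sine-cosine toy batch, and all hyperparameters are illustrative assumptions. It runs as a single process on CPU with the `gloo` backend, where DDP's gradient all-reduce is trivial but the API flow (init, wrap, backward, step) matches the multi-GPU case.

```python
# Minimal single-process DDP sketch on CPU (gloo backend).
# Model, data, and hyperparameters are placeholders, not the thesis code.
import os
import torch
import torch.distributed as dist
import torch.nn as nn


def train_step():
    # Single-process "distributed" setup; with N GPUs, each rank would
    # run this with its own rank/world_size and a CUDA device.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("gloo", rank=0, world_size=1)

    # Placeholder recurrent model standing in for an LTC/CfC cell.
    model = nn.RNN(input_size=2, hidden_size=2, batch_first=True)
    ddp_model = nn.parallel.DistributedDataParallel(model)

    optimizer = torch.optim.Adam(ddp_model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()

    # Toy sine-cosine sequence batch, shape (batch, time, features).
    t = torch.linspace(0.0, 6.28, 32)
    x = torch.stack([torch.sin(t), torch.cos(t)], dim=-1).unsqueeze(0)

    out, _ = ddp_model(x)
    loss = loss_fn(out[:, :-1], x[:, 1:])  # next-step prediction
    loss.backward()  # DDP all-reduces gradients across ranks here
    optimizer.step()

    dist.destroy_process_group()
    return loss.item()


if __name__ == "__main__":
    print(train_step())
```

In the multi-GPU experiments the abstract reports, the same wrapping would be launched once per GPU (e.g. via `torchrun`), with a `DistributedSampler` sharding each dataset across ranks; the gradient all-reduce during `backward()` is the communication overhead that erodes scaling beyond two GPUs.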

Document type

Master's thesis

Language

English

Published by

Universitat Politècnica de Catalunya


Rights

Open Access
