Automatic call classification using machine learning and advanced NLP approaches

Vives Garcia Del Real, José-Nicolás

To access the full-text documents, please follow this link: https://hdl.handle.net/2117/459338

Author

Vives Garcia Del Real, José-Nicolás

Other authors

Universitat Politècnica de Catalunya. Departament d'Enginyeria Electrònica

Moreno Eguilaz, Juan Manuel

Publication date

2026-01-26

Abstract

This Bachelor’s Thesis addresses the development, training, and comparison of several Transformer-based models for classifying pharmaceutical call transcriptions according to whether or not the patient reports an adverse event (AE). The work aims to contribute to the automation of this process through natural language processing (NLP) and machine learning (ML). The project uses a dataset provided by a pharmaceutical company, consisting of a limited number of transcribed calls. These transcriptions are long and noisy, and they reflect real conversational language, which introduces additional challenges compared to more structured text sources. In this context, the proposed methodology defines a complete pipeline for data preparation and model training that treats the task as a supervised binary classification problem. The study compares several BERT-based architectures, including BERT base, BERT large, DistilBERT, RoBERTa, and ALBERT, with the goal of identifying which configuration performs best in this scenario. In addition, the best Transformer-based model is compared with the best classical ML approach developed in parallel for the same problem, in order to assess which paradigm is more effective for AE detection in conversations. The final model is not selected on a single metric, but on a multidimensional criterion that combines several aspects relevant to a safety-critical application: overall model effectiveness, the ability to detect AE cases with high sensitivity, reliable generalization to unseen calls, and practical feasibility under computational cost constraints. The thesis presents the pipeline and the selected configuration that is most suitable for classifying and detecting adverse events in telephone call transcripts.
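
The multidimensional selection criterion described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's actual procedure: the abstract does not state which metrics, thresholds, or ranking rule were used. The `Candidate` fields, the thresholds in `select`, the model names, and every number in `EXAMPLE` are hypothetical placeholders chosen only to show one way that effectiveness, AE sensitivity, generalization, and computational cost could be combined.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """Validation summary for one model (all fields are hypothetical)."""
    name: str
    tp: int  # AE calls correctly flagged
    fp: int  # non-AE calls wrongly flagged
    fn: int  # AE calls missed
    tn: int  # non-AE calls correctly passed
    train_f1: float          # F1 on training data, used to gauge the generalization gap
    seconds_per_call: float  # rough proxy for inference cost


def sensitivity(c: Candidate) -> float:
    """Recall on the AE class: TP / (TP + FN)."""
    return c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0


def precision(c: Candidate) -> float:
    """Share of flagged calls that really contain an AE: TP / (TP + FP)."""
    return c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0


def f1(c: Candidate) -> float:
    """Harmonic mean of precision and sensitivity on held-out data."""
    p, r = precision(c), sensitivity(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


def select(candidates: List[Candidate],
           min_sensitivity: float = 0.90,   # assumed threshold
           max_gap: float = 0.10,           # assumed train/validation F1 gap
           max_seconds: float = 1.0         # assumed cost budget
           ) -> Optional[Candidate]:
    """Drop models that violate any hard constraint, then rank the rest by F1."""
    feasible = [
        c for c in candidates
        if sensitivity(c) >= min_sensitivity
        and c.train_f1 - f1(c) <= max_gap
        and c.seconds_per_call <= max_seconds
    ]
    return max(feasible, key=f1) if feasible else None


# Made-up numbers for illustration only; these are not results from the thesis.
EXAMPLE = [
    Candidate("model_a", tp=19, fp=5, fn=1, tn=75, train_f1=0.90, seconds_per_call=0.4),
    Candidate("model_b", tp=15, fp=1, fn=5, tn=79, train_f1=0.88, seconds_per_call=0.2),
    Candidate("model_c", tp=19, fp=2, fn=1, tn=78, train_f1=0.95, seconds_per_call=2.5),
]

best = select(EXAMPLE)
```

With these placeholder values, `model_b` is excluded for missing too many AE calls and `model_c` for exceeding the cost budget, so `model_a` is selected even though `model_c` has the higher F1. Treating sensitivity and cost as hard constraints rather than weighted terms is one possible design for a safety-critical setting; a weighted score would be an equally plausible reading of the abstract.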

Document type

Bachelor thesis

Language

English

Subjects and keywords

UPC subject areas::Computer science::Artificial intelligence; Machine learning; Natural language processing (Computer science); Artificial intelligence--Medical applications

Published by

Universitat Politècnica de Catalunya

Rights

Open Access

This item appears in the following collection(s)

Academic works [82483]
