Universitat Politècnica de Catalunya. Departament de Ciències de la Computació
Opground
Turmo Borras, Jorge
Salord Quetglas, Lluís
2022-06-28
The purpose of this project is to design and implement a conversational agent that can conduct a prescreening interview with Opground users. A finite-state, template-based, task-oriented dialogue system was chosen to comply with Opground's requirements. The system is built on top of ConvLab-2 because of its modularity and the ability to customize the agent's components. To improve the user experience when answering the interview questions, a proof-of-concept (PoC) feedback generation model, Answer2Feedback, is proposed. A search for available datasets found none that matched the task objective, so a single-turn job interview dataset was created from Opground's interviews. A subset of this dataset, cleaned and restricted to Spanish questions about spoken languages in order to test the concept, was then manually annotated with feedback. The annotated data was arranged into two datasets: one with annotations from a single annotator and one with annotations from two annotators, the latter including the annotations of the former. Two Spanish pre-trained language models (t5 and mt5) were used in the experiments; the mt5 model had already been fine-tuned for summarization. The input fed to the model was also varied: in the first case, only the answer given by the candidate; in the second, the answer plus the entities that are important from a business perspective, extracted with a Named Entity Recognition (NER) model. Eight fine-tuned models (model: t5 or mt5; dataset: one-annotator or two-annotator; input: without or with entities) were therefore compared on feedback generation. The models were evaluated with the extrinsic metrics BLEU and ROUGE and the intrinsic metric perplexity. The evaluation results were ambiguous, varying with the metric used.
Finally, perplexity was selected as the metric for evaluating the quality of feedback generation, because the PoC results indicated a correlation between perplexity and the naturalness of the generated feedback. In conclusion, models fine-tuned on more instances and given entities in the input generate better feedback.
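As an illustration of the perplexity metric named in the abstract, the following minimal Python sketch computes perplexity as the exponential of the average negative log-likelihood per token. The function name `perplexity` and the example log-probabilities are assumptions made for illustration; the abstract does not state how the thesis computed the metric or which tooling it used.

```python
import math
from typing import Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Perplexity of one sequence from natural-log token probabilities.

    Hypothetical helper: in practice the log-probabilities would come
    from the fine-tuned model scoring each token of a feedback sentence.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    # Average negative log-likelihood per token, then exponentiate.
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)


# Made-up values: a model that assigns each token probability 0.5
# versus one that assigns each token probability 0.1.
confident = [math.log(0.5)] * 4
uncertain = [math.log(0.1)] * 4

print(perplexity(confident))
print(perplexity(uncertain))
```

Lower values mean the model found the token sequence less surprising, which is consistent with the abstract's observation that perplexity tracked the naturalness of the generated feedback.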
Master thesis
English
UPC subject areas::Telecommunications engineering::Signal processing::Speech and acoustic signal processing; Natural language processing (Computer science); conversational agent; dialogue system; task-oriented; fine-tuning; text generation
Universitat Politècnica de Catalunya
Restricted access - author's decision
Academic works [82541]