Capturing analytical intents from text

Other authors

Universitat Politècnica de Catalunya. Departament d'Enginyeria de Serveis i Sistemes d'Informació

Universitat Politècnica de Catalunya. Intelligent Data Science and Artificial Intelligence

Universitat Politècnica de Catalunya. inSSIDE - integrated Software, Services, Information and Data Engineering

Publication date

2024

Abstract

The ability to extract valuable information from data is crucial for organizations and individuals who want to remain competitive in a constantly evolving data-driven environment. However, some of them lack the skills required to appropriately leverage existing data analytics tools and methods. This problem is aggravated when users are domain experts who are completely unfamiliar with data analytics terminology, because existing assistant tools, such as AutoML or Intelligent Discovery Assistants, require them to state their analytical intent (i.e., the type of data analysis they want to perform). To address this problem, we propose to capture the underlying analytical intent from textual problem descriptions by leveraging Large Language Models (LLMs). To this end, we propose a hierarchical categorization of analytical intents, along with a data collection methodology to obtain analytical problem descriptions for all of them, in order to validate different approaches that aim to extract such intents from text. Next, we compare the performance of state-of-the-art approaches with that of LLMs, and then study the performance of different LLMs based on their characteristics and the impact of the source of validation data. Finally, we develop a prototype to showcase how our method could interact with existing AutoML systems.
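
As an illustration of the idea only, the following minimal sketch shows how a textual problem description might be turned into a classification prompt over a hierarchy of intents, and how a model's free-text answer might be mapped back onto that hierarchy. The taxonomy entries, the prompt wording, and the names `INTENT_TAXONOMY`, `build_prompt`, and `parse_intent` are assumptions made for this example; they are not taken from the paper, and no particular LLM or API is implied.

```python
# Hypothetical two-level taxonomy; the categories used in the paper may differ.
INTENT_TAXONOMY = {
    "prediction": ["classification", "regression"],
    "description": ["clustering", "anomaly detection"],
}


def leaf_intents(taxonomy):
    """Flatten the hierarchy into its leaf intents, preserving order."""
    return [leaf for leaves in taxonomy.values() for leaf in leaves]


def build_prompt(description, taxonomy=INTENT_TAXONOMY):
    """Build a prompt asking a model to pick exactly one leaf intent."""
    options = ", ".join(leaf_intents(taxonomy))
    return (
        "Classify the analytical intent of the problem below.\n"
        f"Answer with exactly one of: {options}.\n\n"
        f"Problem: {description.strip()}\n"
        "Intent:"
    )


def parse_intent(llm_answer, taxonomy=INTENT_TAXONOMY):
    """Map free-text model output to (parent, leaf), or None if nothing matches."""
    answer = llm_answer.strip().lower()
    for parent, leaves in taxonomy.items():
        for leaf in leaves:
            if leaf in answer:
                return parent, leaf
    return None
```

In such a setup, the string returned by `build_prompt` would be sent to whichever model is being evaluated, and the pair returned by `parse_intent` could then be passed on to a downstream tool such as an AutoML system.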


Gerard Pons is supported by the EU’s Horizon Programme call, under Grant Agreement No. 101093164 (ExtremeXP), and Besim Bilalli is partially supported by the DOGO4ML project, funded by the Spanish Ministerio de Ciencia e Innovación under the funding scheme PID2020-117191RB-I00/AEI/10.13039/501100011033.


Peer Reviewed


Postprint (author's final draft)

Document Type

Conference report

Language

English

Publisher

Springer

Related items

https://link.springer.com/chapter/10.1007/978-3-031-70421-5_8

info:eu-repo/grantAgreement/EC/HE/101093164/EU/EXPeriment driven and user eXPerience oriented analytics for eXtremely Precise outcomes and decisions/ExtremeXP

info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2020-117191RB-I00/ES/DESARROLLO, OPERATIVA Y GOBERNANZA DE DATOS PARA SISTEMAS SOFTWARE BASADOS EN APRENDIZAJE AUTOMATICO/

Rights

Open Access
