Jagged competencies: Measuring the reliability of generative AI in academic research

Thomas, Llewellyn; Romasanta, Angelo Kenneth; Pujol Priego, Laia

DOI: https://doi.org/10.1016/j.jbusres.2025.115804

To access the full text documents, please follow this link: https://hdl.handle.net/20.500.14342/6069

Author

Thomas, Llewellyn

Romasanta, Angelo Kenneth

Pujol Priego, Laia

Other authors

Universitat Ramon Llull. Esade

Publication date

2026-01

Abstract

Large Language Models (LLMs) are increasingly viewed as a valuable tool for academic research. While LLMs have some benefits, a ‘crisis of replicability’ in management scholarship militates against unrestrained use. In this paper we investigate the reproducibility of LLM analyses. We analyze three LLMs (ChatGPT, Claude and Mistral) over fifteen weeks, testing their consistency, their accuracy, and the interaction between the two, using the same prompts on the same data corpus. While our results demonstrate significant variations in reliability and consistency across the three LLMs, we also show that LLMs can exhibit deterministic and reliable behavior under specific, well-defined constraints. We argue that replicable LLM-based research will rely on understanding and validating the task-specific operational boundaries of the LLM. To ensure the responsible integration of LLMs into management research, we highlight a need for robust frameworks, transparency, ethical guidelines, and ongoing evaluation. We conclude with actionable guidance for management researchers.

Document Type

Article

Document version

Published version

Language

English

Subjects and keywords

Generative AI; LLM; Replication; Reproducibility; Consistency; Accuracy

Pages

14 p.

Publisher

Elsevier Inc.

Published in

Journal of Business Research, Vol. 203, 115804

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International
