Institut Català de la Salut
[Li H] Shanghai University of Sport, Shanghai, China. [Jiang Z] School of Clinical Medicine, Beijing Tsinghua Changgung Hospital, Tsinghua Medicine, Tsinghua University, Beijing, China. Beijing Visual Science and Translational Eye Research Institute (BERI), Tsinghua Medicine, Tsinghua University, Beijing, China. [Guan Z, Bao Y, Liu Y, Hu T] Shanghai Belt and Road International Joint Laboratory of Intelligent Prevention and Treatment for Metabolic Diseases, Department of Computer Science and Engineering, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Department of Endocrinology and Metabolism, Shanghai Sixth People’s Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai Diabetes Institute, Shanghai Clinical Center for Diabetes, Shanghai, China. [Simó R] Centro de Investigación Biomédica en Red de Diabetes y Enfermedades Metabólicas Asociadas (CIBERDEM), Instituto de Salud Carlos III, Madrid, Spain. Grup de Recerca en Diabetis i Metabolisme, Vall d’Hebron Institut de Recerca (VHIR), Barcelona, Spain. Universitat Autònoma de Barcelona, Bellaterra, Spain
Vall d'Hebron Barcelona Hospital Campus
2025-04-01T11:59:34Z
2025-04-01T11:59:34Z
2025-03-30
Diabetes training; Large language models; Primary diabetes care
Diabetes poses a considerable global health challenge, and levels of diabetes knowledge vary among healthcare professionals, which highlights the importance of diabetes training. Large Language Models (LLMs) provide new insights into diabetes training, but their performance on diabetes-related queries remains uncertain, especially in languages other than English, such as Chinese. We first evaluated the performance of ten LLMs (ChatGPT-3.5, ChatGPT-4.0, Google Bard, LLaMA-7B, LLaMA2-7B, Baidu ERNIE Bot, Ali Tongyi Qianwen, MedGPT, HuatuoGPT, and Chinese LLaMA2-7B) on diabetes-related queries, based on the Chinese National Certificate Examination for Primary Diabetes Care in China (NCE-CPDC) and the English Specialty Certificate Examination in Endocrinology and Diabetes of Membership of the Royal College of Physicians of the United Kingdom. Second, we assessed the training of primary care physicians (PCPs) without and with the assistance of ChatGPT-4.0 in the NCE-CPDC examination to ascertain the reliability of LLMs as medical assistants. We found that ChatGPT-4.0 outperformed the other LLMs in the English examination, achieving a passing accuracy of 62.50%, which was significantly higher than that of Google Bard, LLaMA-7B, and LLaMA2-7B. For the NCE-CPDC examination, ChatGPT-4.0, Ali Tongyi Qianwen, Baidu ERNIE Bot, Google Bard, MedGPT, and ChatGPT-3.5 passed, whereas LLaMA2-7B, HuatuoGPT, Chinese LLaMA2-7B, and LLaMA-7B failed. ChatGPT-4.0 (84.82%) surpassed all PCPs and assisted most PCPs in the NCE-CPDC examination (improving their scores by 1%–6.13%). In summary, LLMs demonstrated outstanding competence on diabetes-related questions in both Chinese and English, and hold great potential to assist future diabetes training for physicians globally.
This work was supported by the Noncommunicable Chronic Diseases-National Science and Technology Major Project (2023ZD0509202 and 2023ZD0509201), the National Natural Science Foundation of China (62077037, 8238810007, 82022012, 81870598, 62272298 and 82388101), the National Key Research and Development Program of China (2022YFC2502800 and 2022YFC2407000), the Shanghai Municipal Key Clinical Specialty, the Shanghai Research Center for Endocrine and Metabolic Diseases (2022ZZ01002), the Chinese Academy of Engineering (2022-XY-08), the Innovative Research Team of High-level Local Universities in Shanghai (SHSMU-ZDCX20212700) and the Beijing Natural Science Foundation (IS23096).
Article
Published version
English
Diabetes; Deep learning; Medical education; Primary care; PHENOMENA AND PROCESSES::Mathematical Concepts::Algorithms::Artificial Intelligence::Machine Learning::Deep Learning; DISEASES::Nutritional and Metabolic Diseases::Metabolic Diseases::Glucose Metabolism Disorders::Diabetes Mellitus; NAMED GROUPS::Persons::Occupational Groups::Health Personnel::Physicians::Physicians, Primary Care; ANTHROPOLOGY, EDUCATION, SOCIOLOGY, AND SOCIAL PHENOMENA::Education::Inservice Training
Elsevier
Science Bulletin;70(6)
https://doi.org/10.1016/j.scib.2025.01.034
Attribution 4.0 International
http://creativecommons.org/licenses/by/4.0/
Articles científics - VHIR [1655]