Effects of interacting with a large language model compared with a human coach on the clinical diagnostic process and outcomes among fourth-year medical students: study protocol for a prospective, randomised experiment using patient vignettes.

Kämmer, Juliane E; Hautz, Wolf E; Krummrey, Gert; Sauter, Thomas C; Penders, Dorothea; Birrenbach, Tanja; Bienefeld, Nadine (2024). Effects of interacting with a large language model compared with a human coach on the clinical diagnostic process and outcomes among fourth-year medical students: study protocol for a prospective, randomised experiment using patient vignettes. BMJ Open, 14(7). BMJ Publishing Group. doi:10.1136/bmjopen-2024-087469

e087469.full.pdf - Published Version
Available under License Creative Commons: Attribution-Noncommercial (CC-BY-NC).


INTRODUCTION

Versatile large language models (LLMs) have the potential to augment diagnostic decision-making by assisting diagnosticians, thanks to their ability to engage in open-ended, natural conversations and their access to comprehensive knowledge. Yet the novelty of LLMs in diagnostic decision-making introduces uncertainties regarding their impact. Clinicians unfamiliar with the use of LLMs in their professional context may rely on their general attitudes towards LLMs, which can hinder thoughtful use and critical evaluation of LLM input and lead either to over-reliance and a lack of critical thinking or to an unwillingness to use LLMs as diagnostic aids. To address these concerns, this study examines how the diagnostic process and outcomes are influenced by interacting with an LLM compared with a human coach, and by prior training versus no training for interacting with either of these 'coaches'. Our findings aim to illuminate the potential benefits and risks of employing artificial intelligence (AI) in diagnostic decision-making.

METHODS AND ANALYSIS

We are conducting a prospective, randomised experiment with N=158 fourth-year medical students from Charité Medical School, Berlin, Germany. Participants are asked to diagnose patient vignettes after being assigned to either a human coach or ChatGPT and after either training or no training (both between-subject factors). We are specifically collecting data on the effects of using either of these 'coaches' and of additional training on information search, number of hypotheses entertained, diagnostic accuracy and confidence. Statistical methods will include linear mixed effects models. Exploratory analyses of the interaction patterns and attitudes towards AI will also generate more generalisable knowledge about the role of AI in medicine.
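The 2 x 2 between-subject design described above (coach type crossed with training) can be illustrated with a short allocation sketch. This is only a hypothetical illustration: the record does not state how participants are actually randomised, so the balanced-shuffle scheme, the function name `allocate` and the fixed seed are all assumptions, not the authors' procedure.

```python
import random
from collections import Counter

# Factor levels taken from the abstract; everything else is an assumption.
COACHES = ("human coach", "ChatGPT")
TRAINING = ("training", "no training")


def allocate(n_participants, seed=0):
    """Assign participants to the four design cells as evenly as possible.

    Hypothetical scheme: build a list of cell labels of length
    n_participants, then shuffle it with a seeded generator.
    """
    rng = random.Random(seed)
    cells = [(coach, training) for coach in COACHES for training in TRAINING]
    rng.shuffle(cells)  # randomise which cells receive the leftover slots
    repeats = n_participants // len(cells) + 1
    slots = (cells * repeats)[:n_participants]
    rng.shuffle(slots)  # randomise the order in which slots are handed out
    return slots


# N=158 as stated in the abstract; it does not divide evenly by four,
# so cell sizes differ by at most one participant.
allocation = allocate(158)
counts = Counter(allocation)
for cell, n in sorted(counts.items()):
    print(cell, n)
```

Because 158 is not a multiple of four, a balanced scheme of this kind yields cell sizes that differ by one; which cells are larger depends on the seed.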

ETHICS AND DISSEMINATION

The Bern Cantonal Ethics Committee considered the study exempt from full ethical review (BASEC No: Req-2023-01396). All methods will be conducted in accordance with relevant guidelines and regulations. Participation is voluntary and informed consent will be obtained. Results will be published in peer-reviewed scientific medical journals. Authorship will be determined according to the International Committee of Medical Journal Editors guidelines.

Item Type:

Journal Article (Original Article)

Division/Institute:

04 Faculty of Medicine > Department of Intensive Care, Emergency Medicine and Anaesthesiology (DINA) > University Emergency Center

UniBE Contributor:

Kämmer, Juliane Eva; Hautz, Wolf; Sauter, Thomas Christian; Birrenbach, Tanja Nicole

Subjects:

600 Technology > 610 Medicine & health

ISSN:

2044-6055

Publisher:

BMJ Publishing Group

Language:

English

Submitter:

Pubmed Import

Date Deposited:

22 Jul 2024 16:17

Last Modified:

22 Jul 2024 16:25

Publisher DOI:

10.1136/bmjopen-2024-087469

PubMed ID:

39025818

Uncontrolled Keywords:

Artificial Intelligence; Clinical Decision-Making; Clinical Reasoning; MEDICAL EDUCATION & TRAINING

BORIS DOI:

10.48350/199099

URI:

https://boris.unibe.ch/id/eprint/199099
