"Application and accuracy of artificial intelligence-derived large language models in patients with age related macular degeneration".

Ferro Desideri, Lorenzo; Roth, Janice; Zinkernagel, Martin; Anguita, Rodrigo (2023). "Application and accuracy of artificial intelligence-derived large language models in patients with age related macular degeneration". International Journal of Retina and Vitreous, 9(1), p. 71. BMC 10.1186/s40942-023-00511-7

s40942-023-00511-7.pdf - Published Version
Available under License Creative Commons: Attribution (CC-BY).

INTRODUCTION

Age-related macular degeneration (AMD) affects millions of people globally, leading to a surge in online searches about putative diagnoses, which can cause misinformation and anxiety in patients and their parents. This study explores the efficacy of artificial intelligence-derived large language models (LLMs) in addressing AMD patients' questions.

METHODS

ChatGPT 3.5 (2023), Bing AI (2023), and Google Bard (2023) were adopted as LLMs. Patients' questions were subdivided into two categories, (a) general medical advice and (b) pre- and post-intravitreal injection advice, and the responses were classified as (1) accurate and sufficient, (2) partially accurate but sufficient, or (3) inaccurate and not sufficient. A non-parametric test was used to compare the mean scores of the 3 LLMs, and an analysis of variance and reliability tests were also performed among the 3 groups.
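As an illustration of how three sets of ordinal scores might be compared non-parametrically, the sketch below implements a Kruskal-Wallis test. The abstract does not name the test that was used, so the choice of Kruskal-Wallis is an assumption, and the function name `kruskal_wallis` and the example scores are hypothetical and not taken from the study.

```python
import math
from collections import Counter


def kruskal_wallis(groups):
    """Return (H, p) for a Kruskal-Wallis test with tie correction.

    The p-value uses the chi-square approximation. Its closed form
    exp(-H / 2) holds only for 2 degrees of freedom, so this sketch
    accepts exactly three groups.
    """
    if len(groups) != 3:
        raise ValueError("this sketch handles exactly three groups")
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)

    # Assign each distinct value the mean of the ranks it occupies.
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    rank_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)

    # Tie correction: divide H by 1 - sum(t^3 - t) / (n^3 - n).
    ties = Counter(pooled).values()
    correction = 1 - sum(t ** 3 - t for t in ties) / (n ** 3 - n)
    if correction == 0:
        return 0.0, 1.0  # all values identical: no evidence of a difference
    h /= correction
    return h, math.exp(-h / 2)


# Hypothetical scores (1 = accurate ... 3 = inaccurate), NOT study data.
llm_a = [1, 1, 1, 2, 1, 1]
llm_b = [2, 1, 2, 3, 2, 1]
llm_c = [1, 2, 1, 3, 2, 2]
h_stat, p_value = kruskal_wallis([llm_a, llm_b, llm_c])
print(f"H = {h_stat:.3f}, p = {p_value:.4f}")
```

With scores limited to the values 1, 2 and 3, ties are unavoidable, which is why the tie correction is included in the sketch.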

RESULTS

In category (a) of questions, the average score was 1.20 (± 0.41) with ChatGPT 3.5, 1.60 (± 0.63) with Bing AI and 1.60 (± 0.73) with Google Bard, showing no significant differences among the 3 groups (p = 0.129). The average score in category (b) was 1.07 (± 0.27) with ChatGPT 3.5, 1.69 (± 0.63) with Bing AI and 1.38 (± 0.63) with Google Bard, showing a significant difference among the 3 groups (p = 0.0042). Reliability statistics showed Cronbach's α of 0.237 (range 0.448, 0.096-0.544).
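For readers unfamiliar with the reliability statistic, the sketch below computes Cronbach's α from its standard definition, α = k/(k−1) · (1 − Σ item variances / variance of the summed scores). The abstract does not state which quantities served as "items", so treating each LLM as an item scored over the same set of questions is an assumption, and the function name `cronbach_alpha` and the example scores are hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for k items scored on the same n subjects.

    `items` is a list of k equal-length lists; items[i][j] is the score
    of item i on subject j. Sample variances (n - 1 denominator) are used.
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    n = len(items[0])
    if n < 2 or any(len(item) != n for item in items):
        raise ValueError("items must share the same length of at least 2")

    # Total score per subject, summed across items.
    totals = [sum(scores) for scores in zip(*items)]
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("total scores have zero variance")

    item_var_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical scores for five questions, NOT study data.
# Assumption: each LLM is treated as one "item".
scores = [
    [1, 1, 2, 1, 1],
    [2, 1, 2, 3, 1],
    [1, 2, 3, 2, 1],
]
print(f"alpha = {cronbach_alpha(scores):.3f}")
```

Values near 1 indicate that the items vary together across subjects. A low value such as the one reported above indicates weak agreement among the scores.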

CONCLUSION

ChatGPT 3.5 consistently offered the most accurate and satisfactory responses, particularly with technical queries. Although LLMs displayed promise in providing precise information about AMD, further improvements are needed, especially for more technical questions.

Item Type:

Journal Article (Original Article)

Division/Institute:

04 Faculty of Medicine > Department of Head Organs and Neurology (DKNS) > Clinic of Ophthalmology

UniBE Contributor:

Ferro Desideri, Lorenzo, Roth, Janice, Zinkernagel, Martin Sebastian, Anguita Henríquez, Rodrigo Andrés

Subjects:

600 Technology > 610 Medicine & health

ISSN:

2056-9920

Publisher:

BMC

Language:

English

Submitter:

Pubmed Import

Date Deposited:

20 Nov 2023 11:59

Last Modified:

20 Nov 2023 12:09

Publisher DOI:

10.1186/s40942-023-00511-7

PubMed ID:

37980501

Uncontrolled Keywords:

Artificial Intelligence; Artificial intelligence in ophthalmology; Dry macular degeneration; LLMs; Large language models; Macular edema; Wet macular degeneration

BORIS DOI:

10.48350/189164

URI:

https://boris.unibe.ch/id/eprint/189164
