Ensemble of deep learning language models to support the creation of living systematic reviews for the COVID-19 literature.

Knafou, Julien; Haas, Quentin; Borissov, Nikolay; Counotte, Michel; Low, Nicola; Imeri, Hira; Ipekçi, Aziz Mert; Buitrago Garcia, Diana; Heron, Leonie; Amini, Poorya; Teodoro, Douglas (2023). Ensemble of deep learning language models to support the creation of living systematic reviews for the COVID-19 literature. Systematic Reviews, 12(1), p. 94. BioMed Central 10.1186/s13643-023-02247-9

s13643-023-02247-9.pdf - Published Version
Available under License Creative Commons: Attribution (CC-BY).

BACKGROUND

The COVID-19 pandemic has led to an unprecedented number of scientific publications, growing at a pace never seen before. Multiple living systematic reviews have been developed to provide professionals with up-to-date and trustworthy health information, but it is increasingly challenging for systematic reviewers to keep up with the evidence in electronic databases. We aimed to investigate deep learning-based algorithms for classifying COVID-19-related publications to help scale up the epidemiological curation process.

METHODS

In this retrospective study, five different pre-trained deep learning-based language models were fine-tuned on a dataset of 6365 publications manually classified into two classes, three subclasses, and 22 sub-subclasses relevant for epidemiological triage purposes. In a k-fold cross-validation setting, each standalone model was assessed on a classification task and compared against an ensemble, which takes the standalone model predictions as input and uses different strategies to infer the optimal article class. A ranking task was also considered, in which the model outputs a ranked list of sub-subclasses associated with the article.
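As an illustration of how an ensemble can combine standalone model predictions, the following is a minimal sketch of two common strategies, majority voting and probability averaging. The abstract does not specify which strategies the study used, so the function names, the tie-breaking rule, and the example labels are assumptions made for illustration only.

```python
from collections import Counter
from typing import Dict, Sequence


def majority_vote(predictions: Sequence[str]) -> str:
    """Return the label predicted by the most models.

    Ties are broken in favour of the label that appears first in
    `predictions` (an arbitrary choice made for this sketch).
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label
    raise ValueError("predictions must not be empty")


def mean_probability(prob_rows: Sequence[Dict[str, float]]) -> str:
    """Return the label with the highest probability averaged over models.

    Each element of `prob_rows` maps every label to the probability
    assigned by one model; all rows are assumed to share the same labels.
    """
    labels = list(prob_rows[0].keys())
    n_models = len(prob_rows)
    return max(labels, key=lambda lab: sum(row[lab] for row in prob_rows) / n_models)


# Hypothetical predictions from five fine-tuned models for one article.
votes = ["original", "original", "non-original", "original", "non-original"]
ensemble_label = majority_vote(votes)
```

Majority voting needs only hard labels from each model, whereas probability averaging lets a confident model outweigh several uncertain ones.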

RESULTS

The ensemble model significantly outperformed the standalone classifiers, achieving an F1-score of 89.2% at the class level of the classification task. The difference between the standalone and ensemble models increased at the sub-subclass level, where the ensemble reached a micro F1-score of 70% against 67% for the best-performing standalone model. For the ranking task, the ensemble obtained the highest recall@3, at 89%. Using a unanimity voting rule, the ensemble can provide predictions with higher confidence on a subset of the data, detecting original papers with an F1-score of up to 97% on a subset covering 80% of the collection, compared with 93% on the whole dataset.
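The recall@3 metric and the unanimity voting rule mentioned above can be sketched as follows. This is a hedged illustration rather than the study's implementation: it assumes one gold sub-subclass per article, and the helper names and toy labels are hypothetical.

```python
from typing import Dict, Sequence


def recall_at_k(ranked: Sequence[Sequence[str]], gold: Sequence[str], k: int = 3) -> float:
    """Fraction of articles whose gold label appears in the top-k ranked labels.

    Assumes exactly one gold label per article.
    """
    hits = sum(1 for labels, truth in zip(ranked, gold) if truth in labels[:k])
    return hits / len(gold)


def unanimous_subset(per_article_votes: Sequence[Sequence[str]]) -> Dict[int, str]:
    """Keep only the articles on which every model predicts the same label.

    Returns a mapping from article index to the agreed label; articles
    with any disagreement are left out (for example, for manual review).
    """
    return {
        index: votes[0]
        for index, votes in enumerate(per_article_votes)
        if len(set(votes)) == 1
    }


# Toy example: two articles, each with a ranked list of four labels.
score = recall_at_k([["a", "b", "c", "d"], ["d", "c", "b", "a"]], ["c", "a"], k=3)

# Toy example: three articles, two models each.
confident = unanimous_subset([["x", "x"], ["x", "y"], ["y", "y"]])
```

Restricting predictions to the unanimous subset trades coverage for confidence, which matches the reported pattern of a higher F1-score on a portion of the collection.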

CONCLUSION

This study shows the potential of using deep learning language models to perform triage of COVID-19 references efficiently and to support epidemiological curation and review. The ensemble consistently and significantly outperformed every standalone model. Fine-tuning the voting strategy thresholds is an interesting alternative for annotating a subset with higher predictive confidence.

Item Type:	Journal Article (Original Article)
Division/Institute:	04 Faculty of Medicine > Pre-clinic Human Medicine > Department of Clinical Research (DCR); 04 Faculty of Medicine > Pre-clinic Human Medicine > Institute of Social and Preventive Medicine (ISPM)
UniBE Contributor:	Borissov, Nikolay; Counotte, Michel Jacques; Low, Nicola; Imeri, Hira; Ipekçi, Aziz Mert; Buitrago Garcia, Diana Carolina; Heron, Leonie; Amini, Poorya
Subjects:	600 Technology > 610 Medicine & health; 300 Social sciences, sociology & anthropology > 360 Social problems & social services
ISSN:	2046-4053
Publisher:	BioMed Central
Funders:	[198] Innosuisse - Swiss Innovation Agency ; [4] Swiss National Science Foundation ; [222] Horizon 2020
Language:	English
Submitter:	Pubmed Import
Date Deposited:	06 Jun 2023 09:34
Last Modified:	20 Feb 2024 14:15
Publisher DOI:	10.1186/s13643-023-02247-9
PubMed ID:	37277872
Additional Information:	Open access funding provided by University of Geneva.
Uncontrolled Keywords:	COVID-19; Deep learning; Language model; Literature screening; Living systematic review; Text classification; Transfer learning
BORIS DOI:	10.48350/183192
URI:	https://boris.unibe.ch/id/eprint/183192
