Information retrieval in an infodemic: the case of COVID-19 publications.

Teodoro, Douglas; Ferdowsi, Sohrab; Borissov, Nikolay; Kashani, Elham; Vicente Alvarez, David; Copara, Jenny; Gouareb, Racha; Naderi, Nona; Amini, Poorya (2021). Information retrieval in an infodemic: the case of COVID-19 publications. Journal of Medical Internet Research, 23(9), e30161. Centre of Global eHealth Innovation. doi:10.2196/30161

Teodoro_JMedInternetRes_2021.pdf - Published Version
Available under License Creative Commons: Attribution (CC-BY).

BACKGROUND

The coronavirus disease (COVID-19) global health crisis has led to an exponential surge in published scientific literature. In the attempt to tackle the pandemic, extremely large COVID-19-related corpora are being created, sometimes containing inaccurate information, at a scale that is beyond the reach of human analysis.

OBJECTIVE

In the context of searching for scientific evidence in the deluge of COVID-19-related literature, we present an information retrieval methodology for effective identification of relevant sources to answer biomedical queries posed using natural language.

METHODS

Our multi-stage retrieval methodology combines probabilistic weighting models and re-ranking algorithms based on deep neural architectures to boost the ranking of relevant documents. COVID-19 queries are compared with documents for similarity, and a series of post-processing methods is applied to the initial ranking list to improve the match between the query and the biomedical information source and to move relevant documents higher in the list.
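As an illustration only, the two-stage idea described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' pipeline: the first stage uses a basic BM25 formula as one example of a probabilistic weighting model, and the second stage uses a hypothetical token-overlap scorer (overlap_reranker) as a placeholder where a deep neural re-ranker would sit. All function names, parameter values, and the whitespace tokenizer are assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real system would use a proper analyzer.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score every document against the query with a basic BM25 formula."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def overlap_reranker(query, doc):
    """Hypothetical stand-in for a neural re-ranker: Jaccard token overlap."""
    q, d = set(tokenize(query)), set(tokenize(doc))
    return len(q & d) / len(q | d) if q | d else 0.0


def multi_stage_search(query, docs, first_stage_k=3, reranker=overlap_reranker):
    """Return document indices: BM25 candidates, re-ordered by the re-ranker."""
    # Stage 1: keep the top candidates from the bag-of-words model.
    first = bm25_scores(query, docs)
    candidates = sorted(range(len(docs)), key=lambda i: first[i], reverse=True)
    candidates = candidates[:first_stage_k]
    # Stage 2: re-order only those candidates with the second scorer.
    return sorted(candidates, key=lambda i: reranker(query, docs[i]), reverse=True)
```

The design point is that the more expensive second-stage scorer only sees the short candidate list produced by the cheap first stage; swapping overlap_reranker for a neural model would not change the surrounding structure.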

RESULTS

The methodology was evaluated in the context of the TREC-COVID challenge, achieving results competitive with those of the top-ranking teams participating in the competition. In particular, the combination of bag-of-words and deep neural language models significantly outperformed a BM25-based baseline, retrieving on average 83% of relevant documents in the top 20.
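The abstract does not name the metric behind the top-20 figure. One common way to measure this kind of result is precision at rank k, averaged over queries; the sketch below assumes that reading, and the function names and data layout (a dict of ranked document IDs per query and a dict of relevant IDs per query) are hypothetical.

```python
def precision_at_k(ranked_ids, relevant_ids, k=20):
    """Fraction of the top-k ranked documents that are judged relevant."""
    top_k = ranked_ids[:k]
    return sum(1 for doc_id in top_k if doc_id in relevant_ids) / k


def mean_precision_at_k(runs, judgments, k=20):
    """Average precision at k over all queries in `runs`."""
    values = [precision_at_k(runs[q], judgments.get(q, set()), k) for q in runs]
    return sum(values) / len(values)
```

Under this assumed reading, a mean value of 0.83 at k=20 would correspond to roughly 16 to 17 relevant documents among the first 20 results per query.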

CONCLUSIONS

These results indicate that multi-stage retrieval supported by deep learning could enhance identification of literature for COVID-19-related questions posed using natural language.

Item Type:

Journal Article (Original Article)

Division/Institute:

04 Faculty of Medicine > Pre-clinic Human Medicine > Department of Clinical Research (DCR)
04 Faculty of Medicine > Service Sector > Institute of Pathology
04 Faculty of Medicine > Service Sector > Institute of Pathology > Tumour Pathology

UniBE Contributor:

Borissov, Nikolay; Kashani, Elham; Amini, Poorya

ISSN:

1439-4456

Publisher:

Centre of Global eHealth Innovation

Funders:

[198] Innosuisse - Swiss Innovation Agency

Language:

English

Submitter:

Andrea Flükiger-Flückiger

Date Deposited:

20 Aug 2021 09:33

Last Modified:

20 Feb 2024 14:16

Publisher DOI:

10.2196/30161

PubMed ID:

34375298

BORIS DOI:

10.48350/158358

URI:

https://boris.unibe.ch/id/eprint/158358
