An interpretable machine learning system for colorectal cancer diagnosis from pathology slides.

Neto, Pedro C; Montezuma, Diana; Oliveira, Sara P; Oliveira, Domingos; Fraga, João; Monteiro, Ana; Monteiro, João; Ribeiro, Liliana; Gonçalves, Sofia; Reinhard, Stefan; Zlobec, Inti; Pinto, Isabel M; Cardoso, Jaime S (2024). An interpretable machine learning system for colorectal cancer diagnosis from pathology slides. NPJ precision oncology, 8(56). Springer Nature. 10.1038/s41698-024-00539-4

s41698-024-00539-4.pdf - Published Version
Available under License Creative Commons: Attribution (CC-BY).


Considering the profound transformation affecting pathology practice, we aimed to develop a scalable artificial intelligence (AI) system to diagnose colorectal cancer from whole-slide images (WSIs). For this, we propose a deep learning (DL) system that learns from weak labels, a sampling strategy that reduces the number of training samples by a factor of six without compromising performance, an approach to leverage a small subset of fully annotated samples, and a prototype with explainable predictions, active learning features and parallelisation. Noting some problems in the literature, this study is conducted with one of the largest colorectal WSI datasets, comprising approximately 10,500 WSIs, of which 900 are testing samples. Furthermore, the robustness of the proposed method is assessed with two additional external datasets (TCGA and PAIP) and a dataset of samples collected directly from the proposed prototype. For each tile extracted from a slide, the proposed method predicts a class based on the severity of the dysplasia and uses that information to classify the whole slide. It is trained with an interpretable mixed-supervision scheme to leverage the domain knowledge introduced by pathologists through spatial annotations. The mixed-supervision scheme allowed for an intelligent sampling strategy, which was evaluated in several different scenarios without compromising performance. On the internal dataset, the method shows an accuracy of 93.44% and a sensitivity of 0.996 for distinguishing positive (low-grade and high-grade dysplasia) from non-neoplastic samples. On the external test samples, results varied, with TCGA being the most challenging dataset, with an overall accuracy of 84.91% and a sensitivity of 0.996.
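
The abstract describes classifying tiles by dysplasia severity, deriving a slide-level class from them, and reporting accuracy together with a positive versus non-neoplastic sensitivity. The following minimal Python sketch illustrates how such quantities could be computed. It is not the authors' implementation: the integer class encoding, the function names, and in particular the "most severe tile wins" aggregation rule are assumptions made for illustration, since the abstract does not state how tile predictions are combined.

```python
from typing import Sequence

# Severity-ordered classes named in the abstract.
# The integer encoding is an assumption made for this sketch.
NON_NEOPLASTIC, LOW_GRADE, HIGH_GRADE = 0, 1, 2


def slide_label(tile_labels: Sequence[int]) -> int:
    """Give the slide the most severe class predicted for any tile.

    This aggregation rule is assumed; the abstract does not specify it.
    """
    if not tile_labels:
        raise ValueError("a slide needs at least one tile prediction")
    return max(tile_labels)


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of slides whose predicted class equals the reference class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def binary_sensitivity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Sensitivity for positive (low- or high-grade) vs non-neoplastic slides.

    A positive slide counts as detected if it is predicted as either
    dysplasia grade, even when the grade itself is wrong.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must be of equal length")
    positives = [p for t, p in zip(y_true, y_pred) if t != NON_NEOPLASTIC]
    if not positives:
        raise ValueError("no positive slides in the reference labels")
    detected = sum(1 for p in positives if p != NON_NEOPLASTIC)
    return detected / len(positives)


# Toy data (invented for illustration, not taken from the study).
tiles_per_slide = [[0, 0, 1], [0, 0, 0], [2, 1, 0], [0, 0, 0]]
reference = [LOW_GRADE, NON_NEOPLASTIC, HIGH_GRADE, LOW_GRADE]
predicted = [slide_label(tiles) for tiles in tiles_per_slide]
```

With this definition, a slide whose grade is misjudged still counts towards sensitivity as long as it is flagged as positive, which is one way a system can reach a high sensitivity alongside a lower overall accuracy.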

Item Type:

Journal Article (Original Article)

Division/Institute:

04 Faculty of Medicine > Service Sector > Institute of Pathology
04 Faculty of Medicine > Service Sector > Institute of Pathology > Translational Research Unit

UniBE Contributor:

Reinhard, Stefan; Zlobec, Inti

Subjects:

500 Science > 570 Life sciences; biology
600 Technology > 610 Medicine & health

ISSN:

2397-768X

Publisher:

Springer Nature

Language:

English

Submitter:

Pubmed Import

Date Deposited:

06 Mar 2024 09:52

Last Modified:

06 Mar 2024 10:31

Publisher DOI:

10.1038/s41698-024-00539-4

PubMed ID:

38443695

BORIS DOI:

10.48350/193885

URI:

https://boris.unibe.ch/id/eprint/193885
