Learning how to robustly estimate camera pose in endoscopic videos.

Hayoz, Michel; Hahne, Christopher; Gallardo, Mathias; Candinas, Daniel; Kurmann, Thomas; Allan, Maximilian; Sznitman, Raphael (2023). Learning how to robustly estimate camera pose in endoscopic videos. International journal of computer assisted radiology and surgery, 18(7), pp. 1185-1192. Springer 10.1007/s11548-023-02919-w

s11548-023-02919-w.pdf - Published Version
Available under License Creative Commons: Attribution (CC-BY).


PURPOSE

Surgical scene understanding plays a critical role in the technology stack of tomorrow's intervention-assisting systems in endoscopic surgeries. For this, tracking the endoscope pose is a key component, but remains challenging due to illumination conditions, deforming tissues and the breathing motion of organs.

METHOD

We propose a solution for stereo endoscopes that estimates depth and optical flow to minimize two geometric losses for camera pose estimation. Most importantly, we introduce two learned adaptive per-pixel weight mappings that balance contributions according to the input image content. To do so, we train a Deep Declarative Network to take advantage of the expressiveness of deep learning and the robustness of a novel geometric-based optimization approach. We validate our approach on the publicly available SCARED dataset and introduce a new in vivo dataset, StereoMIS, which includes a wider spectrum of typically observed surgical settings.
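The role of per-pixel weights in a geometric pose loss can be illustrated with a minimal sketch. This is not the authors' Deep Declarative Network or their two losses: it assumes a single weighted 3D point-alignment residual solved in closed form (weighted Kabsch), and the function name `weighted_rigid_fit`, the synthetic points and the hand-set weights are all hypothetical. In the paper the weights are learned from image content; here they are simply set to zero on a simulated deforming region.

```python
import numpy as np

def weighted_rigid_fit(src, dst, w):
    """Closed-form minimizer of sum_i w_i * ||R @ src_i + t - dst_i||^2.

    src, dst: (N, 3) corresponding 3D points (e.g. back-projected from depth
    and matched through optical flow). w: (N,) non-negative per-point weights.
    """
    w = w / w.sum()
    mu_s = (w[:, None] * src).sum(axis=0)      # weighted centroids
    mu_d = (w[:, None] * dst).sum(axis=0)
    # Weighted cross-covariance between the centred point sets
    H = (src - mu_s).T @ (w[:, None] * (dst - mu_d))
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection so that det(R) = +1
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_d - R @ mu_s
    return R, t

# Synthetic scene (hypothetical data): 200 points under a known camera motion
rng = np.random.default_rng(0)
src = rng.uniform(-1.0, 1.0, size=(200, 3))
angle = 0.1
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([0.01, -0.02, 0.005])
dst = src @ R_true.T + t_true

# Simulate a deforming tissue patch: the first 50 points move independently
dst[:50] += np.array([0.05, 0.0, 0.0])

# Uniform weights treat the deforming points like rigid ones
R_u, t_u = weighted_rigid_fit(src, dst, np.ones(200))

# Hand-set weights stand in for a learned mapping that down-weights the patch
w = np.ones(200)
w[:50] = 0.0
R_w, t_w = weighted_rigid_fit(src, dst, w)

err_uniform = np.linalg.norm(t_u - t_true)
err_weighted = np.linalg.norm(t_w - t_true)
print("translation error, uniform weights: ", err_uniform)
print("translation error, adaptive weights:", err_weighted)
```

In this toy setting the zero-weighted points no longer bias the estimate, which mirrors the attenuation effect described in the abstract under much simpler assumptions.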

RESULTS

Our method outperforms state-of-the-art methods on average and, more importantly, in difficult scenarios where tissue deformations and breathing motion are visible. We observed that our proposed weight mappings attenuate the contribution of pixels in ambiguous regions of the images, such as deforming tissues.

CONCLUSION

We demonstrate the effectiveness of our solution in robustly estimating the camera pose in challenging endoscopic surgical scenes. Our contributions can be used to improve related tasks such as simultaneous localization and mapping (SLAM) or 3D reconstruction, thereby advancing surgical scene understanding in minimally invasive surgery.

Item Type:

Journal Article (Original Article)

Division/Institute:

10 Strategic Research Centers > ARTORG Center for Biomedical Engineering Research > ARTORG Center - AI in Medical Imaging Laboratory
04 Faculty of Medicine > Department of Gastro-intestinal, Liver and Lung Disorders (DMLL) > Clinic of Visceral Surgery and Medicine > Visceral Surgery
04 Faculty of Medicine > Department of Gastro-intestinal, Liver and Lung Disorders (DMLL) > Clinic of Visceral Surgery and Medicine
10 Strategic Research Centers > ARTORG Center for Biomedical Engineering Research

UniBE Contributor:

Hayoz, Michel; Hahne, Christopher; Gallardo, Mathias; Candinas, Daniel; Sznitman, Raphael

Subjects:

600 Technology > 610 Medicine & health
500 Science > 570 Life sciences; biology

ISSN:

1861-6429

Publisher:

Springer

Language:

English

Submitter:

Pubmed Import

Date Deposited:

16 May 2023 10:04

Last Modified:

20 Nov 2023 14:56

Publisher DOI:

10.1007/s11548-023-02919-w

PubMed ID:

37184768

Uncontrolled Keywords:

Camera pose estimation; Deep declarative network; Endoscopic surgery

BORIS DOI:

10.48350/182594

URI:

https://boris.unibe.ch/id/eprint/182594
