An encoder-decoder network for direct image reconstruction on sinograms of a long axial field of view PET.

Ma, Ruiyao; Hu, Jiaxi; Sari, Hasan; Xue, Song; Mingels, Clemens; Viscione, Marco; Kandarpa, Venkata Sai Sundar; Li, Wei Bo; Visvikis, Dimitris; Qiu, Rui; Rominger, Axel; Li, Junli; Shi, Kuangyu (2022). An encoder-decoder network for direct image reconstruction on sinograms of a long axial field of view PET. European journal of nuclear medicine and molecular imaging, 49(13), pp. 4464-4477. Springer-Verlag 10.1007/s00259-022-05861-2

Ma2022_Article_AnEncoder-decoderNetworkForDir.pdf - Published Version
Restricted to registered users only
Available under License Publisher holds Copyright.



Deep learning is an emerging reconstruction method for positron emission tomography (PET) that can handle complex PET corrections in a single integrated procedure. This paper optimizes direct PET reconstruction from sinograms on a long axial field of view (LAFOV) PET scanner.


This paper proposes a novel deep learning architecture to reduce bias in direct reconstruction from sinograms to images. The architecture is based on an encoder-decoder network trained with a perceptual loss computed from pre-trained convolutional layers. It is trained and tested on data from 80 patients acquired on the recent Siemens Biograph Vision Quadra long axial field of view (LAFOV) PET/CT. The patients are randomly split into a training dataset of 60 patients, a validation dataset of 10 patients, and a test dataset of 10 patients. The 3D sinograms are converted into 2D sinogram slices and used as input to the network, and the vendor-reconstructed images serve as ground truth. Finally, the proposed method is compared with DeepPET, a benchmark deep learning method for PET reconstruction.
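The patient-level split and the 3D-to-2D slicing described above can be illustrated with a minimal sketch. It is not the authors' code: the patient identifiers, the fixed seed, the nested-list sinogram layout, and the helper names `split_patients` and `sinogram_slices` are all assumptions made for illustration.

```python
import random


def split_patients(patient_ids, n_train=60, n_val=10, n_test=10, seed=0):
    """Randomly partition patient IDs into disjoint train/val/test lists.

    Splitting by patient (not by slice) keeps all slices of one patient
    in a single subset. The seed is an illustrative assumption.
    """
    if n_train + n_val + n_test != len(patient_ids):
        raise ValueError("split sizes must sum to the number of patients")
    shuffled = list(patient_ids)
    random.Random(seed).shuffle(shuffled)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


def sinogram_slices(sinogram_3d):
    """Yield 2D slices of a 3D sinogram.

    Assumes a hypothetical layout [plane][angle][radial bin] stored as
    nested lists; the actual data layout is not given in the abstract.
    """
    for plane in sinogram_3d:
        yield plane


# Hypothetical identifiers for the 80 patients mentioned in the abstract.
patients = [f"patient_{i:02d}" for i in range(80)]
train, val, test = split_patients(patients)

# A toy 3D sinogram: 3 planes, 2 angles, 4 radial bins.
toy_sinogram = [[[0.0] * 4 for _ in range(2)] for _ in range(3)]
toy_slices = list(sinogram_slices(toy_sinogram))
```

Splitting at the patient level before slicing avoids placing slices from the same patient in both the training and test sets.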


Compared with DeepPET, the proposed network significantly reduces the normalized root-mean-squared error (NRMSE) from 0.63 to 0.6 (p < 0.01) and increases the structural similarity index (SSIM) and peak signal-to-noise ratio (PSNR) from 0.93 to 0.95 (p < 0.01) and from 82.02 to 82.36 (p < 0.01), respectively. The reconstruction time is approximately 10 s per patient, about 23 times shorter than with the conventional method. The error in mean standardized uptake value (SUVmean) for lesions, measured between the ground truth and the predicted result, is reduced from 33.5 to 18.7% (p = 0.03). In addition, the error in maximum SUV is reduced from 32.7 to 21.8% (p = 0.02).
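For readers unfamiliar with these metrics, the following sketch shows one common way to compute NRMSE, PSNR, and a percentage SUV error on flattened images. The normalization choices (ground-truth intensity range for NRMSE, ground-truth maximum as the PSNR peak) are assumptions; the abstract does not state which conventions the authors used, so values from this sketch are not comparable with the reported numbers. SSIM is omitted because it needs a windowed implementation.

```python
import math


def mse(pred, truth):
    """Mean squared error between two equally sized flat sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth)


def nrmse(pred, truth):
    """RMSE divided by the ground-truth intensity range (assumed convention)."""
    value_range = max(truth) - min(truth)
    return math.sqrt(mse(pred, truth)) / value_range


def psnr(pred, truth):
    """PSNR in dB, taking the ground-truth maximum as the peak (assumed)."""
    return 10.0 * math.log10(max(truth) ** 2 / mse(pred, truth))


def relative_error_percent(pred_value, truth_value):
    """Absolute error of a scalar (e.g. a lesion SUV) as a percentage."""
    return 100.0 * abs(pred_value - truth_value) / abs(truth_value)


# Toy data, not taken from the study.
truth = [0.0, 1.0, 2.0, 4.0]
pred = [0.0, 1.0, 2.0, 2.0]

toy_nrmse = nrmse(pred, truth)
toy_psnr = psnr(pred, truth)
toy_suv_error = relative_error_percent(8.0, 10.0)
```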


The results demonstrate the feasibility of using deep learning to reconstruct images with acceptable image quality and short reconstruction time. The proposed method improves the quality of deep learning-based reconstructed images without additional CT images for attenuation and scatter corrections. The study thus shows that deep learning can rapidly reconstruct images from actual clinical LAFOV PET measurements without additional CT images for complex corrections. Although it advances the current state of development, AI-based reconstruction does not work appropriately for untrained scenarios because of its limited extrapolation capability, and it cannot yet completely replace conventional reconstruction.

Item Type:

Journal Article (Original Article)


04 Faculty of Medicine > Department of Radiology, Neuroradiology and Nuclear Medicine (DRNN) > Clinic of Nuclear Medicine

UniBE Contributor:

Ma, Ruiyao; Hu, Jiaxi; Xue, Song; Mingels, Clemens; Rominger, Axel Oliver; Shi, Kuangyu


600 Technology > 610 Medicine & health








Pubmed Import

Date Deposited:

18 Jul 2022 10:34

Last Modified:

05 Dec 2022 16:21

Publisher DOI:

10.1007/s00259-022-05861-2

PubMed ID:


Uncontrolled Keywords:

Deep learning; Image reconstruction; Long axial field of view PET



