Abstract
Stereoscopic observation is a common foundation of medical image analysis and is generally achieved by 3D medical imaging with fixed scanners, such as CT and MRI, which are less convenient than X-ray machines in many flexible scenarios. However, X-ray images provide only a perspective 2D observation and lack the third dimension. If 3D information could be deduced from X-ray images, the applications of X-ray machines would broaden considerably. Focusing on this objective, this paper is dedicated to the generation of pseudo 3D CT scans from non-parallel 2D perspective X-ray (PXR) views and proposes the Draw Sketch and Draw Flesh (DSDF) framework, which first roughly predicts the tissue distribution (Sketch) from the PXR views and then renders the tissue details (Flesh) from the tissue distribution and the PXR views. Unlike previous studies that focus only on particular locations, e.g., the chest or neck, this study investigates the feasibility of head-to-leg reconstruction, i.e., a method generally applicable to any body part. Experiments on 559 whole-body samples from 4 cohorts suggest that our DSDF reconstructs more reasonable pseudo CT images than state-of-the-art methods and achieves promising results in both visualization and various downstream tasks. The source code and well-trained models are available at https://github.com/YongshengPan/WholeBodyXraytoCT.
References
Amber Diagnostics. (2022). Basic Overview of the C-Arm Machine. https://www.amberusa.com/blog/basic-overview-of-the-c-arm-system.
Batzolis, G., Stanczuk, J., Schönlieb, C.-B., & Etmann, C. (2021). Conditional image generation with score-based diffusion models. arXiv preprint arXiv:2111.13606.
Bayat, A., Sekuboyina, A. K., Paetzold, J. C., Payer, C., Štern, D., Urschler, M., Kirschke, J. S., & Menze, B. H. (2020). Inferring the 3D standing spine posture from 2D radiographs. In International Conference on Medical Image Computing and Computer-Assisted Intervention.
Chao, H., Shan, H., Homayounieh, F., Singh, R., Khera, R. D., Guo, H., Su, T., Wang, G., Kalra, M. K., & Yan, P. (2021). Deep learning predicts cardiovascular disease risks from lung cancer screening low dose computed tomography. Nature Communications, 12(1), 2963.
Chen, J., Lu, Y., Yu, Q., Luo, X., Adeli, E., Wang, Y., Lu, L., Yuille, A. L., & Zhou, Y. (2021). TransUNet: Transformers make strong encoders for medical image segmentation. arXiv preprint arXiv:2102.04306.
Chen, J., Zhang, Z., Xie, X., Li, Y., Xu, T., Ma, K., & Zheng, Y. (2022). Beyond mutual information: Generative adversarial network for domain adaptation using information bottleneck constraint. IEEE Transactions on Medical Imaging, 41, 595–607.
Chen, P., Yang, W., Wang, M., Sun, L., Hu, K., & Wang, S. (2021). Compressed domain deep video super-resolution. IEEE Transactions on Image Processing, 30, 7156–7169.
Chen, Z., Guo, L., Zhang, R., Fang, Z., He, X., & Wang, J. (2023). BX2S-Net: Learning to reconstruct 3D spinal structures from bi-planar X-ray images. Computers in Biology and Medicine, 154, 106615.
Chiang, T., Huang, Y., Chen, R., Huang, C., & Chang, R. (2019). Tumor detection in automated breast ultrasound using 3-D CNN and prioritized candidate aggregation. IEEE Transactions on Medical Imaging, 38(1), 240–249.
Cho, S., Lee, S., Lee, J., Lee, D., Kim, H., Ryu, J.-H., Jeong, K., Kim, K.-G., Yoon, K.-H., & Cho, S. (2021). A novel low-dose dual-energy imaging method for a fast-rotating gantry-type CT scanner. IEEE Transactions on Medical Imaging, 40(3), 1007–1020.
Chougule, V. N., Mulay, A. V., & Ahuja, B. B. (2017). Clinical case study: Spine modeling for minimum invasive spine surgeries (MISS) using rapid prototyping. In International Conference on Precision, Meso, Micro and Nano Engineering (COPEN) (pp. 96–102).
Clark, K., Vendt, B., Smith, K., Freymann, J., Kirby, J., Koppel, P., Moore, S., Phillips, S., Maffitt, D., Pringle, M., et al. (2013). The Cancer Imaging Archive (TCIA): Maintaining and operating a public information repository. Journal of Digital Imaging, 26(6), 1045–1057.
Cohen, J. P., Luck, M., & Honari, S. (2018). Distribution matching losses can hallucinate features in medical image translation. arXiv:1805.08841.
Cretti, F. (2018). Assessment of occupational radiation dose in interventional settings. La Medicina del Lavoro, 109(1), 57.
Croitoru, F.-A., Hondru, V., Ionescu, R. T., & Shah, M. (2023). Diffusion models in vision: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(9), 10850–10869.
Deak, Z., Grimm, J. M., Treitl, M., Geyer, L. L., Linsenmaier, U., Körner, M., Reiser, M., & Wirth, S. (2013). Filtered back projection, adaptive statistical iterative reconstruction, and a model-based iterative reconstruction in abdominal CT: An experimental clinical study. Radiology, 266(1), 197–206.
DenOtter, T. D., & Schubert, J. (2021). Hounsfield unit. In StatPearls. StatPearls Publishing, Treasure Island (FL). http://europepmc.org/books/NBK547721.
Fard, A. S., Reutens, D. C., & Vegh, V. (2021). CNNs and GANs in MRI-based cross-modality medical image estimation. arXiv:2106.02198.
Flood, P. D. L., & Banks, S. A. (2018). Automated registration of 3-D knee implant models to fluoroscopic images using Lipschitzian optimization. IEEE Transactions on Medical Imaging, 37(1), 326–335.
Gao, K., Gao, Y., He, H., Lu, D., Xu, L., & Li, J. (2022). NeRF: Neural radiance field in 3D vision, a comprehensive review. arXiv:2210.00379.
Ge, R., He, Y., Xia, C., Xu, C., Sun, W., Yang, G., Li, J., Wang, Z., Yu, H.-Z., Zhang, D., Chen, Y., Luo, L., Li, S., & Zhu, Y. (2022). X-CTRSNet: 3D cervical vertebra CT reconstruction and segmentation directly from 2D X-ray images. Knowledge-Based Systems, 236, 107680.
Ghani, M. U., & Karl, W. C. (2020). Fast enhanced CT metal artifact reduction using data domain deep learning. IEEE Transactions on Computational Imaging, 6, 181–193.
Gui, J., Sun, Z., Wen, Y., Tao, D., & Ye, J. (2021). A review on generative adversarial networks: Algorithms, theory, and applications. IEEE Transactions on Knowledge and Data Engineering. https://doi.org/10.1109/TKDE.2021.3130191
Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., & Courville, A. C. (2017). Improved training of wasserstein GANs. In NIPS.
Gupta, H., Jin, K. H., Nguyen, H. Q., McCann, M. T., & Unser, M. (2018). CNN-based projected gradient descent for consistent CT image reconstruction. IEEE Transactions on Medical Imaging, 37(6), 1440–1453.
He, D., Zhou, J., Shang, X., Tang, X., Luo, J., & Chen, S.-L. (2022). De-noising of photoacoustic microscopy images by attentive generative adversarial network. IEEE Transactions on Medical Imaging, 42, 1349–1362.
He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In CVPR (pp. 770–778).
He, Y., Schiele, B., & Fritz, M. (2020). Synthetic convolutional features for improved semantic segmentation. In European conference on computer vision (pp. 320–336).
Isola, P., Zhu, J.-Y., Zhou, T., & Efros, A. A. (2017). Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 5967–5976).
Jecklin, S., Jancik, C., Farshad, M., Fürnstahl, P., & Esfandiari, H. (2022). X23D: Intraoperative 3D lumbar spine shape reconstruction based on sparse multi-view X-ray data. Journal of Imaging, 8(10), 271.
Jha, D., Smedsrud, P. H., Johansen, D., Lange, T., Johansen, H. D., Halvorsen, P., & Riegler, M. A. (2021). A comprehensive study on colorectal polyp segmentation with ResUNet++, conditional random field and test-time augmentation. IEEE Journal of Biomedical and Health Informatics, 25(6), 2029–2040.
Jiang, L., Zhang, M., Wei, R., Liu, B., Bai, X., & Zhou, F. (2021). Reconstruction of 3D CT from a single X-ray projection view using CVAE-GAN. In 2021 IEEE international conference on medical imaging physics and engineering (ICMIPE) (pp. 1–6).
Jonas, D. E., Reuland, D. S., Reddy, S. M., Nagle, M., Clark, S. D., Weber, R. P., Enyioha, C., Malo, T. L., Brenner, A. T., Armstrong, C., Coker-Schwimmer, M., Middleton, J. C., Voisin, C., & Harris, R. P. (2021). Screening for lung cancer with low-dose computed tomography: Updated evidence report and systematic review for the US Preventive Services Task Force. JAMA, 325(10), 971–987.
Kang, Q., Yao, S., Zhou, M., Zhang, K., & Abusorrah, A. (2021). Effective visual domain adaptation via generative adversarial distribution matching. IEEE Transactions on Neural Networks and Learning Systems, 32(9), 3919–3929.
Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
Li, M., Wang, J., Chen, Y., Tang, Y., Wu, Z., Qi, Y., Jiang, H., Zheng, J., & Tsui, B. M. W. (2023). Low-dose CT image synthesis for domain adaptation imaging using a generative adversarial network with noise encoding transfer learning. IEEE Transactions on Medical Imaging, 42(9), 2616–2630.
Liao, H., Lin, W.-A., Zhou, S. K., & Luo, J. (2020). ADN: Artifact disentanglement network for unsupervised metal artifact reduction. IEEE Transactions on Medical Imaging, 39(3), 634–643.
Mitrović, U., Pernuš, F., Likar, B., & Špiclin, Ž. (2015). Simultaneous 3D–2D image registration and C-arm calibration: Application to endovascular image-guided interventions. Medical Physics, 42(11), 6433–6447.
Ouyang, J., Chen, K. T., Gong, E., Pauly, J., & Zaharchuk, G. (2019). Ultra-low-dose PET reconstruction using generative adversarial network with feature matching and task-specific perceptual loss. Medical Physics, 46(8), 3555–3564.
Pan, Y., Liu, M., Lian, C., Xia, Y., & Shen, D. (2020). Spatially-constrained Fisher representation for brain disease identification with incomplete multi-modal neuroimages. IEEE Transactions on Medical Imaging, 39, 2965–2975.
Pan, Y., Liu, M., Xia, Y., & Shen, D. (2022). Disease-image-specific learning for diagnosis-oriented neuroimage synthesis with incomplete multi-modality data. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(10), 6839–6853.
Pan, Y., & Xia, Y. (2021). Ultimate reconstruction: Understand your bones from orthogonal views. In 2021 IEEE 18th international symposium on biomedical imaging (ISBI) (pp. 1155–1158).
Park, S., Kim, S., Kwon, D., Jang, Y., Song, I.-S., & Baek, S.-H. (2023). Estimating 3D dental structures using simulated panoramic radiographs and neural ray tracing. arXiv:2304.04027.
Peng, C., Li, B., Liang, P., Zheng, J., Zhang, Y., Qiu, B., & Chen, D. Z. (2020). A cross-domain metal trace restoring network for reducing X-ray CT metal artifacts. IEEE Transactions on Medical Imaging, 39(12), 3831–3842.
Qiao, Y., Cui, J., Huang, F., Liu, H., Bao, C., & Li, X. (2021). Efficient style-corpus constrained learning for photorealistic style transfer. IEEE Transactions on Image Processing, 30, 3154–3166.
Qin, C., Schlemper, J., Caballero, J., Price, A. N., Hajnal, J. V., & Rueckert, D. (2019). Convolutional recurrent neural networks for dynamic MR image reconstruction. IEEE Transactions on Medical Imaging, 38(1), 280–290.
Quinto, E. T. (2006). An introduction to X-ray tomography and Radon transforms. In Proceedings of Symposia in Applied Mathematics (Vol. 63, p. 1).
van Rappard, J. R. M., Hummel, W. A., de Jong, T., & Mouës, C. M. (2019). A comparison of image quality and radiation exposure between the mini C-arm and the standard C-arm. Hand, 14(6), 765–769.
Reaungamornrat, S., Sari, H., Catana, C., & Kamen, A. (2022). Multimodal image synthesis based on disentanglement representations of anatomical and modality specific features, learned using uncooperative relativistic GAN. Medical Image Analysis, 80, 102514.
Ren, Z., Sidky, E. Y., Barber, R. F., Kao, C.-M., & Pan, X. (2023). Simultaneous activity and attenuation estimation in TOF-PET with TV-constrained nonconvex optimization. arXiv preprint arXiv:2303.17042.
Schousboe, J. T., & Ensrud, K. E. (2021). Opportunistic osteoporosis screening using low-dose computed tomography (LDCT): Promising strategy, but challenges remain. Journal of Bone and Mineral Research, 36(3), 425–426.
Skiles, M. D. (2019). First principles to further our understanding of what is to be done. PhD thesis, UCLA.
Song, W., Liang, Y., Yang, J., Wang, K., & He, L. (2021). Oral-3D: Reconstructing the 3D structure of oral cavity from panoramic X-ray. In AAAI conference on artificial intelligence.
Strobel, N., Meissner, O., Boese, J., Brunner, T., Heigl, B., Hoheisel, M., Lauritsch, G., Nagel, M., Pfister, M., Rührnschopf, E.-P., Scholz, B., Schreiber, B., Spahn, M., Zellerhoff, M., & Klingenbeck-Regn, K. (2009). 3D imaging with flat-detector C-arm systems. In M. F. Reiser, C. R. Becker, K. Nikolaou, & G. Glazer (Eds.), Multislice CT (pp. 33–51). Springer. https://doi.org/10.1007/978-3-540-33125-4_3
Tan, Z., Li, J. Y., Tao, H., Li, S., & Hu, Y. (2022). XctNet: Reconstruction network of volumetric images from a single X-ray image. Computerized Medical Imaging and Graphics, 98, 102067.
Tan, Z., Li, S., Hu, Y., Tao, H., & Zhang, L. (2023). Semi-XctNet: Volumetric images reconstruction network from a single projection image via semi-supervised learning. Computers in Biology and Medicine, 155, 106663.
Wandtke, J., & Hobbs, S. K. (2021). Low-dose chest CT to predict disease-free survival for early-stage node-negative centrally located lung adenocarcinoma. Radiology, 299(2), 448–449.
Wang, C.-L., Zhang, H., Zeng, Z.-Y., Yu, J.-H., & Wang, Y. (2021). Application of image reconstruction based on inverse Radon transform in CT system parameter calibration and imaging. Complexity, 2021, 5360716.
Wang, D., Cui, X., Chen, X., Zou, Z., Shi, T., Salcudean, S., Wang, Z. J., & Ward, R. (2021). Multi-view 3D reconstruction with transformers. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 5722–5731).
Wang, Q., Gao, Q., Wu, L., Sun, G., & Jiao, L. (2021). Adversarial multi-path residual network for image super-resolution. IEEE Transactions on Image Processing, 30, 6648–6658.
Wang, T.-C., Liu, M.-Y., Zhu, J.-Y., Tao, A., Kautz, J., & Catanzaro, B. (2018). High-resolution image synthesis and semantic manipulation with conditional GANs. In 2018 IEEE/CVF conference on computer vision and pattern recognition (pp. 8798–8807).
Wardlaw, J. M., & White, P. M. (2000). The detection and management of unruptured intracranial aneurysms. Brain, 123(2), 205–221.
Wasserthal, J., Meyer, M., Breit, H.-C., Cyriac, J., Yang, S., & Segeroth, M. (2022). TotalSegmentator: Robust segmentation of 104 anatomical structures in CT images. arXiv:2208.05868.
Withers, P. J., Bouman, C., Carmignato, S., Cnudde, V., Grimaldi, D., Hagen, C. K., Maire, E., Manley, M., Plessis, A. D., & Stock, S. R. (2021). X-ray computed tomography. Nature Reviews Methods Primers, 1(1), 1–21.
Xiang, L., Qiao, Y., Nie, D., An, L., Lin, W., Wang, Q., & Shen, D. (2017). Deep auto-context convolutional neural networks for standard-dose PET image estimation from low-dose PET/MRI. Neurocomputing, 267, 406–416.
Xie, Q., Zeng, D., Zhao, Q., Meng, D., Xu, Z., Liang, Z., & Ma, J. (2017). Robust low-dose CT sinogram preprocessing via exploiting noise-generating mechanism. IEEE Transactions on Medical Imaging, 36(12), 2487–2498.
Yang, Q., Yan, P., Zhang, Y., Yu, H., Shi, Y., Mou, X., Kalra, M. K., Zhang, Y., Sun, L., & Wang, G. (2018). Low-dose CT image denoising using a generative adversarial network with Wasserstein distance and perceptual loss. IEEE Transactions on Medical Imaging, 37(6), 1348–1357.
Ying, X., Guo, H., Ma, K., Wu, J., Weng, Z., & Zheng, Y. (2019). X2CT-GAN: Reconstructing CT from biplanar X-rays with generative adversarial networks. In 2019 IEEE/CVF conference on computer vision and pattern recognition (CVPR) (pp. 10611–10620).
Zeng, D., Huang, J., Bian, Z., Niu, S., Zhang, H., Feng, Q., Liang, Z., & Ma, J. (2015). A simple low-dose X-ray CT simulation from high-dose scan. IEEE Transactions on Nuclear Science, 62(5), 2226–2233.
Zhan, F., Zhu, H., & Lu, S. (2019). Spatial fusion GAN for image synthesis. In 2019 IEEE/CVF conference on computer vision and pattern recognition (CVPR) (pp. 3653–3662).
Zhang, C., Dai, J., Wang, T., Liu, X., Chan, Y., Liu, L., He, W., Xie, Y., & Liang, X. (2023). XTransCT: Ultra-fast volumetric CT reconstruction using two orthogonal X-ray projections via a transformer network. arXiv:2305.19621.
Zhang, H., Wang, J., Li, N., Zhang, Y., Cui, J., Huo, L., & Zhang, H. (2023). A quantitative clinical evaluation of simultaneous reconstruction of attenuation and activity in time-of-flight PET. BMC Medical Imaging, 23(1), 35.
Zhang, Y., Fan, Q., Bao, F., Liu, Y., & Zhang, C. (2018). Single-image super-resolution based on rational fractal interpolation. IEEE Transactions on Image Processing, 27(8), 3782–3797.
Zhang, Y., & Yu, H. (2018). Convolutional neural network based metal artifact reduction in X-ray computed tomography. IEEE Transactions on Medical Imaging, 37(6), 1370–1381.
Zhang, Y., Zhang, Y., & Cai, W. (2020). A unified framework for generalizable style transfer: Style and content separation. IEEE Transactions on Image Processing, 29, 4085–4098.
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China (Nos. 6240012686, 62171377, 62131015, U23A20295), in part by the Fundamental Research Funds for the Central Universities (No. D5000230376), in part by the China Postdoctoral Science Foundation (Nos. BX2021333, 2021M703340), in part by the National Key R&D Program of China under Grant 2022YFC2009903/2022YFC2009900, in part by the Ningbo Clinical Research Center for Medical Imaging (No. 2021L003: Open Project 2022LYKFZD06), and in part by the Shenzhen Science and Technology Program (No. JCYJ20220530161616036).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Communicated by Ziyue Xu.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Mobile X-Ray Machines
Operating on the principles of traditional X-ray imaging, such devices emit X-rays (a form of electromagnetic radiation) through the patient’s body (Cretti, 2018). These X-rays are attenuated differently by various tissues, producing a shadow-like image on a detector (receiver) positioned opposite the X-ray source (transmitter). The C-shaped arm allows extensive mobility, including horizontal, vertical, and rotational movements around its swivel axes, so a mobile C-arm machine can capture X-ray images from diverse directions by rotating the arm and the attached detector.
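Quantitatively, this shadow formation follows the standard Beer–Lambert attenuation law (a textbook relation, stated here for completeness rather than taken from the paper): the intensity reaching the detector along a ray is

$$I = I_0 \exp\left(-\int_{\text{path}} \mu(s)\,\mathrm{d}s\right),$$

where $I_0$ is the emitted intensity and $\mu(s)$ is the tissue-dependent linear attenuation coefficient along the ray. Strongly attenuating tissues such as bone transmit little intensity, which is why they stand out in the recorded projection.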
C-arm machines are typically categorized into three types, each varying in size and specialization: mini C-arms, compact C-arms, and full-size C-arms (van Rappard et al., 2019). These machines are highly versatile and tailored to the specific demands of various medical specialties. They are extensively used in operating rooms for orthopedics, trauma surgery, spinal procedures, and other disciplines, significantly enhancing the efficiency and effectiveness of intraoperative imaging. Table 6 lists 10 C-arm X-ray systems along with references describing their specific features and applications.
Patch Composing
During the testing phase, the inputs and outputs of our framework are consecutively cropped patches. To obtain the full field-of-view (FoV) output, the consecutive outputs for each CT scan are composed into a single volume. The spatial continuity and semantic coherence of the final stitched 3D volume are ensured through two key strategies in our methodology, and Fig. 12 visually depicts how spatial and semantic integrity are maintained through this patching and stitching procedure.
First, before neural modeling and generation, each 2D image patch extracted from the input X-ray images is tagged with its original location information. This tagging allows us to accurately place each generated patch back into its corresponding position in the output 3D volume, thus preserving spatial continuity.
Second, to maintain semantic coherence across the stitched volume, we employ an overlap-averaging strategy that blends intensities wherever multiple patches overlap, ensuring smooth transitions and consistent semantic features throughout the volume.
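Taken together, the two strategies amount to a simple accumulate-and-normalize procedure. The following is a minimal NumPy sketch of this patching-and-stitching logic; the function and variable names are illustrative assumptions, not taken from the released code.

```python
import numpy as np

def stitch_patches(patches, origins, volume_shape):
    """Compose consecutively generated patches into a full-FoV volume.

    patches:      list of 3D arrays produced by the network.
    origins:      list of (z, y, x) corners recorded when each input
                  patch was cropped (the location tags described above).
    volume_shape: shape of the final full field-of-view CT volume.
    """
    volume = np.zeros(volume_shape, dtype=np.float32)
    weight = np.zeros(volume_shape, dtype=np.float32)  # overlap counts

    for patch, (z, y, x) in zip(patches, origins):
        dz, dy, dx = patch.shape
        # Strategy 1: the location tag places each patch back exactly
        # where its input was cropped, preserving spatial continuity.
        volume[z:z + dz, y:y + dy, x:x + dx] += patch
        weight[z:z + dz, y:y + dy, x:x + dx] += 1.0

    # Strategy 2: averaging wherever patches overlap blends intensities,
    # yielding smooth transitions across patch borders.
    return volume / np.maximum(weight, 1.0)
```

Uniform averaging is the simplest blending choice; weighted schemes (e.g., cosine windows that down-weight patch borders) are common variants but are not claimed here.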
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Pan, Y., Ye, Y., Zhang, Y. et al. Draw Sketch, Draw Flesh: Whole-Body Computed Tomography from Any X-Ray Views. Int J Comput Vis 133, 2505–2526 (2025). https://doi.org/10.1007/s11263-024-02286-2