Abstract
Purpose
Automatic segmentation of surgical instruments is a fundamental task in robot-assisted minimally invasive surgery, as it greatly improves surgeons' context awareness during the operation. This paper proposes a novel method based on Mask R-CNN to achieve accurate instance segmentation of surgical instruments.
Methods
A novel feature-extraction backbone is built that extracts local features through a convolutional neural network branch and global representations through a Swin-Transformer branch. Moreover, skip fusions are applied in the backbone to merge the two kinds of features and improve the generalization ability of the network.
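To make the two-branch design concrete, the following is a minimal PyTorch sketch of a parallel feature extractor with stage-wise fusion. The attention branch is a simplified stand-in for a Swin-Transformer stage, and the channel widths, depths, and additive fusion are illustrative assumptions rather than the configuration reported in this paper.

```python
# Minimal PyTorch sketch of a two-branch backbone with skip fusion.
# The attention branch is a toy stand-in for a Swin-Transformer stage;
# widths, depths, and fusion points are illustrative assumptions only.
import torch
import torch.nn as nn


class ConvStage(nn.Module):
    """Local-feature branch: a plain conv block that halves the spatial size."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)


class AttnStage(nn.Module):
    """Global-representation branch: plain self-attention standing in for a
    Swin stage (patch merging approximated by a strided convolution)."""
    def __init__(self, in_ch, out_ch, heads=4):
        super().__init__()
        self.down = nn.Conv2d(in_ch, out_ch, 2, stride=2)
        self.norm = nn.LayerNorm(out_ch)
        self.attn = nn.MultiheadAttention(out_ch, heads, batch_first=True)

    def forward(self, x):
        x = self.down(x)                       # B, C, H, W
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)  # B, H*W, C
        tokens = self.norm(tokens)
        tokens, _ = self.attn(tokens, tokens, tokens)
        return tokens.transpose(1, 2).reshape(b, c, h, w)


class ParallelBackbone(nn.Module):
    """Runs both branches in parallel and fuses them by element-wise addition,
    one possible form of the skip fusion described in the abstract."""
    def __init__(self, channels=(3, 64, 128, 256)):
        super().__init__()
        pairs = list(zip(channels[:-1], channels[1:]))
        self.conv_stages = nn.ModuleList([ConvStage(i, o) for i, o in pairs])
        self.attn_stages = nn.ModuleList([AttnStage(i, o) for i, o in pairs])

    def forward(self, x):
        feats = []
        c, t = x, x
        for conv, attn in zip(self.conv_stages, self.attn_stages):
            c, t = conv(c), attn(t)
            fused = c + t          # skip fusion: merge local and global features
            c, t = fused, fused    # feed the fused maps to the next stage
            feats.append(fused)
        return feats               # multi-scale features for an FPN / Mask R-CNN head


if __name__ == "__main__":
    fmaps = ParallelBackbone()(torch.randn(1, 3, 64, 64))
    print([f.shape for f in fmaps])
```

In this sketch the fused map feeds both branches of the next stage; other fusion choices (concatenation, one-way skips, fusion only at selected stages) are equally plausible readings of the abstract.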
Results
The proposed method is evaluated on the MICCAI 2017 EndoVis Challenge dataset across three segmentation tasks and achieves state-of-the-art performance, with an mIoU of 0.5873 in type segmentation and 0.7408 in part segmentation. Furthermore, ablation studies show that the proposed backbone contributes at least a 17% improvement in mIoU.
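For reference, the reported metric is the mean intersection over union (mIoU). The snippet below is a minimal sketch of a per-class IoU average over integer label masks; the EndoVis 2017 challenge's exact averaging protocol over frames, classes, and sequences may differ.

```python
# Minimal sketch of per-class IoU and mean IoU for integer label masks.
# This simple per-class mean is an assumption; the challenge's official
# averaging over frames and sequences may differ.
import numpy as np


def mean_iou(pred, gt, num_classes, ignore_empty=True):
    ious = []
    for c in range(1, num_classes + 1):       # class 0 assumed to be background
        p, g = (pred == c), (gt == c)
        union = np.logical_or(p, g).sum()
        if union == 0:
            if not ignore_empty:
                ious.append(1.0)              # class absent in both masks
            continue
        ious.append(np.logical_and(p, g).sum() / union)
    return float(np.mean(ious)) if ious else 0.0


if __name__ == "__main__":
    gt = np.zeros((4, 4), dtype=int)
    gt[:2, :2] = 1                            # 4 ground-truth pixels of class 1
    pred = np.zeros((4, 4), dtype=int)
    pred[:2, :3] = 1                          # 6 predicted pixels of class 1
    print(mean_iou(pred, gt, num_classes=1))  # intersection 4 / union 6 ≈ 0.667
```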
Conclusion
The promising results demonstrate that our method effectively extracts both global representations and local features when segmenting surgical instruments, improving segmentation accuracy. With the proposed backbone, the network segments the contours of instrument end tips more precisely. The method can therefore provide more accurate data for the localization and pose estimation of surgical instruments, contributing further to the automation of robot-assisted minimally invasive surgery.
Availability of data and material
The public dataset used in this study is available from the MICCAI 2017 EndoVis Challenge (https://endovissub2017-roboticinstrumentsegmentation.grand-challenge.org/).
Code availability
Code will be publicly available with the publication of this work.
Acknowledgements
We thank Tao Liang and Mengjie Chen for assistance with the experiments, Ziqi Liu for valuable discussion, and Yifei Li for help in polishing the manuscript.
Funding
This study was funded by the National Natural Science Foundation of China (Grant No. 52175028).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants performed by any of the authors.
Informed consent
Informed consent was obtained from all individual participants included in the study.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Sun, X., Zou, Y., Wang, S. et al. A parallel network utilizing local features and global representations for segmentation of surgical instruments. Int J CARS 17, 1903–1913 (2022). https://doi.org/10.1007/s11548-022-02687-z
DOI: https://doi.org/10.1007/s11548-022-02687-z