AUTHOR=Deol Ekamjit S. , Tollefson Matthew K. , Antolin Alenka , Zohar Maya , Bar Omri , Ben-Ayoun Danielle , Mynderse Lance A. , Lomas Derek J. , Avant Ross A. , Miller Adam R. , Elliott Daniel S. , Boorjian Stephen A. , Wolf Tamir , Asselmann Dotan , Khanna Abhinav TITLE=Automated surgical step recognition in transurethral bladder tumor resection using artificial intelligence: transfer learning across surgical modalities JOURNAL=Frontiers in Artificial Intelligence VOLUME=7 YEAR=2024 URL=https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2024.1375482 DOI=10.3389/frai.2024.1375482 ISSN=2624-8212 ABSTRACT=Objective

Automated surgical step recognition (SSR) using AI has been a catalyst in the “digitization” of surgery. However, progress has been largely limited to laparoscopy, with relatively few SSR tools in endoscopic surgery. This study aimed to create an SSR model for transurethral resection of bladder tumors (TURBT), leveraging a novel application of transfer learning to reduce video dataset requirements.

Materials and methods

Retrospective surgical videos of TURBT were manually annotated with the following steps of surgery: primary endoscopic evaluation, resection of bladder tumor, and surface coagulation. The manually annotated videos were then used to train a novel AI computer vision algorithm to perform automated annotation of TURBT surgical video, using a transfer-learning technique in which the model was pre-trained on laparoscopic procedures. Accuracy of AI SSR was determined by comparison to human annotations as the reference standard.
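
The comparison against the reference standard can be illustrated with a minimal sketch. It assumes that both the human annotation and the model output are available as equal-length sequences of step labels (for example, one label per sampled frame), and that per-step accuracy means the fraction of reference labels for that step that the model reproduces. The abstract does not specify the unit of comparison or the exact metric definition, so the function name `step_accuracy`, the short label strings, and the toy data are all hypothetical.

```python
from collections import Counter


def step_accuracy(reference, predicted):
    """Return (overall, per_step) agreement between two label sequences.

    reference: human annotations, treated here as the reference standard
    predicted: model output, one label per reference label
    Assumption: accuracy is computed label by label, not per video.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")

    total = Counter()    # number of reference labels per step
    correct = Counter()  # number of matching predictions per step
    for ref, pred in zip(reference, predicted):
        total[ref] += 1
        if ref == pred:
            correct[ref] += 1

    overall = sum(correct.values()) / len(reference) if reference else 0.0
    per_step = {step: correct[step] / total[step] for step in total}
    return overall, per_step


# Toy example with hypothetical labels for the three annotated steps.
reference = ["evaluation"] * 4 + ["resection"] * 4 + ["coagulation"] * 2
predicted = ["evaluation"] * 4 + ["resection"] * 3 + ["coagulation"] * 3

overall, per_step = step_accuracy(reference, predicted)
print(overall)   # 9 of 10 labels agree
print(per_step)
```

In this toy run, one resection label is mistaken for coagulation, so only the resection step falls below perfect agreement.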

Results

A total of 300 full-length TURBT videos (median 23.96 min; IQR 14.13–41.31 min) were manually annotated with sequential steps of surgery. One hundred and seventy-nine videos served as a training dataset for algorithm development, 44 for internal validation, and 77 as a separate test cohort for evaluating algorithm accuracy. Overall accuracy of AI video analysis was 89.6%. Model accuracy was highest for the primary endoscopic evaluation step (98.2%) and lowest for the surface coagulation step (82.7%).
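
A partition with these cohort sizes could be produced as in the following sketch. The abstract does not state how videos were allocated to each cohort, so the seeded random shuffle, the function name `split_videos`, and the placeholder video identifiers are assumptions made purely for illustration.

```python
import random


def split_videos(video_ids, n_train=179, n_val=44, n_test=77, seed=0):
    """Partition video identifiers into training, validation, and test cohorts.

    The default sizes are the ones reported in the abstract; the random
    allocation itself is an assumption, not the authors' stated method.
    """
    video_ids = list(video_ids)
    if n_train + n_val + n_test != len(video_ids):
        raise ValueError("cohort sizes must sum to the number of videos")

    # A dedicated Random instance keeps the split reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(video_ids)

    train = video_ids[:n_train]
    val = video_ids[n_train:n_train + n_val]
    test = video_ids[n_train + n_val:]
    return train, val, test


# Hypothetical identifiers for the 300 annotated videos.
all_videos = [f"turbt_{i:03d}" for i in range(300)]
train, val, test = split_videos(all_videos)
print(len(train), len(val), len(test))
```

Splitting at the video level, rather than at the frame level, keeps all frames from one procedure in a single cohort, which avoids leaking near-identical frames between training and test data.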

Conclusion

We developed a fully automated computer vision algorithm for high-accuracy annotation of TURBT surgical videos. This represents the first application of transfer learning from laparoscopy-based computer vision models to surgical endoscopy, demonstrating the promise of this approach in adapting to new procedure types.