Purpose
Real-time surgical phase anticipation is crucial for improving operating room efficiency and patient safety. While existing recognition approaches excel at identifying current surgical phases, they provide limited foresight into future procedural steps, restricting their intraoperative utility. Similarly, current anticipation methods are constrained to predicting short-term events or singular occurrences, neglecting the dynamic and sequential nature of surgical workflows. To address these limitations, we propose SWAG (Surgical Workflow Anticipative Generator), a unified framework for phase recognition and long-term anticipation of surgical workflows.
Methods
SWAG employs two generative decoding methods, single-pass (SP) and auto-regressive (AR), to predict sequences of future surgical phases. A novel prior-knowledge embedding mechanism improves the accuracy of anticipatory predictions by integrating statistical priors into token initialisation. The framework addresses two tasks: future phase classification and remaining-time regression. Additionally, a regression-to-classification (R2C) method is introduced to map continuous predictions to discrete temporal segments. SWAG was evaluated on the Cholec80 and AutoLaparo21 datasets.
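The idea behind R2C, mapping continuous remaining-time predictions to discrete temporal segments, can be illustrated with a minimal sketch. This is not the SWAG implementation: the function name `regression_to_classification`, the fixed step size, and the rule that phases occur in index order (so the highest-indexed phase predicted to have started is the active one) are all assumptions made for illustration.

```python
from typing import List


def regression_to_classification(
    remaining_min: List[float],
    horizon_min: float,
    step_min: float,
) -> List[int]:
    """Convert per-phase remaining-time predictions into one phase label
    per future time step (hypothetical helper, not the authors' code).

    remaining_min[p] is the predicted time in minutes until phase p starts;
    0.0 means the phase has already started.
    """
    n_steps = int(round(horizon_min / step_min))
    labels = []
    for k in range(1, n_steps + 1):
        t = k * step_min
        # Phases whose predicted start time falls at or before future time t.
        started = [p for p, r in enumerate(remaining_min) if r <= t]
        # Assumption: phases follow index order, so the latest started phase
        # is the active one; -1 marks "no phase predicted to have started".
        labels.append(max(started) if started else -1)
    return labels


# Phase 0 is ongoing, phase 1 is predicted in 2.5 min, phase 2 in 6 min.
future_phases = regression_to_classification([0.0, 2.5, 6.0], 5.0, 1.0)
print(future_phases)
```

Under these assumptions, each one-minute segment of the 5-minute horizon receives the label of the phase expected to be active at the end of that segment, which turns a regression output into a sequence that can be scored with classification metrics.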
Results
The single-pass classification model with prior knowledge embeddings (SWAG-SP*) achieved 53.5% accuracy in 15-minute anticipation on AutoLaparo21, while the R2C model reached 60.8% accuracy on Cholec80. SWAG's single-pass regression approach outperformed existing methods for remaining time prediction, achieving weighted mean absolute errors of 0.32 and 0.48 minutes for 2- and 3-minute horizons, respectively.
Conclusion
SWAG demonstrates versatility across classification and regression tasks, offering robust tools for real-time surgical workflow anticipation. By unifying recognition and anticipatory capabilities, SWAG provides actionable predictions to enhance intraoperative decision-making. The project webpage is available at https://maxboels.github.io/swag.