2405.10075
Cited By
HecVL: Hierarchical Video-Language Pretraining for Zero-shot Surgical Phase Recognition
16 May 2024
Kun Yuan
V. Srivastav
Nassir Navab
N. Padoy
Papers citing
"HecVL: Hierarchical Video-Language Pretraining for Zero-shot Surgical Phase Recognition"
29 / 29 papers shown
Title
Ophora: A Large-Scale Data-Driven Text-Guided Ophthalmic Surgical Video Generation Model
Wei Li
Ming Hu
Guoan Wang
Lihao Liu
Kaijin Zhou
Junzhi Ning
Xin Guo
Zongyuan Ge
Lixu Gu
Junjun He
57
1
0
12 May 2025
Multimodal Graph Representation Learning for Robust Surgical Workflow Recognition with Adversarial Feature Disentanglement
Long Bai
Boyi Ma
Ruohan Wang
Guankun Wang
Beilei Cui
...
Mobarakol Islam
Zhe Min
Jiewen Lai
Nassir Navab
Hongliang Ren
79
0
0
03 May 2025
fine-CLIP: Enhancing Zero-Shot Fine-Grained Surgical Action Recognition with Vision-Language Models
Saurav Sharma
Didier Mutter
N. Padoy
VLM
MedIm
60
0
0
25 Mar 2025
Surg-3M: A Dataset and Foundation Model for Perception in Surgical Settings
Chengan Che
Chao Wang
Tom Vercauteren
Sophia Tsoka
Luis C. Garcia-Peraza-Herrera
MedIm
68
1
0
25 Mar 2025
Recognize Any Surgical Object: Unleashing the Power of Weakly-Supervised Data
Jiajie Li
Brian R Quaranto
Chenhui Xu
Ishan Mishra
Ruiyang Qin
Dancheng Liu
Peter C W Kim
Jinjun Xiong
125
0
0
25 Jan 2025
OphCLIP: Hierarchical Retrieval-Augmented Learning for Ophthalmic Surgical Video-Language Pretraining
Ming Hu
Kun Yuan
Yaling Shen
Feilong Tang
Xiaohao Xu
...
Jin Ye
N. Padoy
Nassir Navab
Junjun He
Zongyuan Ge
VLM
CLIP
134
12
0
23 Nov 2024
Procedure-Aware Surgical Video-language Pretraining with Hierarchical Knowledge Augmentation
Kun Yuan
V. Srivastav
Nassir Navab
N. Padoy
97
9
0
30 Sep 2024
VidLPRO: A Video-Language Pre-training Framework for Robotic and Laparoscopic Surgery
Mohammadmahdi Honarmand
Muhammad Abdullah Jamal
Omid Mohareri
106
2
0
07 Sep 2024
OphNet: A Large-Scale Video Benchmark for Ophthalmic Surgical Workflow Understanding
Ming Hu
Peng Xia
Lin Wang
Siyuan Yan
Feilong Tang
...
Xuelian Cheng
Jun Cheng
Chi Liu
Kaijing Zhou
Zongyuan Ge
58
20
0
11 Jun 2024
Challenges in Multi-centric Generalization: Phase and Step Recognition in Roux-en-Y Gastric Bypass Surgery
Joël L. Lavanchy
Sanat Ramesh
Diego Dall'Alba
Cristians Gonzalez
Paolo Fiorini
Beat Muller-Stich
Philipp C. Nett
J. Marescaux
Didier Mutter
N. Padoy
48
17
0
18 Dec 2023
Learning Multi-modal Representations by Watching Hundreds of Surgical Video Lectures
Kun Yuan
V. Srivastav
Tong Yu
Joël L. Lavanchy
Pietro Mascagni
N. Padoy
86
23
0
27 Jul 2023
Foundation Model for Endoscopy Video Analysis via Large-scale Self-supervised Pre-train
Zhao Wang
Chang Liu
Shaoting Zhang
Qi Dou
MedIm
69
65
0
29 Jun 2023
Segment Everything Everywhere All at Once
Xueyan Zou
Jianwei Yang
Hao Zhang
Feng Li
Linjie Li
Jianfeng Wang
Lijuan Wang
Jianfeng Gao
Yong Jae Lee
MLLM
VLM
69
479
0
13 Apr 2023
MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge
Wei Lin
Leonid Karlinsky
Nina Shvetsova
Horst Possegger
Mateusz Koziński
Yikang Shen
Rogerio Feris
Hilde Kuehne
Horst Bischof
VLM
119
39
0
15 Mar 2023
Generalized Decoding for Pixel, Image, and Language
Xueyan Zou
Zi-Yi Dou
Jianwei Yang
Zhe Gan
Linjie Li
...
Lu Yuan
Nanyun Peng
Lijuan Wang
Yong Jae Lee
Jianfeng Gao
VLM
MLLM
ObjD
71
253
0
21 Dec 2022
AutoLaparo: A New Dataset of Integrated Multi-tasks for Image-guided Surgical Automation in Laparoscopic Hysterectomy
Ziyi Wang
Bo Lu
Yonghao Long
Fangxun Zhong
T. Cheung
Qi Dou
Yunhui Liu
53
61
0
03 Aug 2022
A Unified Sequence Interface for Vision Tasks
Ting-Li Chen
Saurabh Saxena
Lala Li
Nayeon Lee
David J. Fleet
Geoffrey E. Hinton
VLM
MLLM
56
150
0
15 Jun 2022
Rendezvous: Attention Mechanisms for the Recognition of Surgical Action Triplets in Endoscopic Videos
C. Nwoye
Tong Yu
Cristians Gonzalez
B. Seeliger
Pietro Mascagni
Didier Mutter
J. Marescaux
N. Padoy
60
139
0
07 Sep 2021
Multi-frame Collaboration for Effective Endoscopic Video Polyp Detection via Spatial-Temporal Feature Transformation
Lingyun Wu
Zhiqiang Hu
Yuanfeng Ji
Ping Luo
Shaoting Zhang
44
23
0
08 Jul 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
810
29,167
0
26 Feb 2021
TeCNO: Surgical Phase Recognition with Multi-Stage Temporal Convolutional Networks
Tobias Czempiel
Magdalini Paschali
Matthias Keicher
Walter Simson
H. Feußner
S. T. Kim
Nassir Navab
59
185
0
24 Mar 2020
End-to-End Learning of Visual Representations from Uncurated Instructional Videos
Antoine Miech
Jean-Baptiste Alayrac
Lucas Smaira
Ivan Laptev
Josef Sivic
Andrew Zisserman
VGen
SSL
114
712
0
13 Dec 2019
ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission
Kexin Huang
Jaan Altosaar
Rajesh Ranganath
OOD
91
899
0
10 Apr 2019
Representation Learning with Contrastive Predictive Coding
Aaron van den Oord
Yazhe Li
Oriol Vinyals
DRL
SSL
282
10,253
0
10 Jul 2018
Mask R-CNN
Kaiming He
Georgia Gkioxari
Piotr Dollár
Ross B. Girshick
ObjD
344
27,129
0
20 Mar 2017
DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs
Liang-Chieh Chen
George Papandreou
Iasonas Kokkinos
Kevin Patrick Murphy
Alan Yuille
SSeg
217
18,195
0
02 Jun 2016
EndoNet: A Deep Architecture for Recognition Tasks on Laparoscopic Videos
A. P. Twinanda
S. Shehata
Didier Mutter
J. Marescaux
M. de Mathelin
N. Padoy
216
862
0
09 Feb 2016
Deep Residual Learning for Image Recognition
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
MedIm
1.9K
193,426
0
10 Dec 2015
An Empirical Investigation of Catastrophic Forgetting in Gradient-Based Neural Networks
Ian Goodfellow
M. Berk Mirza
Da Xiao
Aaron Courville
Yoshua Bengio
137
1,439
0
21 Dec 2013