V-SlowFast Network for Efficient Visual Sound Separation

V-SlowFast Network for Efficient Visual Sound Separation

18 September 2021

Esa Rahtu

Papers citing "V-SlowFast Network for Efficient Visual Sound Separation"

6 / 6 papers shown

Title
Anchors Aweigh! Sail for Optimal Unified Multi-Modal Representations Minoh Jeong Min Namgung Zae Myung Kim Dongyeop Kang Yao-Yi Chiang Alfred Hero 25 0 0 02 Oct 2024
Audio-Visual Generalized Zero-Shot Learning using Pre-Trained Large Multi-Modal Models David Kurzendörfer Otniel-Bogdan Mercea A. Sophia Koepke Zeynep Akata VLM CLIP 33 2 0 09 Apr 2024
Self-supervised Co-training for Video Representation Learning Tengda Han Weidi Xie Andrew Zisserman SSL 215 309 0 19 Oct 2020
Video Representation Learning by Recognizing Temporal Transformations Simon Jenni Givi Meishvili Paolo Favaro 131 133 0 21 Jul 2020
VisualEchoes: Spatial Image Representation Learning through Echolocation Ruohan Gao Changan Chen Ziad Al-Halah Carl Schissler Kristen Grauman MDE SSL 171 83 0 04 May 2020
Audiovisual SlowFast Networks for Video Recognition Fanyi Xiao Yong Jae Lee Kristen Grauman Jitendra Malik Christoph Feichtenhofer 197 206 0 23 Jan 2020