Self-supervised Moving Vehicle Tracking with Stereo Sound

25 October 2019

Chuang Gan

Hang Zhao

Peihao Chen

David D. Cox

Antonio Torralba


Papers citing "Self-supervised Moving Vehicle Tracking with Stereo Sound"


Title | Authors | Tags | Counts | Date
AV-PedAware: Self-Supervised Audio-Visual Fusion for Dynamic Pedestrian Awareness | Yizhuo Yang, Shenghai Yuan, Muqing Cao, Jianfei Yang, Lihua Xie | - | 164 7 0 | 11 Nov 2024
Audio-Visual Speaker Tracking: Progress, Challenges, and Future Directions | Jinzheng Zhao, Yong-mei Xu, Xinyuan Qian, Davide Berghi, Peipei Wu, Meng Cui, Jianyuan Sun, Philip J. B. Jackson, Wenwu Wang | BDL | 77 7 0 | 23 Oct 2023
Telling Left from Right: Learning Spatial Correspondence of Sight and Sound | Karren D. Yang, Bryan C. Russell, Justin Salamon | SSL | 56 76 0 | 11 Jun 2020
Self-Supervised Audio-Visual Co-Segmentation | Andrew Rouditchenko, Hang Zhao, Chuang Gan, Josh H. McDermott, Antonio Torralba | VLM, SSL | 40 104 0 | 18 Apr 2019
The Sound of Motions | Hang Zhao, Chuang Gan, Wei-Chiu Ma, Antonio Torralba | - | 53 252 0 | 11 Apr 2019
Variational Bayesian Inference for Audio-Visual Tracking of Multiple Speakers | Yutong Ban, Xavier Alameda-Pineda, Laurent Girin, Radu Horaud | - | 38 50 0 | 28 Sep 2018
Online Localization and Tracking of Multiple Moving Speakers in Reverberant Environments | Xiaofei Li, Yutong Ban, Laurent Girin, Xavier Alameda-Pineda, Radu Horaud | - | 20 44 0 | 28 Sep 2018
Self-Supervised Generation of Spatial Audio for 360 Video | Pedro Morgado, Nuno Vasconcelos, Timothy R. Langlois, Oliver Wang | MDE | 44 171 0 | 07 Sep 2018
Emotion Recognition in Speech using Cross-Modal Transfer in the Wild | Samuel Albanie, Arsha Nagrani, Andrea Vedaldi, Andrew Zisserman | CVBM | 53 271 0 | 16 Aug 2018
Audio-Visual Scene Analysis with Self-Supervised Multisensory Features | Andrew Owens, Alexei A. Efros | SSL | 74 747 0 | 10 Apr 2018
The Sound of Pixels | Hang Zhao, Chuang Gan, Andrew Rouditchenko, Carl Vondrick, Josh H. McDermott, Antonio Torralba | VLM | 63 535 0 | 09 Apr 2018
Learning to Separate Object Sounds by Watching Unlabeled Video | Ruohan Gao, Rogerio Feris, Kristen Grauman | SSL | 47 284 0 | 05 Apr 2018
Learning to Localize Sound Source in Visual Scenes | Arda Senocak, Tae-Hyun Oh, Junsik Kim, Ming-Hsuan Yang, In So Kweon | SSL | 57 344 0 | 10 Mar 2018
Objects that Sound | Relja Arandjelović, Andrew Zisserman | ObjD, VOS | 72 529 0 | 18 Dec 2017
See, Hear, and Read: Deep Aligned Representations | Y. Aytar, Carl Vondrick, Antonio Torralba | VLM, AI4TS | 80 136 0 | 03 Jun 2017
Look, Listen and Learn | Relja Arandjelović, Andrew Zisserman | SSL | 82 900 0 | 23 May 2017
YOLO9000: Better, Faster, Stronger | Joseph Redmon, Ali Farhadi | VLM, ObjD | 158 15,573 0 | 25 Dec 2016
SoundNet: Learning Sound Representations from Unlabeled Video | Y. Aytar, Carl Vondrick, Antonio Torralba | SSL | 87 1,040 0 | 27 Oct 2016
Learning Aligned Cross-Modal Representations from Weakly Aligned Data | Lluis Castrejon, Y. Aytar, Carl Vondrick, Hamed Pirsiavash, Antonio Torralba | SSL, DRL, AI4TS | 51 167 0 | 25 Jul 2016
Cross Modal Distillation for Supervision Transfer | Saurabh Gupta, Judy Hoffman, Jitendra Malik | - | 94 535 0 | 02 Jul 2015
You Only Look Once: Unified, Real-Time Object Detection | Joseph Redmon, S. Divvala, Ross B. Girshick, Ali Farhadi | ObjD | 568 36,643 0 | 08 Jun 2015
Distilling the Knowledge in a Neural Network | Geoffrey E. Hinton, Oriol Vinyals, J. Dean | FedML | 238 19,523 0 | 09 Mar 2015
Do Deep Nets Really Need to be Deep? | Lei Jimmy Ba, R. Caruana | - | 151 2,114 0 | 21 Dec 2013
Zero-Shot Learning Through Cross-Modal Transfer | R. Socher, M. Ganjoo, Hamsa Sridhar, Osbert Bastani, Christopher D. Manning, A. Ng | BDL, VLM | 106 1,467 0 | 16 Jan 2013