1910.11760
Cited By
Self-supervised Moving Vehicle Tracking with Stereo Sound
25 October 2019
Chuang Gan
Hang Zhao
Peihao Chen
David D. Cox
Antonio Torralba
Papers citing "Self-supervised Moving Vehicle Tracking with Stereo Sound"
24 / 24 papers shown
Title
AV-PedAware: Self-Supervised Audio-Visual Fusion for Dynamic Pedestrian Awareness
Yizhuo Yang
Shenghai Yuan
Muqing Cao
Jianfei Yang
Lihua Xie
164
7
0
11 Nov 2024
Audio-Visual Speaker Tracking: Progress, Challenges, and Future Directions
Jinzheng Zhao
Yong-mei Xu
Xinyuan Qian
Davide Berghi
Peipei Wu
Meng Cui
Jianyuan Sun
Philip J. B. Jackson
Wenwu Wang
BDL
77
7
0
23 Oct 2023
Telling Left from Right: Learning Spatial Correspondence of Sight and Sound
Karren D. Yang
Bryan C. Russell
Justin Salamon
SSL
56
76
0
11 Jun 2020
Self-Supervised Audio-Visual Co-Segmentation
Andrew Rouditchenko
Hang Zhao
Chuang Gan
Josh H. McDermott
Antonio Torralba
VLM
SSL
40
104
0
18 Apr 2019
The Sound of Motions
Hang Zhao
Chuang Gan
Wei-Chiu Ma
Antonio Torralba
53
252
0
11 Apr 2019
Variational Bayesian Inference for Audio-Visual Tracking of Multiple Speakers
Yutong Ban
Xavier Alameda-Pineda
Laurent Girin
Radu Horaud
38
50
0
28 Sep 2018
Online Localization and Tracking of Multiple Moving Speakers in Reverberant Environments
Xiaofei Li
Yutong Ban
Laurent Girin
Xavier Alameda-Pineda
Radu Horaud
20
44
0
28 Sep 2018
Self-Supervised Generation of Spatial Audio for 360 Video
Pedro Morgado
Nuno Vasconcelos
Timothy R. Langlois
Oliver Wang
MDE
44
171
0
07 Sep 2018
Emotion Recognition in Speech using Cross-Modal Transfer in the Wild
Samuel Albanie
Arsha Nagrani
Andrea Vedaldi
Andrew Zisserman
CVBM
53
271
0
16 Aug 2018
Audio-Visual Scene Analysis with Self-Supervised Multisensory Features
Andrew Owens
Alexei A. Efros
SSL
74
747
0
10 Apr 2018
The Sound of Pixels
Hang Zhao
Chuang Gan
Andrew Rouditchenko
Carl Vondrick
Josh H. McDermott
Antonio Torralba
VLM
63
535
0
09 Apr 2018
Learning to Separate Object Sounds by Watching Unlabeled Video
Ruohan Gao
Rogerio Feris
Kristen Grauman
SSL
47
284
0
05 Apr 2018
Learning to Localize Sound Source in Visual Scenes
Arda Senocak
Tae-Hyun Oh
Junsik Kim
Ming-Hsuan Yang
In So Kweon
SSL
57
344
0
10 Mar 2018
Objects that Sound
Relja Arandjelović
Andrew Zisserman
ObjD
VOS
72
529
0
18 Dec 2017
See, Hear, and Read: Deep Aligned Representations
Y. Aytar
Carl Vondrick
Antonio Torralba
VLM
AI4TS
80
136
0
03 Jun 2017
Look, Listen and Learn
Relja Arandjelović
Andrew Zisserman
SSL
82
900
0
23 May 2017
YOLO9000: Better, Faster, Stronger
Joseph Redmon
Ali Farhadi
VLM
ObjD
158
15,573
0
25 Dec 2016
SoundNet: Learning Sound Representations from Unlabeled Video
Y. Aytar
Carl Vondrick
Antonio Torralba
SSL
87
1,040
0
27 Oct 2016
Learning Aligned Cross-Modal Representations from Weakly Aligned Data
Lluis Castrejon
Y. Aytar
Carl Vondrick
Hamed Pirsiavash
Antonio Torralba
SSL
DRL
AI4TS
51
167
0
25 Jul 2016
Cross Modal Distillation for Supervision Transfer
Saurabh Gupta
Judy Hoffman
Jitendra Malik
94
535
0
02 Jul 2015
You Only Look Once: Unified, Real-Time Object Detection
Joseph Redmon
S. Divvala
Ross B. Girshick
Ali Farhadi
ObjD
568
36,643
0
08 Jun 2015
Distilling the Knowledge in a Neural Network
Geoffrey E. Hinton
Oriol Vinyals
J. Dean
FedML
238
19,523
0
09 Mar 2015
Do Deep Nets Really Need to be Deep?
Lei Jimmy Ba
R. Caruana
151
2,114
0
21 Dec 2013
Zero-Shot Learning Through Cross-Modal Transfer
R. Socher
M. Ganjoo
Hamsa Sridhar
Osbert Bastani
Christopher D. Manning
A. Ng
BDL
VLM
106
1,467
0
16 Jan 2013