1705.08168
Look, Listen and Learn
23 May 2017
Relja Arandjelović
Andrew Zisserman
SSL
Papers citing "Look, Listen and Learn"
38 / 238 papers shown
Self-labelling via simultaneous clustering and representation learning
Yuki M. Asano
Christian Rupprecht
Andrea Vedaldi
SSL
42
761
0
13 Nov 2019
Vision-Infused Deep Audio Inpainting
Hang Zhou
Ziwei Liu
Lingfeng Guo
Ping Luo
Dahua Lin
35
88
0
24 Oct 2019
Coordinated Joint Multimodal Embeddings for Generalized Audio-Visual Zeroshot Classification and Retrieval of Videos
Kranti K. Parida
Neeraj Matiyali
T. Guha
Gaurav Sharma
VLM
32
41
0
19 Oct 2019
Deep Latent Space Learning for Cross-modal Mapping of Audio and Visual Signals
Shah Nawaz
Muhammad Kamran Janjua
I. Gallo
Arif Mahmood
Alessandro Calefati
14
32
0
18 Sep 2019
Recursive Visual Sound Separation Using Minus-Plus Net
Xudong Xu
Bo Dai
Dahua Lin
35
91
0
30 Aug 2019
EPIC-Fusion: Audio-Visual Temporal Binding for Egocentric Action Recognition
Evangelos Kazakos
Arsha Nagrani
Andrew Zisserman
Dima Damen
EgoV
16
332
0
22 Aug 2019
ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks
Jiasen Lu
Dhruv Batra
Devi Parikh
Stefan Lee
SSL
VLM
105
3,630
0
06 Aug 2019
Use What You Have: Video Retrieval Using Representations From Collaborative Experts
Yang Liu
Samuel Albanie
Arsha Nagrani
Andrew Zisserman
36
387
0
31 Jul 2019
Learning Soft-Attention Models for Tempo-invariant Audio-Sheet Music Retrieval
S. Balke
Matthias Dorfer
Luis Carvalho
A. Arzt
Gerhard Widmer
19
11
0
26 Jun 2019
Evolving Losses for Unlabeled Video Representation Learning
A. Piergiovanni
A. Angelova
Michael S. Ryoo
SSL
11
7
0
07 Jun 2019
Learning Representations by Maximizing Mutual Information Across Views
Philip Bachman
R. Devon Hjelm
William Buchwalter
SSL
72
1,457
0
03 Jun 2019
How Much Does Audio Matter to Recognize Egocentric Object Interactions?
Alejandro Cartas
Jordi Luque
Petia Radeva
Carlos Segura
Mariella Dimiccoli
EgoV
17
6
0
03 Jun 2019
What Makes Training Multi-Modal Classification Networks Hard?
Weiyao Wang
Du Tran
Matt Feiszli
28
442
0
29 May 2019
Data-Efficient Image Recognition with Contrastive Predictive Coding
Olivier J. Hénaff
A. Srinivas
J. Fauw
Ali Razavi
Carl Doersch
S. M. Ali Eslami
Aaron van den Oord
SSL
58
1,417
0
22 May 2019
Machine learning in acoustics: theory and applications
Michael J. Bianco
Peter Gerstoft
James Traer
Emma Ozanich
M. Roch
Sharon Gannot
Charles-Alban Deledalle
AI4CE
28
376
0
11 May 2019
Scaling and Benchmarking Self-Supervised Visual Representation Learning
Priya Goyal
D. Mahajan
Abhinav Gupta
Ishan Misra
SSL
24
396
0
03 May 2019
Audio-Visual Model Distillation Using Acoustic Images
Andrés F. Pérez
Valentina Sanguineti
Pietro Morerio
Vittorio Murino
VLM
15
27
0
16 Apr 2019
The Sound of Motions
Hang Zhao
Chuang Gan
Wei-Chiu Ma
Antonio Torralba
17
251
0
11 Apr 2019
2.5D Visual Sound
Ruohan Gao
Kristen Grauman
VGen
27
130
0
11 Dec 2018
Decoding Brain Representations by Multimodal Learning of Neural Activity and Visual Features
S. Palazzo
C. Spampinato
I. Kavasidis
D. Giordano
Joseph Schmidt
M. Shah
127
111
0
25 Oct 2018
Scattering Networks for Hybrid Representation Learning
Edouard Oyallon
Sergey Zagoruyko
Gabriel Huang
N. Komodakis
Simon Lacoste-Julien
Matthew Blaschko
Eugene Belilovsky
21
84
0
17 Sep 2018
Emotion Recognition in Speech using Cross-Modal Transfer in the Wild
Samuel Albanie
Arsha Nagrani
Andrea Vedaldi
Andrew Zisserman
CVBM
30
270
0
16 Aug 2018
Talking Face Generation by Adversarially Disentangled Audio-Visual Representation
Hang Zhou
Yu Liu
Ziwei Liu
Ping Luo
Xiaogang Wang
CVBM
31
436
0
20 Jul 2018
Spatio-Temporal Channel Correlation Networks for Action Classification
Ali Diba
Mohsen Fayyaz
Vivek Sharma
M. M. Arzani
Rahman Yousefzadeh
Juergen Gall
Luc Van Gool
3DPC
26
181
0
19 Jun 2018
Playing hard exploration games by watching YouTube
Y. Aytar
Tobias Pfaff
David Budden
T. Paine
Ziyun Wang
Nando de Freitas
35
269
0
29 May 2018
Weakly-supervised Visual Instrument-playing Action Detection in Videos
Jen-Yu Liu
Yi-Hsuan Yang
Shyh-Kang Jeng
21
13
0
05 May 2018
Learnable PINs: Cross-Modal Embeddings for Person Identity
Arsha Nagrani
Samuel Albanie
Andrew Zisserman
SSL
41
140
0
02 May 2018
Randomly weighted CNNs for (music) audio classification
Jordi Pons
Xavier Serra
19
85
0
01 May 2018
Adaptive pooling operators for weakly labeled sound event detection
Brian McFee
Justin Salamon
J. P. Bello
27
148
0
26 Apr 2018
Weakly Supervised Representation Learning for Unsynchronized Audio-Visual Events
Sanjeel Parekh
S. Essid
A. Ozerov
Ngoc Q. K. Duong
P. Pérez
G. Richard
SSL
8
19
0
19 Apr 2018
Audio-Visual Scene Analysis with Self-Supervised Multisensory Features
Andrew Owens
Alexei A. Efros
SSL
51
745
0
10 Apr 2018
The Sound of Pixels
Hang Zhao
Chuang Gan
Andrew Rouditchenko
Carl Vondrick
Josh H. McDermott
Antonio Torralba
VLM
22
529
0
09 Apr 2018
Learning a Text-Video Embedding from Incomplete and Heterogeneous Data
Antoine Miech
Ivan Laptev
Josef Sivic
22
233
0
07 Apr 2018
Seeing Voices and Hearing Faces: Cross-modal biometric matching
Arsha Nagrani
Samuel Albanie
Andrew Zisserman
CVBM
22
219
0
01 Apr 2018
Audio-Visual Event Localization in Unconstrained Videos
Yapeng Tian
Jing Shi
Bochen Li
Zhiyao Duan
Chenliang Xu
36
426
0
23 Mar 2018
Moments in Time Dataset: one million videos for event understanding
Mathew Monfort
A. Andonian
Bolei Zhou
K. Ramakrishnan
Sarah Adel Bargal
...
L. Brown
Quanfu Fan
Dan Gutfreund
Carl Vondrick
A. Oliva
47
538
0
09 Jan 2018
Learning Sight from Sound: Ambient Sound Provides Supervision for Visual Learning
Andrew Owens
Jiajun Wu
Josh H. McDermott
William T. Freeman
Antonio Torralba
SSL
41
177
0
20 Dec 2017
Objects that Sound
Relja Arandjelović
Andrew Zisserman
ObjD
VOS
44
528
0
18 Dec 2017