1512.08512
Cited By
Visually Indicated Sounds
28 December 2015
Andrew Owens
Phillip Isola
Josh H. McDermott
Antonio Torralba
Edward H. Adelson
William T. Freeman
Papers citing
"Visually Indicated Sounds"
50 / 206 papers shown
Title
Machine learning in acoustics: theory and applications
Michael J. Bianco
Peter Gerstoft
James Traer
Emma Ozanich
M. Roch
Sharon Gannot
Charles-Alban Deledalle
AI4CE
33
376
0
11 May 2019
A critical analysis of self-supervision, or what we can learn from a single image
Yuki M. Asano
Christian Rupprecht
Andrea Vedaldi
SSL
27
145
0
30 Apr 2019
Listen to the Image
Di Hu
Dong Wang
Xuelong Li
Feiping Nie
Qi Wang
19
17
0
19 Apr 2019
Audio-Visual Model Distillation Using Acoustic Images
Andrés F. Pérez
Valentina Sanguineti
Pietro Morerio
Vittorio Murino
VLM
15
27
0
16 Apr 2019
Co-Separating Sounds of Visual Objects
Ruohan Gao
Kristen Grauman
33
206
0
16 Apr 2019
Bounce and Learn: Modeling Scene Dynamics with Real-World Bounces
Senthil Purushwalkam
Abhinav Gupta
D. Kaufman
Bryan C. Russell
3DH
SSL
22
20
0
15 Apr 2019
VideoBERT: A Joint Model for Video and Language Representation Learning
Chen Sun
Austin Myers
Carl Vondrick
Kevin Patrick Murphy
Cordelia Schmid
VLM
SSL
8
1,233
0
03 Apr 2019
Dual-modality seq2seq network for audio-visual event localization
Yan-Bo Lin
Yu-Jhe Li
Y. Wang
32
127
0
20 Feb 2019
2.5D Visual Sound
Ruohan Gao
Kristen Grauman
VGen
27
130
0
11 Dec 2018
Learning to Learn How to Learn: Self-Adaptive Visual Navigation Using Meta-Learning
Mitchell Wortsman
Kiana Ehsani
Mohammad Rastegari
Ali Farhadi
Roozbeh Mottaghi
SSL
25
222
0
03 Dec 2018
Cogni-Net: Cognitive Feature Learning through Deep Visual Perception
Pranay Mukherjee
Abhirup Das
A. Bhunia
P. Roy
16
11
0
01 Nov 2018
Self-Supervised Generation of Spatial Audio for 360 Video
Pedro Morgado
Nuno Vasconcelos
Timothy R. Langlois
Oliver Wang
MDE
24
171
0
07 Sep 2018
Video Jigsaw: Unsupervised Learning of Spatiotemporal Context for Video Action Recognition
Unaiza Ahsan
Rishi Madhok
Irfan Essa
SSL
16
106
0
22 Aug 2018
Small Sample Learning in Big Data Era
Jun Shu
Zongben Xu
Deyu Meng
31
71
0
14 Aug 2018
Towards Audio to Scene Image Synthesis using Generative Adversarial Network
Chia-Hung Wan
Shun-Po Chuang
Hung-yi Lee
GAN
33
61
0
13 Aug 2018
A Study of Material Sonification in Touchscreen Devices
Rodrigo Martín
Michael Weinmann
M. Hullin
18
1
0
03 Jul 2018
Harnessing AI for Speech Reconstruction using Multi-view Silent Video Feed
Yaman Kumar Singla
Mayank Aggarwal
Pratham Nawal
Shin'ichi Satoh
R. Shah
Roger Zimmermann
25
26
0
02 Jul 2018
A Recurrent Convolutional Neural Network Approach for Sensorless Force Estimation in Robotic Surgery
Arturo Marbán
Vignesh Srinivasan
Wojciech Samek
Josep Fernández
A. Casals
30
84
0
22 May 2018
On Learning Associations of Faces and Voices
Changil Kim
Hijung Valentina Shin
Tae-Hyun Oh
Alexandre Kaspar
Mohamed A. Elgharib
Wojciech Matusik
CVBM
19
83
0
15 May 2018
Weakly Supervised Representation Learning for Unsynchronized Audio-Visual Events
Sanjeel Parekh
S. Essid
A. Ozerov
Ngoc Q. K. Duong
P. Pérez
G. Richard
SSL
16
19
0
19 Apr 2018
Audio-Visual Scene Analysis with Self-Supervised Multisensory Features
Andrew Owens
Alexei A. Efros
SSL
51
745
0
10 Apr 2018
The Sound of Pixels
Hang Zhao
Chuang Gan
Andrew Rouditchenko
Carl Vondrick
Josh H. McDermott
Antonio Torralba
VLM
22
529
0
09 Apr 2018
Learning to Separate Object Sounds by Watching Unlabeled Video
Ruohan Gao
Rogerio Feris
Kristen Grauman
SSL
18
284
0
05 Apr 2018
Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input
David Harwath
Adrià Recasens
Dídac Surís
Galen Chuang
Antonio Torralba
James R. Glass
32
200
0
04 Apr 2018
Who Let The Dogs Out? Modeling Dog Behavior From Visual Data
Kiana Ehsani
Hessam Bagherinezhad
Joseph Redmon
Roozbeh Mottaghi
Ali Farhadi
VGen
27
59
0
28 Mar 2018
Lip Movements Generation at a Glance
Lele Chen
Zhiheng Li
R. Maddox
Z. Duan
Chenliang Xu
25
261
0
28 Mar 2018
Audio-Visual Event Localization in Unconstrained Videos
Yapeng Tian
Jing Shi
Bochen Li
Zhiyao Duan
Chenliang Xu
53
426
0
23 Mar 2018
Learning to Localize Sound Source in Visual Scenes
Arda Senocak
Tae-Hyun Oh
Junsik Kim
Ming-Hsuan Yang
In So Kweon
SSL
22
343
0
10 Mar 2018
ViTac: Feature Sharing between Vision and Tactile Sensing for Cloth Texture Recognition
Shan Luo
Wenzhen Yuan
Edward H. Adelson
Anthony G. Cohn
R. Fuentes
19
132
0
21 Feb 2018
The Unreasonable Effectiveness of Deep Features as a Perceptual Metric
Richard Y. Zhang
Phillip Isola
Alexei A. Efros
Eli Shechtman
Oliver Wang
EGVM
46
11,462
0
11 Jan 2018
A Light-Weight Multimodal Framework for Improved Environmental Audio Tagging
Juncheng Billy Li
Yun Wang
Joseph Szurley
Florian Metze
Samarjit Das
24
2
0
27 Dec 2017
Learning Sight from Sound: Ambient Sound Provides Supervision for Visual Learning
Andrew Owens
Jiajun Wu
Josh H. McDermott
William T. Freeman
Antonio Torralba
SSL
41
177
0
20 Dec 2017
Audio to Body Dynamics
Eli Shlizerman
Lucio Dery
Hayden Schoen
Ira Kemelmacher-Shlizerman
VGen
48
154
0
19 Dec 2017
Objects that Sound
Relja Arandjelović
Andrew Zisserman
ObjD
VOS
44
528
0
18 Dec 2017
Visual to Sound: Generating Natural Sound for Videos in the Wild
Yipin Zhou
Zhaowen Wang
Chen Fang
Trung Bui
Tamara L. Berg
VGen
25
206
0
04 Dec 2017
Visual Speech Enhancement
Aviv Gabbay
Asaph Shamir
Shmuel Peleg
31
16
0
23 Nov 2017
Lip2AudSpec: Speech reconstruction from silent lip movements video
Hassan Akbari
Himani Arora
Liangliang Cao
N. Mesgarani
32
86
0
26 Oct 2017
Seeing Through Noise: Visually Driven Speaker Separation and Enhancement
Aviv Gabbay
Ariel Ephrat
Tavi Halperin
Shmuel Peleg
42
19
0
22 Aug 2017
Improved Speech Reconstruction from Silent Video
Ariel Ephrat
Tavi Halperin
Shmuel Peleg
37
89
0
01 Aug 2017
cvpaper.challenge in 2016: Futuristic Computer Vision through 1,600 Papers Survey
Hirokatsu Kataoka
Soma Shirakabe
Yun He
S. Ueta
Teppei Suzuki
...
Ryousuke Takasawa
Masataka Fuchida
Yudai Miyashita
Kazushige Okayasu
Yuta Matsuzaki
30
1
0
20 Jul 2017
See, Hear, and Read: Deep Aligned Representations
Y. Aytar
Carl Vondrick
Antonio Torralba
VLM
AI4TS
17
136
0
03 Jun 2017
Multimodal Machine Learning: A Survey and Taxonomy
T. Baltrušaitis
Chaitanya Ahuja
Louis-Philippe Morency
15
2,868
0
26 May 2017
Look, Listen and Learn
Relja Arandjelović
Andrew Zisserman
SSL
42
895
0
23 May 2017
You said that?
Joon Son Chung
A. Jamaludin
Andrew Zisserman
CVBM
23
258
0
08 May 2017
Deep Cross-Modal Audio-Visual Generation
Lele Chen
Sudhanshu Srivastava
Z. Duan
Chenliang Xu
33
221
0
26 Apr 2017
Time-Contrastive Networks: Self-Supervised Learning from Video
P. Sermanet
Corey Lynch
Yevgen Chebotar
Jasmine Hsu
Eric Jang
S. Schaal
Sergey Levine
SSL
56
814
0
23 Apr 2017
Connecting Look and Feel: Associating the visual and tactile properties of physical materials
Wenzhen Yuan
Shaoxiong Wang
Siyuan Dong
Edward H. Adelson
33
122
0
12 Apr 2017
Vid2speech: Speech Reconstruction from Silent Video
Ariel Ephrat
Shmuel Peleg
38
119
0
02 Jan 2017
Image-to-Image Translation with Conditional Adversarial Networks
Phillip Isola
Jun-Yan Zhu
Tinghui Zhou
Alexei A. Efros
SSeg
212
19,494
0
21 Nov 2016
Learning to Perform Physics Experiments via Deep Reinforcement Learning
Misha Denil
Pulkit Agrawal
Tejas D. Kulkarni
Tom Erez
Peter W. Battaglia
Nando de Freitas
AI4CE
46
339
0
06 Nov 2016