Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2209.13583
Cited By
Learning State-Aware Visual Representations from Audible Interactions
27 September 2022
Himangi Mittal
Pedro Morgado
Unnat Jain
Abhinav Gupta
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Learning State-Aware Visual Representations from Audible Interactions"
24 / 24 papers shown
Title
ANAVI: Audio Noise Awareness using Visuals of Indoor environments for NAVIgation
Vidhi Jain
Rishi Veerapaneni
Yonatan Bisk
34
0
0
24 Oct 2024
T-JEPA: Augmentation-Free Self-Supervised Learning for Tabular Data
Hugo Thimonier
José Lucas De Melo Costa
Fabrice Popineau
Arpad Rimmel
Bich-Liên Doan
53
1
0
07 Oct 2024
Measuring Sound Symbolism in Audio-visual Models
Wei-Cheng Tseng
Yi-Jen Shih
David Harwath
Raymond Mooney
34
0
0
18 Sep 2024
Self-supervised visual learning from interactions with objects
A. Aubret
Céline Teulière
Jochen Triesch
55
5
0
09 Jul 2024
Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos
Changan Chen
Puyuan Peng
Ami Baid
Zihui Xue
Wei-Ning Hsu
David Harwath
Kristen Grauman
VGen
42
7
0
13 Jun 2024
Can't make an Omelette without Breaking some Eggs: Plausible Action Anticipation using Large Video-Language Models
Himangi Mittal
Nakul Agarwal
Shao-Yuan Lo
Kwonjoon Lee
41
14
0
30 May 2024
Images that Sound: Composing Images and Sounds on a Single Canvas
Ziyang Chen
Daniel Geng
Andrew Owens
DiffM
50
9
0
20 May 2024
SoundingActions: Learning How Actions Sound from Narrated Egocentric Videos
Changan Chen
Kumar Ashutosh
Rohit Girdhar
David Harwath
Kristen Grauman
EgoV
SSL
28
6
0
08 Apr 2024
A SOUND APPROACH: Using Large Language Models to generate audio descriptions for egocentric text-audio retrieval
Andreea-Maria Oncescu
João F. Henriques
Andrew Zisserman
Samuel Albanie
A. Sophia Koepke
26
5
0
29 Feb 2024
HiCMAE: Hierarchical Contrastive Masked Autoencoder for Self-Supervised Audio-Visual Emotion Recognition
Guoying Zhao
Zheng Lian
Bin Liu
Jianhua Tao
53
29
0
11 Jan 2024
The Audio-Visual Conversational Graph: From an Egocentric-Exocentric Perspective
Wenqi Jia
Miao Liu
Hao Jiang
Ishwarya Ananthabhotla
James M. Rehg
V. Ithapu
Ruohan Gao
EgoV
23
6
0
20 Dec 2023
Separating Invisible Sounds Toward Universal Audiovisual Scene-Aware Sound Separation
Yiyang Su
A. Vosoughi
Shijian Deng
Yapeng Tian
Chenliang Xu
26
4
0
18 Oct 2023
AV-SUPERB: A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models
Yuan Tseng
Layne Berry
Yi-Ting Chen
I-Hsiang Chiu
Hsuan-Hao Lin
...
Yu Tsao
Shinji Watanabe
Abdel-rahman Mohamed
Chi-Luen Feng
Hung-yi Lee
VLM
SSL
55
14
0
19 Sep 2023
Hyperbolic Audio-visual Zero-shot Learning
Jie Hong
Zeeshan Hayder
Junlin Han
Pengfei Fang
Mehrtash Harandi
L. Petersson
28
13
0
24 Aug 2023
DAVIS: High-Quality Audio-Visual Separation with Generative Diffusion Models
Chao Huang
Susan Liang
Yapeng Tian
Anurag Kumar
Chenliang Xu
DiffM
21
6
0
31 Jul 2023
Pretrained Language Models as Visual Planners for Human Assistance
Dhruvesh Patel
H. Eghbalzadeh
Nitin Kamra
Michael L. Iuzzolino
Unnat Jain
Ruta Desai
LM&Ro
19
24
0
17 Apr 2023
Affordances from Human Videos as a Versatile Representation for Robotics
Shikhar Bahl
Russell Mendonca
Lili Chen
Unnat Jain
Deepak Pathak
41
164
0
17 Apr 2023
Egocentric Audio-Visual Object Localization
Chao Huang
Yapeng Tian
Anurag Kumar
Chenliang Xu
EgoV
29
28
0
23 Mar 2023
Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation
Ziyang Chen
Shengyi Qian
Andrew Owens
26
12
0
20 Mar 2023
Interaction Region Visual Transformer for Egocentric Action Anticipation
Debaditya Roy
Ramanathan Rajendiran
Basura Fernando
36
15
0
25 Nov 2022
Multi-Task Learning of Object State Changes from Uncurated Videos
Tomávs Souvcek
Jean-Baptiste Alayrac
Antoine Miech
Ivan Laptev
Josef Sivic
34
11
0
24 Nov 2022
Retrospectives on the Embodied AI Workshop
Matt Deitke
Dhruv Batra
Yonatan Bisk
Tommaso Campari
Angel X. Chang
...
Jesse Thomason
Alexander Toshev
Joanne Truong
Luca Weihs
Jiajun Wu
LM&Ro
37
51
0
13 Oct 2022
Ego4D: Around the World in 3,000 Hours of Egocentric Video
Kristen Grauman
Andrew Westbury
Eugene Byrne
Zachary Chavis
Antonino Furnari
...
Mike Zheng Shou
Antonio Torralba
Lorenzo Torresani
Mingfei Yan
Jitendra Malik
EgoV
232
1,024
0
13 Oct 2021
Self-supervised Co-training for Video Representation Learning
Tengda Han
Weidi Xie
Andrew Zisserman
SSL
215
309
0
19 Oct 2020
1