1911.12667
Cited By
Self-Supervised Learning by Cross-Modal Audio-Video Clustering
28 November 2019
Humam Alwassel
D. Mahajan
Bruno Korbar
Lorenzo Torresani
Guohao Li
Du Tran
SSL
Papers citing
"Self-Supervised Learning by Cross-Modal Audio-Video Clustering"
11 / 111 papers shown
Title
Delving into Inter-Image Invariance for Unsupervised Visual Representations
Jiahao Xie
Xiaohang Zhan
Ziwei Liu
Yew-Soon Ong
Chen Change Loy
SSL
VLM
21
58
0
26 Aug 2020
Look, Listen, and Attend: Co-Attention Network for Self-Supervised Audio-Visual Representation Learning
Ying Cheng
Ruize Wang
Zhihao Pan
Rui Feng
Yuejie Zhang
SSL
36
106
0
13 Aug 2020
What Should Not Be Contrastive in Contrastive Learning
Tete Xiao
Xiaolong Wang
Alexei A. Efros
Trevor Darrell
SSL
DRL
35
298
0
13 Aug 2020
Spatiotemporal Contrastive Video Representation Learning
Rui Qian
Tianjian Meng
Boqing Gong
Ming-Hsuan Yang
Haoran Wang
Serge J. Belongie
Huayu Chen
SSL
AI4TS
41
492
0
09 Aug 2020
Learning Video Representations from Textual Web Supervision
Jonathan C. Stroud
Zhichao Lu
Chen Sun
Jia Deng
Rahul Sukthankar
Cordelia Schmid
David A. Ross
SSL
40
48
0
29 Jul 2020
Learning Speech Representations from Raw Audio by Joint Audiovisual Self-Supervision
Abhinav Shukla
Stavros Petridis
M. Pantic
SSL
32
16
0
08 Jul 2020
Self-Supervised MultiModal Versatile Networks
Jean-Baptiste Alayrac
Adrià Recasens
R. Schneider
Relja Arandjelović
Jason Ramapuram
J. Fauw
Lucas Smaira
Sander Dieleman
Andrew Zisserman
SSL
40
371
0
29 Jun 2020
AVLnet: Learning Audio-Visual Language Representations from Instructional Videos
Andrew Rouditchenko
Angie Boggust
David Harwath
Brian Chen
D. Joshi
...
Rogerio Feris
Brian Kingsbury
M. Picheny
Antonio Torralba
James R. Glass
SSL
22
141
0
16 Jun 2020
Are we done with ImageNet?
Lucas Beyer
Olivier J. Hénaff
Alexander Kolesnikov
Xiaohua Zhai
Aaron van den Oord
VLM
19
397
0
12 Jun 2020
Visually Guided Sound Source Separation using Cascaded Opponent Filter Network
Lingyu Zhu
Esa Rahtu
22
23
0
04 Jun 2020
Does Visual Self-Supervision Improve Learning of Speech Representations for Emotion Recognition?
Abhinav Shukla
Stavros Petridis
M. Pantic
SSL
32
28
0
04 May 2020