Sound and Visual Representation Learning with Multiple Pretraining Tasks

4 January 2022

Luc Van Gool

Papers cited by "Sound and Visual Representation Learning with Multiple Pretraining Tasks"

Title
Dense Semantic Contrast for Self-Supervised Visual Representation Learning Xiaoni Li Yu Zhou Yifei Zhang Aoting Zhang Wei Wang Ning Jiang Haiying Wu Weiping Wang SSL 56 41 0 16 Sep 2021
Multi-Task Self-Training for Learning General Representations Golnaz Ghiasi Barret Zoph E. D. Cubuk Quoc V. Le Nayeon Lee SSL 82 101 0 25 Aug 2021
Efficient Visual Pretraining with Contrastive Detection Olivier J. Hénaff Skanda Koppula Jean-Baptiste Alayrac Aaron van den Oord Oriol Vinyals João Carreira VLM SSL 71 165 0 19 Mar 2021
DetCo: Unsupervised Contrastive Learning for Object Detection Enze Xie Jian Ding Wenhai Wang Xiaohang Zhan Hang Xu Peize Sun Zhenguo Li Ping Luo 77 323 0 09 Feb 2021
Three Ways to Improve Semantic Segmentation with Self-Supervised Depth Estimation Lukas Hoyer Dengxin Dai Yuhua Chen Adrian Köring Suman Saha Luc Van Gool 3DPC SSL MDE 86 105 0 19 Dec 2020
Exploring Simple Siamese Representation Learning Xinlei Chen Kaiming He SSL 258 4,067 0 20 Nov 2020
Dense Contrastive Learning for Self-Supervised Visual Pre-Training Xinlong Wang Rufeng Zhang Chunhua Shen Tao Kong Lei Li SSL 77 689 0 18 Nov 2020
Learning Representations from Audio-Visual Spatial Alignment Pedro Morgado Yi Li Nuno Vasconcelos SSL 74 123 0 03 Nov 2020
Self-supervised Co-training for Video Representation Learning Tengda Han Weidi Xie Andrew Zisserman SSL 242 319 0 19 Oct 2020
Contrastive learning of global and local features for medical image segmentation with limited annotations K. Chaitanya Ertunc Erdil Neerav Karani E. Konukoglu SSL 86 552 0 18 Jun 2020
Bootstrap your own latent: A new approach to self-supervised Learning Jean-Bastien Grill Florian Strub Florent Altché Corentin Tallec Pierre Harvey Richemond ... M. G. Azar Bilal Piot Koray Kavukcuoglu Rémi Munos Michal Valko SSL 374 6,833 0 13 Jun 2020
Audio-Visual Instance Discrimination with Cross-Modal Agreement Pedro Morgado Nuno Vasconcelos Ishan Misra SSL 80 276 0 27 Apr 2020
Improved Baselines with Momentum Contrastive Learning Xinlei Chen Haoqi Fan Ross B. Girshick Kaiming He SSL 486 3,442 0 09 Mar 2020
Multi-task self-supervised learning for Robust Speech Recognition Mirco Ravanelli Jianyuan Zhong Santiago Pascual P. Swietojanski João Monteiro J. Trmal Yoshua Bengio SSL 279 290 0 25 Jan 2020
Self-Supervised Learning of Pretext-Invariant Representations Ishan Misra Laurens van der Maaten SSL VLM 108 1,458 0 04 Dec 2019
Momentum Contrast for Unsupervised Visual Representation Learning Kaiming He Haoqi Fan Yuxin Wu Saining Xie Ross B. Girshick SSL 207 12,121 0 13 Nov 2019
Vision-Infused Deep Audio Inpainting Hang Zhou Ziwei Liu Lingfeng Guo Ping Luo Dahua Lin 138 88 0 24 Oct 2019
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations Zhenzhong Lan Mingda Chen Sebastian Goodman Kevin Gimpel Piyush Sharma Radu Soricut SSL AIMat 371 6,463 0 26 Sep 2019
Multi-Task Self-Supervised Learning for Disfluency Detection Shaolei Wang Wanxiang Che Qi Liu Pengda Qin Ting Liu William Yang Wang SSL 67 56 0 15 Aug 2019
Learning Representations by Maximizing Mutual Information Across Views Philip Bachman R. Devon Hjelm William Buchwalter SSL 195 1,476 0 03 Jun 2019
Self-supervised audio representation learning for mobile devices Marco Tagliasacchi Beat Gfeller Félix de Chaumont Quitry Dominik Roblek SSL AI4TS 64 47 0 24 May 2019
Self-supervised Visual Feature Learning with Deep Neural Networks: A Survey Longlong Jing Yingli Tian SSL 153 1,700 0 16 Feb 2019
Talking Face Generation by Adversarially Disentangled Audio-Visual Representation Hang Zhou Yu Liu Ziwei Liu Ping Luo Xiaogang Wang CVBM 92 442 0 20 Jul 2018
Cooperative Learning of Audio and Video Models from Self-Supervised Synchronization Bruno Korbar Du Tran Lorenzo Torresani 99 476 0 30 Jun 2018
Unsupervised Feature Learning via Non-Parametric Instance-level Discrimination Zhirong Wu Yuanjun Xiong Stella X. Yu Dahua Lin SSL 179 3,465 0 05 May 2018
Audio-Visual Scene Analysis with Self-Supervised Multisensory Features Andrew Owens Alexei A. Efros SSL 98 753 0 10 Apr 2018
The Sound of Pixels Hang Zhao Chuang Gan Andrew Rouditchenko Carl Vondrick Josh H. McDermott Antonio Torralba VLM 102 536 0 09 Apr 2018
Deep contextualized word representations Matthew E. Peters Mark Neumann Mohit Iyyer Matt Gardner Christopher Clark Kenton Lee Luke Zettlemoyer NAI 227 11,565 0 15 Feb 2018
Objects that Sound Relja Arandjelović Andrew Zisserman ObjD VOS 110 530 0 18 Dec 2017
Multi-task Self-Supervised Visual Learning Carl Doersch Andrew Zisserman SSL 80 634 0 25 Aug 2017
Audio Super Resolution using Neural Networks Volodymyr Kuleshov S. Enam Stefano Ermon SupR 76 127 0 02 Aug 2017
Look, Listen and Learn Relja Arandjelović Andrew Zisserman SSL 120 906 0 23 May 2017
Colorization as a Proxy Task for Visual Understanding Gustav Larsson Michael Maire Gregory Shakhnarovich SSL 159 498 0 11 Mar 2017
Vid2speech: Speech Reconstruction from Silent Video Ariel Ephrat Shmuel Peleg 90 123 0 02 Jan 2017
CNN Architectures for Large-Scale Audio Classification Shawn Hershey Sourish Chaudhuri D. Ellis J. Gemmeke A. Jansen ... Rif A. Saurous Bryan Seybold M. Slaney Ron J. Weiss K. Wilson 123 2,506 0 29 Sep 2016
Learning without Forgetting Zhizhong Li Derek Hoiem CLL OOD SSL 304 4,423 0 29 Jun 2016
Context Encoders: Feature Learning by Inpainting Deepak Pathak Philipp Krahenbuhl Jeff Donahue Trevor Darrell Alexei A. Efros SSL 67 5,299 0 25 Apr 2016
Unsupervised Visual Representation Learning by Context Prediction Carl Doersch Abhinav Gupta Alexei A. Efros DRL SSL 169 2,789 0 19 May 2015