Cooperative Learning of Audio and Video Models from Self-Supervised Synchronization

30 June 2018

Papers citing "Cooperative Learning of Audio and Video Models from Self-Supervised Synchronization"

50 / 316 papers shown

Title
AdaMML: Adaptive Multi-Modal Learning for Efficient Video Recognition Rameswar Panda Chun-Fu Chen Quanfu Fan Ximeng Sun Kate Saenko A. Oliva Rogerio Feris 97 50 0 11 May 2021
Contrastive Attraction and Contrastive Repulsion for Representation Learning Huangjie Zheng Xu Chen Jiangchao Yao Hongxia Yang Chunyuan Li Ya Zhang Hao Zhang Ivor Tsang Jingren Zhou Mingyuan Zhou SSL 115 12 0 08 May 2021
Motion-Augmented Self-Training for Video Recognition at Smaller Scale Kirill Gavrilyuk Mihir Jain I. Karmanov Cees G. M. Snoek 75 21 0 04 May 2021
A Large-Scale Study on Unsupervised Spatiotemporal Representation Learning Christoph Feichtenhofer Haoqi Fan Bo Xiong Ross B. Girshick Kaiming He SSL AI4TS 130 263 0 29 Apr 2021
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text Hassan Akbari Liangzhe Yuan Rui Qian Wei-Hong Chuang Shih-Fu Chang Huayu Chen Boqing Gong ViT 410 594 0 22 Apr 2021
Detection of Audio-Video Synchronization Errors Via Event Detection Joshua Peter Ebenezer Yongjun Wu Hai Wei S. Sethuraman Z. Liu 65 12 0 20 Apr 2021
Visually Guided Sound Source Separation and Localization using Self-Supervised Motion Representations Lingyu Zhu Esa Rahtu 83 27 0 17 Apr 2021
Self-supervised object detection from audio-visual correspondence Triantafyllos Afouras Yuki M. Asano Francois Fagan Andrea Vedaldi Florian Metze SSL 115 47 0 13 Apr 2021
Visually Informed Binaural Audio Generation without Binaural Audios Xudong Xu Hang Zhou Ziwei Liu Bo Dai Xiaogang Wang Dahua Lin DiffM 51 59 0 13 Apr 2021
Contrastive Learning of Global-Local Video Representations Shuang Ma Zhaoyang Zeng Daniel J. McDuff Yale Song SSL 108 7 0 07 Apr 2021
Can audio-visual integration strengthen robustness under multimodal attacks? Yapeng Tian Chenliang Xu AAML 109 39 0 05 Apr 2021
Cross-Modal learning for Audio-Visual Video Parsing Jatin Lamba Abhishek Jayaprakash Akula Rishabh Dabral Preethi Jyothi Ganesh Ramakrishnan 142 8 0 03 Apr 2021
Self-supervised Video Representation Learning by Context and Motion Decoupling Lianghua Huang Yu Liu Bin Wang Pan Pan Yinghui Xu Rong Jin SSL 124 51 0 02 Apr 2021
Multiview Pseudo-Labeling for Semi-supervised Learning from Video Bo Xiong Haoqi Fan Kristen Grauman Christoph Feichtenhofer SSL 86 51 0 01 Apr 2021
Unsupervised Sound Localization via Iterative Contrastive Learning Yan-Bo Lin Hung-Yu Tseng Hsin-Ying Lee Yen-Yu Lin Ming-Hsuan Yang SSL 105 36 0 01 Apr 2021
Broaden Your Views for Self-Supervised Video Learning Adrià Recasens Pauline Luc Jean-Baptiste Alayrac Luyu Wang Ross Hemsley ... Florent Altché M. Valko Jean-Bastien Grill Aaron van den Oord Andrew Zisserman SSL AI4TS 141 128 0 30 Mar 2021
Robust Audio-Visual Instance Discrimination Pedro Morgado Ishan Misra Nuno Vasconcelos SSL 117 110 0 29 Mar 2021
Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting A. Bhunia Pinaki Nath Chowdhury Yongxin Yang Timothy M. Hospedales Tao Xiang Yi-Zhe Song SSL 114 62 0 25 Mar 2021
Space-Time Crop & Attend: Improving Cross-modal Video Representation Learning Mandela Patrick Yuki M. Asano Bernie Huang Ishan Misra Florian Metze Joao Henriques Andrea Vedaldi AI4TS 106 35 0 18 Mar 2021
Multi-Format Contrastive Learning of Audio Representations Luyu Wang Aaron van den Oord 95 59 0 11 Mar 2021
VideoMoCo: Contrastive Video Representation Learning with Temporally Adversarial Examples Tian Pan Yibing Song Tianyu Yang Wenhao Jiang Wei Liu 106 226 0 10 Mar 2021
Self-Supervised Multi-View Learning via Auto-Encoding 3D Transformations Xiang Gao Wei Hu Guo-Jun Qi 91 6 0 01 Mar 2021
Learning Audio-Visual Correlations from Variational Cross-Modal Generation Ye Zhu Yu Wu Hugo Latapie Yi Yang Yan Yan SSL 132 21 0 05 Feb 2021
Semi-Supervised Action Recognition with Temporal Contrastive Learning Ankit Singh Omprakash Chakraborty Ashutosh Varshney Rameswar Panda Rogerio Feris Kate Saenko Abir Das 97 99 0 04 Feb 2021
Self-Supervised Pretraining for RGB-D Salient Object Detection Xiaoqi Zhao Youwei Pang Lihe Zhang Huchuan Lu Xiang Ruan 97 64 0 29 Jan 2021
ACAV100M: Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning Sangho Lee Jiwan Chung Youngjae Yu Gunhee Kim Thomas Breuel Gal Chechik Yale Song 196 47 0 26 Jan 2021
Learning rich touch representations through cross-modal self-supervision Martina Zambelli Y. Aytar Francesco Visin Yuxiang Zhou R. Hadsell SSL 84 16 0 21 Jan 2021
Learning from Weakly-labeled Web Videos via Exploring Sub-Concepts Kunpeng Li Zizhao Zhang Guanhang Wu Xuehan Xiong Chen-Yu Lee Zhichao Lu Y. Fu Tomas Pfister 80 5 0 11 Jan 2021
Transformers in Vision: A Survey Salman Khan Muzammal Naseer Munawar Hayat Syed Waqas Zamir Fahad Shahbaz Khan M. Shah ViT 455 2,570 0 04 Jan 2021
Human Action Recognition from Various Data Modalities: A Review Zehua Sun Qiuhong Ke Hossein Rahmani Mohammed Bennamoun Gang Wang Jun Liu MU 186 536 0 22 Dec 2020
Semantic Audio-Visual Navigation Changan Chen Ziad Al-Halah Kristen Grauman 127 106 0 21 Dec 2020
Temporal Relational Modeling with Self-Supervision for Action Segmentation Dong Wang Di Hu Xingjian Li Dejing Dou 93 53 0 14 Dec 2020
InferCode: Self-Supervised Learning of Code Representations by Predicting Subtrees Nghi D. Q. Bui Yijun Yu Lingxiao Jiang SSL 74 107 0 13 Dec 2020
A Comprehensive Study of Deep Video Action Recognition Yi Zhu Xinyu Li Chunhui Liu Mohammadreza Zolfaghari Yuanjun Xiong Chongruo Wu Zhi-Li Zhang Joseph Tighe R. Manmatha Mu Li VLM AI4TS 131 188 0 11 Dec 2020
Parameter Efficient Multimodal Transformers for Video Representation Learning Sangho Lee Youngjae Yu Gunhee Kim Thomas Breuel Jan Kautz Yale Song ViT 120 78 0 08 Dec 2020
Rethinking movie genre classification with fine-grained semantic clustering Edward Fish Jon Weinbren Andrew Gilbert VLM 79 7 0 04 Dec 2020
Recent Progress in Appearance-based Action Recognition J. Humphreys Zhe Chen Dacheng Tao 60 0 0 25 Nov 2020
TSP: Temporally-Sensitive Pretraining of Video Encoders for Localization Tasks Humam Alwassel Silvio Giancola Bernard Ghanem 96 126 0 23 Nov 2020
Hierarchically Decoupled Spatial-Temporal Contrast for Self-supervised Video Representation Learning Zehua Zhang David J. Crandall AI4TS SSL 86 23 0 23 Nov 2020
ActBERT: Learning Global-Local Video-Text Representations Linchao Zhu Yi Yang ViT 147 423 0 14 Nov 2020
Learning Representations from Audio-Visual Spatial Alignment Pedro Morgado Yi Li Nuno Vasconcelos SSL 95 125 0 03 Nov 2020
Into the Wild with AudioScope: Unsupervised Audio-Visual Separation of On-Screen Sounds Efthymios Tzinis Scott Wisdom A. Jansen Shawn Hershey Tal Remez D. Ellis J. Hershey 96 71 0 02 Nov 2020
Pretext-Contrastive Learning: Toward Good Practices in Self-supervised Video Representation Leaning L. Tao Xueting Wang T. Yamasaki VLM SSL 104 14 0 29 Oct 2020
i-Mix: A Domain-Agnostic Strategy for Contrastive Representation Learning Kibok Lee Yian Zhu Kihyuk Sohn Chun-Liang Li Jinwoo Shin Honglak Lee SSL 98 26 0 17 Oct 2020
Back to the Future: Cycle Encoding Prediction for Self-supervised Contrastive Video Representation Learning Xinyu Yang Majid Mirmehdi T. Burghardt 98 4 0 14 Oct 2020
Hard Negative Mixing for Contrastive Learning Yannis Kalantidis Mert Bulent Sariyildiz Noé Pion Philippe Weinzaepfel Diane Larlus SSL 180 648 0 02 Oct 2020
Learn like a Pathologist: Curriculum Learning by Annotator Agreement for Histopathology Image Classification Jerry W. Wei A. Suriawinata Bing Ren Xiaoying Liu Mikhail Lisovsky ... Mustafa Nasir-Moin Naofumi Tomita Lorenzo Torresani Jason W. Wei Saeed Hassanpour 110 49 0 29 Sep 2020
Sense and Learn: Self-Supervision for Omnipresent Sensors Aaqib Saeed Victor Ungureanu Beat Gfeller OOD SSL 86 41 0 28 Sep 2020
Improving colonoscopy lesion classification using semi-supervised deep learning M. Golhar Taylor L. Bobrow M. Khoshknab S. Jit S. Ngamruengphong Nicholas J. Durr SSL 68 16 0 07 Sep 2020
Self-Supervised Contrastive Learning for Code Retrieval and Summarization via Semantic-Preserving Transformations Nghi D. Q. Bui Yijun Yu Lingxiao Jiang SSL 173 122 0 06 Sep 2020