CM-PIE: Cross-modal perception for interactive-enhanced audio-visual video parsing

11 October 2023

Papers citing "CM-PIE: Cross-modal perception for interactive-enhanced audio-visual video parsing"


Title
Towards Open-Vocabulary Audio-Visual Event Localization | Jinxing Zhou, Dan Guo, Ruohao Guo, Yuxin Mao, Jingjing Hu, Yiran Zhong, Xiaojun Chang, Ming Wang | VLM | 98 5 0 | 18 Nov 2024
Audio-Visual Instance Segmentation | Ruohao Guo, Yaru Chen, Yanyu Qi, Wenzhen Yue, Dantong Niu, ..., Wenzhen Yue, Ji Shi, Qixun Wang, Peiliang Zhang, Buwen Liang | VLM, VOS | 77 2 0 | 28 Oct 2023
Learning to Answer Questions in Dynamic Audio-Visual Scenarios | Guangyao Li, Yake Wei, Yapeng Tian, Chenliang Xu, Ji-Rong Wen, Di Hu | - | 118 149 0 | 26 Mar 2022
Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing | Yapeng Tian, Dingzeyu Li, Chenliang Xu | - | 99 184 0 | 21 Jul 2020
Co-Separating Sounds of Visual Objects | Ruohan Gao, Kristen Grauman | - | 131 209 0 | 16 Apr 2019
Dual-modality seq2seq network for audio-visual event localization | Yan-Bo Lin, Yu-Jhe Li, Y. Wang | - | 64 128 0 | 20 Feb 2019
Squeeze-and-Excitation Networks | Jie Hu, Li Shen, Samuel Albanie, Gang Sun, Enhua Wu | - | 424 26,500 0 | 05 Sep 2017
CNN Architectures for Large-Scale Audio Classification | Shawn Hershey, Sourish Chaudhuri, D. Ellis, J. Gemmeke, A. Jansen, ..., Rif A. Saurous, Bryan Seybold, M. Slaney, Ron J. Weiss, K. Wilson | - | 123 2,506 0 | 29 Sep 2016