2310.07517
Cited By
CM-PIE: Cross-modal perception for interactive-enhanced audio-visual video parsing
11 October 2023
Yaru Chen
Ruohao Guo
Xubo Liu
Peipei Wu
Guangyao Li
Zhenbo Li
Wenwu Wang
Papers citing "CM-PIE: Cross-modal perception for interactive-enhanced audio-visual video parsing"
8 / 8 papers shown
Title
Towards Open-Vocabulary Audio-Visual Event Localization
Jinxing Zhou
Dan Guo
Ruohao Guo
Yuxin Mao
Jingjing Hu
Yiran Zhong
Xiaojun Chang
Ming Wang
VLM
98
5
0
18 Nov 2024
Audio-Visual Instance Segmentation
Ruohao Guo
Yaru Chen
Yanyu Qi
Wenzhen Yue
Dantong Niu
...
Wenzhen Yue
Ji Shi
Qixun Wang
Peiliang Zhang
Buwen Liang
VLM
VOS
77
2
0
28 Oct 2023
Learning to Answer Questions in Dynamic Audio-Visual Scenarios
Guangyao Li
Yake Wei
Yapeng Tian
Chenliang Xu
Ji-Rong Wen
Di Hu
118
149
0
26 Mar 2022
Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing
Yapeng Tian
Dingzeyu Li
Chenliang Xu
99
184
0
21 Jul 2020
Co-Separating Sounds of Visual Objects
Ruohan Gao
Kristen Grauman
131
209
0
16 Apr 2019
Dual-modality seq2seq network for audio-visual event localization
Yan-Bo Lin
Yu-Jhe Li
Y. Wang
64
128
0
20 Feb 2019
Squeeze-and-Excitation Networks
Jie Hu
Li Shen
Samuel Albanie
Gang Sun
Enhua Wu
424
26,500
0
05 Sep 2017
CNN Architectures for Large-Scale Audio Classification
Shawn Hershey
Sourish Chaudhuri
D. Ellis
J. Gemmeke
A. Jansen
...
Rif A. Saurous
Bryan Seybold
M. Slaney
Ron J. Weiss
K. Wilson
123
2,506
0
29 Sep 2016