1811.10813
Cited By
Noise-tolerant Audio-visual Online Person Verification using an Attention-based Neural Network Fusion
27 November 2018
Suwon Shon
Tae-Hyun Oh
James R. Glass
Papers citing
"Noise-tolerant Audio-visual Online Person Verification using an Attention-based Neural Network Fusion"
13 / 13 papers shown
Title
AVHBench: A Cross-Modal Hallucination Benchmark for Audio-Visual Large Language Models
Kim Sung-Bin
Oh Hyun-Bin
JungMok Lee
Arda Senocak
Joon Son Chung
Tae-Hyun Oh
MLLM
VLM
50
3
0
23 Oct 2024
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning
Shuai Wang
Zheng-Shou Chen
Kong Aik Lee
Yan-min Qian
Haizhou Li
44
4
0
21 Jul 2024
Dynamic Cross Attention for Audio-Visual Person Verification
R Gnana Praveen
Jahangir Alam
40
1
0
07 Mar 2024
Distilling Privileged Multimodal Information for Expression Recognition using Optimal Transport
Haseeb Aslam
Muhammad Osama Zeeshan
Soufiane Belharbi
M. Pedersoli
A. L. Koerich
Simon L Bacon
Eric Granger
28
9
0
27 Jan 2024
CAD -- Contextual Multi-modal Alignment for Dynamic AVQA
Asmar Nadeem
Adrian Hilton
R. Dawes
Graham A. Thomas
A. Mustafa
33
9
0
25 Oct 2023
Audio-Visual Speaker Verification via Joint Cross-Attention
R Gnana Praveen
Jahangir Alam
34
6
0
28 Sep 2023
Audio-Visual Fusion for Emotion Recognition in the Valence-Arousal Space Using Joint Cross-Attention
R Gnana Praveen
Eric Granger
P. Cardinal
CVBM
56
31
0
19 Sep 2022
Learning Audio-Visual embedding for Person Verification in the Wild
Peiwen Sun
Shanshan Zhang
Zishan Liu
Yougen Yuan
Tao Zhang
Honggang Zhang
Pengfei Hu
32
4
0
09 Sep 2022
Learning in Audio-visual Context: A Review, Analysis, and New Perspective
Yake Wei
Di Hu
Yapeng Tian
Xuelong Li
46
55
0
20 Aug 2022
Audio-Visual Person-of-Interest DeepFake Detection
D. Cozzolino
Alessandro Pianese
Matthias Nießner
L. Verdoliva
36
61
0
06 Apr 2022
SpeakingFaces: A Large-Scale Multimodal Dataset of Voice Commands with Visual and Thermal Video Streams
Madina Abdrakhmanova
Askat Kuzdeuov
Sheikh Jarju
Yerbolat Khassanov
Michael Lewis
H. A. Varol
CVBM
17
58
0
05 Dec 2020
Modality Dropout for Improved Performance-driven Talking Faces
Ahmed Hussen Abdelaziz
B. Theobald
Paul Dixon
Reinhard Knothe
N. Apostoloff
Sachin Kajareker
24
37
0
27 May 2020
VoxCeleb2: Deep Speaker Recognition
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
266
2,242
0
14 Jun 2018