The Right to Talk: An Audio-Visual Transformer Approach

6 August 2021
Thanh-Dat Truong
C. Duong
T. D. Vu
H. Pham
Bhiksha Raj
Ngan Le
Khoa Luu

Papers citing "The Right to Talk: An Audio-Visual Transformer Approach"

13 / 13 papers shown
Global-Local Distillation Network-Based Audio-Visual Speaker Tracking with Incomplete Modalities
Yidi Li
Yihan Li
Yixin Guo
Bin Ren
Zhenhuan Xu
Hao Guo
Hong Liu
N. Sebe
39
0
0
26 Aug 2024
TDFNet: An Efficient Audio-Visual Speech Separation Model with Top-down Fusion
Samuel Pegg
Kai Li
Xiaolin Hu
27
1
0
25 Jan 2024
Unleashing the Power of CNN and Transformer for Balanced RGB-Event Video Recognition
Xiao Wang
Yao Rong
Shiao Wang
Yuan Chen
Zhe Wu
Bowei Jiang
Yonghong Tian
Jin Tang
ViT
76
3
0
18 Dec 2023
HIG: Hierarchical Interlacement Graph Approach to Scene Graph Generation in Video Understanding
Trong-Thuan Nguyen
Pha Nguyen
Khoa Luu
22
12
0
05 Dec 2023
Egocentric Auditory Attention Localization in Conversations
Fiona Ryan
Hao Jiang
Abhinav Shukla
James M. Rehg
V. Ithapu
EgoV
29
16
0
28 Mar 2023
Multimodal Learning with Transformers: A Survey
P. Xu
Xiatian Zhu
David A. Clifton
ViT
50
525
0
13 Jun 2022
OTAdapt: Optimal Transport-based Approach For Unsupervised Domain Adaptation
Thanh-Dat Truong
N. V. R. Chappa
Xuan-Bac Nguyen
Ngan Le
Ashley Dowling
Khoa Luu
OOD
OT
39
11
0
22 May 2022
VocaLiST: An Audio-Visual Synchronisation Model for Lips and Voices
V. S. Kadandale
Juan F. Montesinos
G. Haro
19
23
0
05 Apr 2022
VoViT: Low Latency Graph-based Audio-Visual Voice Separation Transformer
Juan F. Montesinos
V. S. Kadandale
G. Haro
ViT
23
19
0
08 Mar 2022
Visual Acoustic Matching
Changan Chen
Ruohan Gao
P. Calamia
Kristen Grauman
16
55
0
14 Feb 2022
Egocentric Deep Multi-Channel Audio-Visual Active Speaker Localization
Hao Jiang
Calvin Murdock
V. Ithapu
EgoV
25
40
0
06 Jan 2022
Self-supervised learning for audio-visual speaker diarization
Yifan Ding
Yong-mei Xu
Shi-Xiong Zhang
Yahuan Cong
Liqiang Wang
VLM
36
29
0
13 Feb 2020
Longitudinal Face Modeling via Temporal Deep Restricted Boltzmann Machines
C. Duong
Khoa Luu
Kha Gia Quach
Tien D. Bui
CVBM
37
44
0
07 Jun 2016