Listen Then See: Video Alignment with Speaker Attention

21 April 2024
Aviral Agrawal
Carlos Mateo Samudio Lezcano
Iqui Balam Heredia-Marin
P. Sethi
arXiv: 2404.13530

Papers citing "Listen Then See: Video Alignment with Speaker Attention"

5 / 5 papers shown
Long Video Understanding with Learnable Retrieval in Video-Language Models
Jiaqi Xu
Cuiling Lan
Wenxuan Xie
Xuejin Chen
Yan Lu
107
7
0
24 Feb 2025
Social Genome: Grounded Social Reasoning Abilities of Multimodal Models
Leena Mathur
Marian Qian
Paul Pu Liang
Louis-Philippe Morency
LRM
154
1
0
21 Feb 2025
Powerset multi-class cross entropy loss for neural speaker diarization
Alexis Plaquet
H. Bredin
106
91
0
19 Oct 2023
Learning Interactions and Relationships between Movie Characters
Anna Kukleva
Makarand Tapaswi
Ivan Laptev
38
51
0
29 Mar 2020
pyannote.audio: neural building blocks for speaker diarization
H. Bredin
Ruiqing Yin
Juan Manuel Coria
G. Gelly
Pavel Korshunov
Marvin Lavechin
D. Fustes
Hadrien Titeux
Wassim Bouaziz
Marie-Philippe Gill
191
312
0
04 Nov 2019