ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2412.02611
  4. Cited By
AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand
  Audio-Visual Information?

AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information?

3 December 2024
Kaixiong Gong
Kaituo Feng
Yangqiu Song
Yibing Wang
Mofan Cheng
Shijia Yang
Jiaming Han
Benyou Wang
Yutong Bai
Zhiyong Yang
Xiangyu Yue
    MLLMAuLLMVLM
ArXiv (abs)PDFHTML

Papers citing "AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information?"

10 / 10 papers shown
Title
MAGNET: A Multi-agent Framework for Finding Audio-Visual Needles by Reasoning over Multi-Video Haystacks
MAGNET: A Multi-agent Framework for Finding Audio-Visual Needles by Reasoning over Multi-Video Haystacks
Sanjoy Chowdhury
Mohamed Elmoghany
Yohan Abeysinghe
Junjie Fei
Sayan Nag
Salman Khan
Mohamed Elhoseiny
Dinesh Manocha
19
0
0
08 Jun 2025
AV-Reasoner: Improving and Benchmarking Clue-Grounded Audio-Visual Counting for MLLMs
Lidong Lu
Guo Chen
Z. Li
Yicheng Liu
Tong Lu
VLMLRM
101
0
0
05 Jun 2025
Learning Sparsity for Effective and Efficient Music Performance Question Answering
Learning Sparsity for Effective and Efficient Music Performance Question Answering
Xingjian Diao
Tianzhen Yang
Chunhui Zhang
Weiyi Wu
Ming Cheng
Jiang Gui
60
1
0
02 Jun 2025
Daily-Omni: Towards Audio-Visual Reasoning with Temporal Alignment across Modalities
Ziwei Zhou
Rui Wang
Zuxuan Wu
AuLLMVGen
77
0
0
23 May 2025
Aurelia: Test-time Reasoning Distillation in Audio-Visual LLMs
Aurelia: Test-time Reasoning Distillation in Audio-Visual LLMs
Sanjoy Chowdhury
Hanan Gani
Nishit Anand
Sayan Nag
Ruohan Gao
Mohamed Elhoseiny
Salman Khan
Dinesh Manocha
LRM
177
1
0
29 Mar 2025
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey
Yansen Wang
Shengqiong Wu
Yize Zhang
William Yang Wang
Ziwei Liu
Jiebo Luo
Hao Fei
LRM
208
31
0
16 Mar 2025
S2S-Arena, Evaluating Speech2Speech Protocols on Instruction Following with Paralinguistic Information
Feng Jiang
Zhiyu Lin
Fan Bu
Yuhao Du
Benyou Wang
Haoyang Li
AuLLMELM
130
2
0
07 Mar 2025
OneLLM: One Framework to Align All Modalities with Language
OneLLM: One Framework to Align All Modalities with Language
Jiaming Han
Kaixiong Gong
Yiyuan Zhang
Jiaqi Wang
Kaipeng Zhang
Dahua Lin
Yu Qiao
Peng Gao
Xiangyu Yue
MLLM
252
134
0
10 Jan 2025
OmniBench: Towards The Future of Universal Omni-Language Models
OmniBench: Towards The Future of Universal Omni-Language Models
Yizhi Li
Ge Zhang
Yinghao Ma
Ruibin Yuan
Kang Zhu
...
Zhaoxiang Zhang
Zachary Liu
Emmanouil Benetos
Wenhao Huang
Chenghua Lin
LRM
164
19
0
23 Sep 2024
Enhancing Low-Resource Language and Instruction Following Capabilities of Audio Language Models
Enhancing Low-Resource Language and Instruction Following Capabilities of Audio Language Models
Potsawee Manakul
Guangzhi Sun
Warit Sirichotedumrong
Kasima Tharnpipitchai
Kunat Pipatanakul
AuLLM
118
7
0
17 Sep 2024
1