Kimi-Audio Technical Report

25 April 2025
KimiTeam
Ding Ding
Zeqian Ju
Yichong Leng
Shixuan Liu
Tianming Liu
Zeyu Shang
Kai Shen
Wei Song
Xu Tan
Hao Tang
Zehao Wang
Chu Wei
Yifei Xin
Xinran Xu
Jianwei Yu
Y. Zhang
Xinyu Zhou
Y. Charles
Jianfei Chen
Yuxiao Chen
Yulun Du
Weiran He
Zhenxing Hu
Guokun Lai
Qingcheng Li
Yang Liu
Weidong Sun
Jiadong Wang
Yijiao Wang
Y. Wu
Yuxin Wu
Dongchao Yang
Hao Yang
Yiran Yang
Zhiyong Yang
Aoxiong Yin
Ruibin Yuan
Yanzhe Zhang
Zaida Zhou
AuLLM VLM

Papers citing "Kimi-Audio Technical Report"

18 / 18 papers shown
Title
Step-Audio-AQAA: a Fully End-to-End Expressive Large Audio Language Model
Ailin Huang
B. Li
Bruce Wang
Boyong Wu
Chao Yan
...
X. Zhang
Yibo Zhu
Daxin Jiang
Shuchang Zhou
Chen-Hao Hu
AuLLM
82
0
0
10 Jun 2025
Towards Efficient Speech-Text Jointly Decoding within One Speech Language Model
Haibin Wu
Yuxuan Hu
Ruchao Fan
Xiaofei Wang
K. Kumatani
...
J. Yu
Heng Lu
Lijuan Wang
Y. Qian
Jinyu Li
AuLLM
67
0
0
04 Jun 2025
LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning
Zebin You
Shen Nie
Xiaolu Zhang
Jun Hu
Jun Zhou
Zhiwu Lu
J. Wen
Chongxuan Li
MLLM VLM
114
2
0
22 May 2025
SALMONN-omni: A Standalone Speech LLM without Codec Injection for Full-duplex Conversation
Wenyi Yu
Siyin Wang
Xiaoyu Yang
Xianzhao Chen
Xiaohai Tian
Jun Zhang
Guangzhi Sun
Lu Lu
Yuxuan Wang
Chao Zhang
AuLLM
98
0
0
17 May 2025
Omni-R1: Do You Really Need Audio to Fine-Tune Your Audio LLM?
Andrew Rouditchenko
Saurabhchand Bhati
Edson Araujo
Samuel Thomas
Hilde Kuehne
Rogerio Feris
James R. Glass
AuLLM VLM
111
0
0
14 May 2025
WavReward: Spoken Dialogue Models With Generalist Reward Evaluators
Shengpeng Ji
Tianle Liang
Yongqian Li
Jialong Zuo
Minghui Fang
...
Xize Cheng
Siqi Zheng
Jin Xu
Junyang Lin
Zhou Zhao
AuLLM ALM
129
0
0
14 May 2025
Step1X-3D: Towards High-Fidelity and Controllable Generation of Textured 3D Assets
Weiyu Li
Xiao-Yong Zhang
Zheng Sun
Di Qi
Haoyang Li
...
Zeming Li
Gang Yu
Xiangyu Zhang
Daxin Jiang
Ping Tan
141
3
0
12 May 2025
Bridging Ears and Eyes: Analyzing Audio and Visual Large Language Models to Humans in Visible Sound Recognition and Reducing Their Sensory Gap via Cross-Modal Distillation
Xilin Jiang
Junkai Wu
Vishal B. Choudhari
N. Mesgarani
VLM
84
0
0
11 May 2025
A Synergistic Framework of Nonlinear Acoustic Computing and Reinforcement Learning for Real-World Human-Robot Interaction
Xiaoliang Chen
Xin Yu
Le Chang
Yunhe Huang
Jiashuai He
...
Jin Li
Likai Lin
Ziyu Zeng
Xianling Tu
Shuyu Zhang
110
1
0
04 May 2025
Qwen2.5-Omni Technical Report
Jin Xu
Zhifang Guo
Jinzheng He
Hangrui Hu
Ting He
...
K. Dang
Bin Zhang
Xinyu Wang
Yunfei Chu
Junyang Lin
VGen AuLLM
169
55
0
26 Mar 2025
Baichuan-Audio: A Unified Framework for End-to-End Speech Interaction
Tianpeng Li
Qingbin Liu
Tao Zhang
Yuanbo Fang
Zheng Liang
...
Bin Cui
Jianhua Xu
Haoze Sun
Guosheng Dong
Xin Wu
AuLLM
119
7
0
24 Feb 2025
Step-Audio: Unified Understanding and Generation in Intelligent Speech Interaction
Ailin Huang
Boyong Wu
Bruce Wang
Chao Yan
Chen Hu
...
Tianyu Wang
Wenjin Deng
Wuxun Xie
Weipeng Ming
Wenqing He
AuLLM
128
17
0
17 Feb 2025
Emilia: A Large-Scale, Extensive, Multilingual, and Diverse Dataset for Speech Generation
Haorui He
Zengqiang Shang
Chaoren Wang
Xuyuan Li
Yicheng Gu
...
Peiyang Shi
Yansen Wang
Kai Chen
Pengyuan Zhang
Zhikai Wu
AuLLM
143
5
0
28 Jan 2025
OSUM: Advancing Open Speech Understanding Models with Limited Resources in Academia
Xuelong Geng
Kun Wei
Qijie Shao
Shuiyun Liu
Zhennan Lin
...
Yuhang Dai
Xinfa Zhu
Yue Li
Li Zhang
Lei Xie
142
5
0
23 Jan 2025
Kimi k1.5: Scaling Reinforcement Learning with LLMs
Kimi Team
Angang Du
Bofei Gao
Bowei Xing
Changjiu Jiang
...
Zihao Huang
Ziyao Xu
Zhiyong Yang
Zonghan Yang
Zongyu Lin
OffRL ALM AI4TS VLM LRM
357
338
0
22 Jan 2025
MinMo: A Multimodal Large Language Model for Seamless Voice Interaction
Qian Chen
Yafeng Chen
Yanni Chen
Mengzhe Chen
Yuxiao Chen
...
Shiliang Zhang
Nan Zhao
Pei Zhang
Chuxu Zhang
Jinren Zhou
AuLLM MLLM
116
24
0
10 Jan 2025
OmniFlatten: An End-to-end GPT Model for Seamless Voice Conversation
Qinglin Zhang
Luyao Cheng
Chong Deng
Qian Chen
Wen Wang
...
Jiaqing Liu
Hai Yu
Chaohong Tan
Zhihao Du
Shiliang Zhang
SyDa BDL AuLLM VLM
146
20
0
23 Oct 2024
Quality-aware Masked Diffusion Transformer for Enhanced Music Generation
Chang Li
Ruoyu Wang
Lijuan Liu
Jun Du
Yixuan Sun
Zilu Guo
Zhenrong Zhang
Yuan Jiang
J. Gao
Feng Ma
127
5
0
24 May 2024