ResearchTrend.AI
Step-Audio: Unified Understanding and Generation in Intelligent Speech Interaction

17 February 2025
Ailin Huang
Boyong Wu
Bruce Wang
Chao Yan
Chen Hu
Chengli Feng
Fei Tian
Feiyu Shen
Jiashi Li
Mengzhao Chen
Peng Liu
Ruihang Miao
Wang You
Xi Chen
Xuerui Yang
Yuanmin Huang
Yuxiang Zhang
Zheng Gong
Zixin Zhang
Hongyu Zhou
Jianjian Sun
B. Li
Chengting Feng
Changyi Wan
Hanpeng Hu
Jianchang Wu
Jiangjie Zhen
Ranchen Ming
Song Yuan
Xinming Zhang
Yu Zhou
Yangqiu Song
Buyun Ma
Haoran Wang
Kang An
Wei Ji
W. Li
Xuan Wen
Xiangwen Kong
Yuankai Ma
Yuanwei Liang
Yun Mou
Bahtiyar Ahmidi
Bin Wang
Bo-wen Li
Changxin Miao
C. Xu
Chenrun Wang
Dapeng Shi
Deshan Sun
Dingyuan Hu
Dula Sai
Enle Liu
Guanzhe Huang
Gulin Yan
Han Wang
Haonan Jia
H. Zhang
Jiahao Gong
J. Guo
Xiaozhong Liu
Jing Liu
Jie Feng
Jie Wu
J. Wu
Jie Yang
J. T. Wang
Jingyang Zhang
Junzhe Lin
K. Li
Lei Xia
Li Zhou
Liang Zhao
Longlong Gu
Mei Chen
Menglin Wu
Ming Li
Mingxiao Li
M. Li
Mingyao Liang
Na Wang
Nie Hao
Qiling Wu
Qinyuan Tan
R.-H. Sun
Shri Kiran Srinivasan
Shaoliang Pang
Steve Yang
Shuli Gao
Shanshan Yuan
Siqi Liu
Shihong Deng
Shilei Jiang
Shixuan Liu
Tiancheng Cao
Tianyu Wang
Wenjin Deng
Wuxun Xie
Weipeng Ming
Wenqing He
    AuLLM

Papers citing "Step-Audio: Unified Understanding and Generation in Intelligent Speech Interaction"

7 / 7 papers shown
Audio Turing Test: Benchmarking the Human-likeness of Large Language Model-based Text-to-Speech Systems in Chinese
Xinyu Wang
Ziyi Zhao
Siyu Ren
Shao Zhang
Song Li
...
Lin Qiu
Guanglu Wan
Xuezhi Cao
Xunliang Cai
Weinan Zhang
ALM
32
0
0
16 May 2025
WavReward: Spoken Dialogue Models With Generalist Reward Evaluators
Shengpeng Ji
Tianle Liang
Yong Li
Jialong Zuo
Minghui Fang
...
Xize Cheng
Siqi Zheng
Jin Xu
Junyang Lin
Zhou Zhao
AuLLM
ALM
33
0
0
14 May 2025
Step1X-3D: Towards High-Fidelity and Controllable Generation of Textured 3D Assets
Weiyu Li
Xuanyang Zhang
Zheng Sun
Di Qi
Yiming Li
...
Zeming Li
Gang Yu
Xiangyu Zhang
Daxin Jiang
Ping Tan
46
0
0
12 May 2025
Muyan-TTS: A Trainable Text-to-Speech Model Optimized for Podcast Scenarios with a $50K Budget
Xin Li
Kaikai Jia
Hao Sun
Jun Dai
Z. L. Jiang
158
0
0
27 Apr 2025
Kimi-Audio Technical Report
KimiTeam
Ding Ding
Zeqian Ju
Yichong Leng
Shixuan Liu
...
Zhengyuan Yang
Aoxiong Yin
Ruibin Yuan
Wenjie Qu
Zaida Zhou
AuLLM
VLM
110
5
0
25 Apr 2025
Exploring Hallucination of Large Multimodal Models in Video Understanding: Benchmark, Analysis and Mitigation
Hongcheng Gao
Jiashu Qu
Jingyi Tang
Baolong Bi
Yi Liu
Hongyu Chen
Li Liang
Li Su
Qingming Huang
MLLM
VLM
LRM
85
5
0
25 Mar 2025
Empowering Time Series Analysis with Synthetic Data: A Survey and Outlook in the Era of Foundation Models
Xu Liu
Taha Aksu
Juncheng Liu
Qingsong Wen
Keli Zhang
Caiming Xiong
Shri Kiran Srinivasan
Doyen Sahoo
Junnan Li
Chenghao Liu
AI4TS
47
0
0
14 Mar 2025