HiFi-Codec: Group-residual Vector quantization for High Fidelity Audio Codec

4 May 2023
Dongchao Yang
Songxiang Liu
Rongjie Huang
Jinchuan Tian
Chao Weng
Yuexian Zou
ArXiv (abs) | PDF | HTML | GitHub (620★)

Papers citing "HiFi-Codec: Group-residual Vector quantization for High Fidelity Audio Codec"

27 / 27 papers shown
FreeCodec: A disentangled neural speech codec with fewer tokens
Youqiang Zheng
Weiping Tu
Yueteng Kang
Jie Chen
Yike Zhang
Li Xiao
Yuhong Yang
Long Ma
134
4
0
01 Jul 2025
Vision-Integrated High-Quality Neural Speech Coding
Yao Guo
Yang Ai
Rui Zheng
Hui-Peng Du
Xiao-Hang Jiang
Zhen-Hua Ling
42
0
0
29 May 2025
EASY: Emotion-aware Speaker Anonymization via Factorized Distillation
Jixun Yao
Hexin Liu
Eng Siong Chng
Lei Xie
57
0
0
21 May 2025
Multi-band Frequency Reconstruction for Neural Psychoacoustic Coding
Dianwen Ng
Kun Zhou
Yi-Wen Chao
Zhiwei Xiong
B. Ma
Eng Siong Chng
93
0
0
12 May 2025
USM-VC: Mitigating Timbre Leakage with Universal Semantic Mapping Residual Block for Voice Conversion
Na Li
Chuke Wang
Yu Gu
Zhifeng Li
154
0
0
11 Apr 2025
One Quantizer is Enough: Toward a Lightweight Audio Codec
Linwei Zhai
Yunpeng Song
Cui Zhao
Fei Wang
Ge Wang
Wang Zhi
Wei Xi
MQ
94
0
0
07 Apr 2025
Versatile Physics-based Character Control with Hybrid Latent Representation
Jinseok Bae
Jungdam Won
Donggeun Lim
I. Hwang
Y. Kim
92
0
0
17 Mar 2025
Vector-Quantized Vision Foundation Models for Object-Centric Learning
Rongzhen Zhao
V. Wang
Arno Solin
Joni Pajarinen
OCL, VLM
565
1
0
27 Feb 2025
SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation
Ziqiang Liu
Shuangrui Ding
Zhixiong Zhang
Xiaoyi Dong
Pan Zhang
Yuhang Zang
Yuhang Cao
Dahua Lin
Jiaqi Wang
134
3
0
18 Feb 2025
Do we really have to filter out random noise in pre-training data for language models?
Jinghan Ru
Yuxin Xie
Xianwei Zhuang
Yuguo Yin
Zhihui Guo
Zhiming Liu
Qianli Ren
Yuexian Zou
193
6
0
10 Feb 2025
AudioMiXR: Spatial Audio Object Manipulation with 6DoF for Sound Design in Augmented Reality
Brandon Woodard
Margarita Geleta
Joseph J. LaViola Jr.
Andrea Fanelli
Rhonda Wilson
174
4
0
05 Feb 2025
Grouped Discrete Representation for Object-Centric Learning
Rongzhen Zhao
V. Wang
Arno Solin
Joni Pajarinen
BDL, OCL
88
1
0
04 Nov 2024
APCodec+: A Spectrum-Coding-Based High-Fidelity and High-Compression-Rate Neural Audio Codec with Staged Training Paradigm
Hui-Peng Du
Yang Ai
Rui Zheng
Zhen-Hua Ling
60
2
0
30 Oct 2024
Continuous Speech Tokenizer in Text To Speech
Yixing Li
Ruobing Xie
Xingwu Sun
Yu Cheng
Zhanhui Kang
AuLLM, CLL
128
2
0
22 Oct 2024
SF-Speech: Straightened Flow for Zero-Shot Voice Clone
Xuyuan Li
Zengqiang Shang
Hua Hua
Peiyang Shi
Chen Yang
Li Wang
Pengyuan Zhang
153
3
0
16 Oct 2024
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching
Yushen Chen
Zhikang Niu
Ziyang Ma
Keqi Deng
Chunhui Wang
Jian Zhao
Kai Yu
Xie Chen
145
92
0
09 Oct 2024
Recent Advances in Speech Language Models: A Survey
Wenqian Cui
Dianzhi Yu
Xiaoqi Jiao
Ziqiao Meng
Guangyan Zhang
Qichao Wang
Yiwen Guo
Irwin King
AuLLM
208
26
0
01 Oct 2024
Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model
Zhen Ye
Peiwen Sun
Jiahe Lei
Hongzhan Lin
Xu Tan
...
Jianyi Chen
Jiahao Pan
Qifeng Liu
Yike Guo
Wei Xue
AuLLM
72
19
0
30 Aug 2024
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling
Shengpeng Ji
Ziyue Jiang
Xize Cheng
Yifu Chen
Minghui Fang
...
Rongjie Huang
Yidi Jiang
Qian Chen
Zhou Zhao
VLM
149
45
0
29 Aug 2024
VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling
Zeyue Tian
Zhaoyang Liu
Ruibin Yuan
Jiahao Pan
Xiaoqiang Huang
Xu Tan
Qifeng Chen
Yu Guo
VGen
275
17
0
06 Jun 2024
The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio
Yuankun Xie
Yi Lu
Ruibo Fu
Zhengqi Wen
Zhiyong Wang
...
Xiaopeng Wang
Yukun Liu
Haonan Cheng
Long Ye
Yi Sun
98
21
0
08 May 2024
Towards audio language modeling -- an overview
Haibin Wu
Xuanjun Chen
Yi-Cheng Lin
Kai-Wei Chang
Ho-Lam Chung
Alexander H. Liu
Hung-yi Lee
AuLLM
110
35
0
20 Feb 2024
Efficient Parallel Audio Generation using Group Masked Language Modeling
Myeonghun Jeong
Minchan Kim
Joun Yeop Lee
Nam Soo Kim
58
6
0
02 Jan 2024
UniAudio: An Audio Foundation Model Toward Universal Audio Generation
Dongchao Yang
Jinchuan Tian
Xuejiao Tan
Rongjie Huang
Songxiang Liu
...
Jiang Bian
Xixin Wu
Zhou Zhao
Shinji Watanabe
Helen M. Meng
CVBM, AuLLM
135
128
0
01 Oct 2023
InstructTTS: Modelling Expressive TTS in Discrete Latent Space with Natural Language Style Prompt
Dongchao Yang
Songxiang Liu
Rongjie Huang
Chao Weng
Helen Meng
DiffM, VLM
89
102
0
31 Jan 2023
Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models
Rongjie Huang
Jia-Bin Huang
Dongchao Yang
Yi Ren
Luping Liu
Mingze Li
Zhenhui Ye
Jinglin Liu
Xiaoyue Yin
Zhou Zhao
DiffM
238
344
0
30 Jan 2023
Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers
Chengyi Wang
Sanyuan Chen
Yu-Huan Wu
Zi-Hua Zhang
Long Zhou
...
Huaming Wang
Jinyu Li
Lei He
Sheng Zhao
Furu Wei
193
727
0
05 Jan 2023