Towards Bitrate-Efficient and Noise-Robust Speech Coding with Variable Bitrate RVQ

19 June 2025

Main: 4 pages

4 figures

Bibliography: 1 page

1 table

Abstract

Residual Vector Quantization (RVQ) has become a dominant approach in neural speech and audio coding, providing high-fidelity compression. However, speech coding presents additional challenges due to real-world noise, which degrades compression efficiency. Standard codecs allocate bits uniformly, wasting bitrate on noise components that do not contribute to intelligibility. This paper introduces a Variable Bitrate RVQ (VRVQ) framework for noise-robust speech coding, which dynamically adjusts the bitrate per frame to optimize the rate-distortion trade-off. Unlike constant bitrate (CBR) RVQ, our method prioritizes critical speech components while suppressing residual noise. Additionally, we integrate a feature denoiser to further improve noise robustness. Experimental results show that VRVQ improves the rate-distortion trade-off over conventional methods, achieving better compression efficiency and perceptual quality in noisy conditions. Samples are available on our project page.
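
To make the idea of per-frame bit allocation concrete, the following is a minimal sketch of residual vector quantization in which each frame may use a different number of quantizer stages. It is not the authors' implementation: the codebooks are random rather than learned, the per-frame stage counts in `stages_per_frame` are hand-picked placeholders for whatever allocation the model would predict, and all function names are hypothetical. Each toy codebook includes an all-zero codeword, an assumption made here so that adding a stage can never increase the reconstruction error.

```python
import math
import random


def nearest(codebook, vec):
    """Index of the codeword closest to vec (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)))


def rvq_encode(frame, codebooks, n_stages):
    """Greedy RVQ: each stage quantizes the residual left by the previous one."""
    residual = list(frame)
    indices = []
    for codebook in codebooks[:n_stages]:
        idx = nearest(codebook, residual)
        indices.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return indices


def rvq_decode(indices, codebooks, dim):
    """Sum the selected codewords of the stages that were actually used."""
    out = [0.0] * dim
    for idx, codebook in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


def frame_bits(n_stages, codebook_size):
    """Bits spent on one frame: log2(codebook size) per stage used."""
    return n_stages * math.log2(codebook_size)


def make_toy_codebooks(n_stages, size, dim, seed=0):
    """Random codebooks with shrinking scale (assumption: real ones are learned)."""
    rng = random.Random(seed)
    books = []
    for stage in range(n_stages):
        scale = 0.5 ** stage
        book = [[0.0] * dim]  # zero codeword: a stage may leave the residual unchanged
        book += [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(size - 1)]
        books.append(book)
    return books


def sq_error(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


DIM, SIZE, MAX_STAGES = 4, 16, 4
codebooks = make_toy_codebooks(MAX_STAGES, SIZE, DIM)

frame_rng = random.Random(1)
frames = [[frame_rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(3)]

# Hypothetical allocation: more stages for frames judged important (e.g. speech),
# fewer for frames judged to carry mostly noise.
stages_per_frame = [4, 1, 2]

for frame, n in zip(frames, stages_per_frame):
    codes = rvq_encode(frame, codebooks, n)
    recon = rvq_decode(codes, codebooks, DIM)
    print(f"stages={n} bits={frame_bits(n, SIZE):.0f} error={sq_error(frame, recon):.4f}")
```

A constant-bitrate coder corresponds to using the same stage count for every frame; the variable-bitrate setting differs only in that the count is chosen per frame, so the total rate is the sum of `frame_bits` over frames.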


@article{chae2025_2506.16538,
  title={Towards Bitrate-Efficient and Noise-Robust Speech Coding with Variable Bitrate RVQ},
  author={Yunkee Chae and Kyogu Lee},
  journal={arXiv preprint arXiv:2506.16538},
  year={2025}
}
