ViSQOL v3: An Open Source Production Ready Objective Speech and Audio Metric

20 April 2020
Michael Chinen
Felicia S. C. Lim
Jan Skoglund
Nikita Gureev
F. O'Gorman
Andrew Hines

Papers citing "ViSQOL v3: An Open Source Production Ready Objective Speech and Audio Metric"

50 / 71 papers shown
A Streamable Neural Audio Codec with Residual Scalar-Vector Quantization for Real-Time Communication
Xiao-Hang Jiang
Yang Ai
Rui Zheng
Zhen-Hua Ling
36
0
0
09 Apr 2025
UniSep: Universal Target Audio Separation with Language Models at Scale
Yishuo Wang
Hangting Chen
Dongchao Yang
Weiqin Li
Dan Luo
Guangzhi Li
Shan Yang
Zhiyong Wu
Helen Meng
Xixin Wu
VLM
52
1
0
31 Mar 2025
STFTCodec: High-Fidelity Audio Compression through Time-Frequency Domain Representation
Tao Feng
Zhiyuan Zhao
Yifan Xie
Yuqi Ye
Xiangyang Luo
Xun Guan
Yongqian Li
57
0
0
21 Mar 2025
Universal Speech Token Learning via Low-Bitrate Neural Codec and Pretrained Representations
Xue Jiang
Xiulian Peng
Yuan Zhang
Yan-Heng Lu
SSL
88
0
0
15 Mar 2025
UniWav: Towards Unified Pre-training for Speech Representation Learning and Generation
Alexander H. Liu
Sang-gil Lee
Chao-Han Huck Yang
Yuan Gong
Yu-Chun Wang
James Glass
Rafael Valle
Bryan Catanzaro
SSL
60
0
0
02 Mar 2025
SoundSpring: Loss-Resilient Audio Transceiver with Dual-Functional Masked Language Modeling
Shengshi Yao
Jincheng Dai
Xiaoqi Qin
Sixian Wang
Siye Wang
K. Niu
Ping Zhang
38
0
0
22 Jan 2025
ESTVocoder: An Excitation-Spectral-Transformed Neural Vocoder Conditioned on Mel Spectrogram
Xiao-Hang Jiang
Hui-Peng Du
Yang Ai
Ye-Xin Lu
Zhen-Hua Ling
35
0
0
18 Nov 2024
MDCTCodec: A Lightweight MDCT-based Neural Audio Codec towards High Sampling Rate and Low Bitrate Scenarios
Xiao-Hang Jiang
Yang Ai
Rui Zheng
Hui-Peng Du
Ye-Xin Lu
Zhen-Hua Ling
60
2
0
01 Nov 2024
A Closer Look at Neural Codec Resynthesis: Bridging the Gap between Codec and Waveform Generation
Alexander H. Liu
Qirui Wang
Yuan Gong
James Glass
38
0
0
29 Oct 2024
Beyond Correlation: Evaluating Multimedia Quality Models with the Constrained Concordance Index
Alessandro Ragano
H. B. Martinez
Andrew Hines
49
2
0
24 Oct 2024
Non-intrusive Speech Quality Assessment with Diffusion Models Trained on Clean Speech
Danilo de Oliveira
Julius Richter
Jean-Marie Lemercier
Simon Welker
Timo Gerkmann
DiffM
18
2
0
23 Oct 2024
SCOREQ: Speech Quality Assessment with Contrastive Regression
Alessandro Ragano
Jan Skoglund
Andrew Hines
40
6
0
09 Oct 2024
Variable Bitrate Residual Vector Quantization for Audio Coding
Yunkee Chae
Woosung Choi
Yuhta Takida
Junghyun Koo
Yukara Ikemiya
...
K. Cheuk
Marco A. Martínez-Ramírez
Kyogu Lee
Wei-Hsiang Liao
Yuki Mitsufuji
91
0
0
08 Oct 2024
Analyzing and Mitigating Inconsistency in Discrete Audio Tokens for Neural Codec Language Models
Wenrui Liu
Zhifang Guo
Jin Xu
Yuanjun Lv
Yunfei Chu
Zhou Zhao
Junyang Lin
59
1
0
28 Sep 2024
Semi-intrusive audio evaluation: Casting non-intrusive assessment as a multi-modal text prediction task
Jozef Coldenhoff
Milos Cernak
41
0
0
21 Sep 2024
WMCodec: End-to-End Neural Speech Codec with Deep Watermarking for Authenticity Verification
Junzuo Zhou
Jiangyan Yi
Yong Ren
Jianhua Tao
Tao Wang
Chu Yuan Zhang
34
4
0
18 Sep 2024
OpenACE: An Open Benchmark for Evaluating Audio Coding Performance
Jozef Coldenhoff
Niclas Granqvist
Milos Cernak
33
0
0
12 Sep 2024
WaveTransfer: A Flexible End-to-end Multi-instrument Timbre Transfer with Diffusion
Teysir Baoueb
Xiaoyu Bie
Hicham Janati
Gaël Richard
DiffM
26
0
0
06 Sep 2024
Investigating Neural Audio Codecs for Speech Language Model-Based Speech Generation
Jiaqi Li
Dongmei Wang
Xiaofei Wang
Yao Qian
Long Zhou
...
Junkun Chen
Sheng Zhao
Jinyu Li
Zhizheng Wu
Michael Zeng
AuLLM
38
3
0
06 Sep 2024
Music2Latent: Consistency Autoencoders for Latent Audio Compression
Marco Pasini
Stefan Lattner
George Fazekas
24
6
0
12 Aug 2024
SuperCodec: A Neural Speech Codec with Selective Back-Projection Network
Youqiang Zheng
Weiping Tu
Li Xiao
Xinmeng Xu
40
3
0
30 Jul 2024
Enhancing Out-of-Vocabulary Performance of Indian TTS Systems for Practical Applications through Low-Effort Data Strategies
Srija Anand
Praveena Varadhan
Ashwin Sankar
Giri Raju
Mitesh M. Khapra
45
1
0
18 Jul 2024
On Improving Error Resilience of Neural End-to-End Speech Coders
Kishan Gupta
N. Pia
Srikanth Korse
Andreas Brendel
Guillaume Fuchs
M. Multrus
58
0
0
13 Jun 2024
Multi-Stage Speech Bandwidth Extension with Flexible Sampling Rate Control
Ye-Xin Lu
Yang Ai
Zheng-Yan Sheng
Zhen-Hua Ling
23
1
0
04 Jun 2024
HILCodec: High Fidelity and Lightweight Neural Audio Codec
S. Ahn
Beom Jun Woo
Mingrui Han
Chanyeong Moon
Nam Soo Kim
34
6
0
08 May 2024
PEAVS: Perceptual Evaluation of Audio-Visual Synchrony Grounded in Viewers' Opinion Scores
Lucas Goncalves
Prashant Mathur
Chandrashekhar Lavania
Metehan Cekic
Marcello Federico
Kyu J. Han
22
4
0
10 Apr 2024
Gull: A Generative Multifunctional Audio Codec
Yi Luo
Jianwei Yu
Hangting Chen
Rongzhi Gu
Chao Weng
AuLLM
46
3
0
07 Apr 2024
Dynamic Switch Layers For Unsupervised Learning
Haiguang Li
Usama Pervaiz
Michal Matuszak
Robert Kamara
Gilles Roux
T. Thormundsson
Joseph Antognini
52
1
0
05 Apr 2024
CLaM-TTS: Improving Neural Codec Language Model for Zero-Shot Text-to-Speech
Jaehyeon Kim
Keon Lee
Seungjun Chung
Jaewoong Cho
74
41
0
03 Apr 2024
RFWave: Multi-band Rectified Flow for Audio Waveform Reconstruction
Peng Liu
Dongyang Dai
Zhiyong Wu
46
2
0
08 Mar 2024
Self-Supervised Speech Quality Estimation and Enhancement Using Only Clean Speech
Szu-Wei Fu
Kuo-Hsuan Hung
Yu Tsao
Yu-Chiang Frank Wang
SSL
27
11
0
26 Feb 2024
APCodec: A Neural Audio Codec with Parallel Amplitude and Phase Spectrum Encoding and Decoding
Yang Ai
Xiao-Hang Jiang
Ye-Xin Lu
Hui-Peng Du
Zhenhua Ling
26
20
0
16 Feb 2024
An Intra-BRNN and GB-RVQ Based END-TO-END Neural Audio Codec
Linping Xu
Jiawei Jiang
Dejun Zhang
Xianjun Xia
Li Chen
Yijian Xiao
Piao Ding
Shenyi Song
Sixing Yin
Ferdous Sohel
37
6
0
02 Feb 2024
Towards High-Quality and Efficient Speech Bandwidth Extension with Parallel Amplitude and Phase Prediction
Ye-Xin Lu
Yang Ai
Hui-Peng Du
Zhenhua Ling
30
6
0
12 Jan 2024
DPM-TSE: A Diffusion Probabilistic Model for Target Sound Extraction
Jiarui Hai
Helin Wang
Dongchao Yang
Karan Thakkar
Najim Dehak
Mounya Elhilali
DiffM
31
7
0
06 Oct 2023
NOMAD: Unsupervised Learning of Perceptual Embeddings for Speech Enhancement and Non-matching Reference Audio Quality Assessment
Alessandro Ragano
Jan Skoglund
Andrew Hines
25
9
0
28 Sep 2023
Optimization Techniques for a Physical Model of Human Vocalisation
Mateo Cámara
Zhiyuan Xu
Yi-Chen Zong
José-Luis Blanco
Joshua D. Reiss
19
3
0
26 Sep 2023
Fewer-token Neural Speech Codec with Time-invariant Codes
Yong Ren
Tao Wang
Jiangyan Yi
Le Xu
Jianhua Tao
Chuyuan Zhang
Jun Zhou
22
33
0
15 Sep 2023
Rep2wav: Noise Robust text-to-speech Using self-supervised representations
Qiu-shi Zhu
Yunting Gu
Rilin Chen
Chao Weng
Yuchen Hu
Lirong Dai
Jie Zhang
AI4TS
53
3
0
28 Aug 2023
TokenSplit: Using Discrete Speech Representations for Direct, Refined, and Transcript-Conditioned Speech Separation and Recognition
Hakan Erdogan
Scott Wisdom
Xuankai Chang
Zalan Borsos
Marco Tagliasacchi
Neil Zeghidour
J. Hershey
21
9
0
21 Aug 2023
Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement
Ye-Xin Lu
Yang Ai
Zhenhua Ling
30
8
0
17 Aug 2023
AudioVMAF: Audio Quality Prediction with VMAF
A. Biswas
H. Mundt
13
2
0
07 Aug 2023
From Discrete Tokens to High-Fidelity Audio Using Multi-Band Diffusion
Robin San Roman
Yossi Adi
Antoine Deleforge
Romain Serizel
Gabriel Synnaeve
Alexandre Défossez
DiffM
27
21
0
02 Aug 2023
CQNV: A combination of coarsely quantized bitstream and neural vocoder for low rate speech coding
Youqiang Zheng
Li Xiao
Weiping Tu
Yuhong Yang
Xinmeng Xu
41
6
0
25 Jul 2023
An Improved Metric of Informational Masking for Perceptual Audio Quality Measurement
Pablo M. Delgado
Jürgen Herre
16
0
0
13 Jul 2023
Siamese SIREN: Audio Compression with Implicit Neural Representations
Luca A. Lanzendörfer
Roger Wattenhofer
32
9
0
22 Jun 2023
High-Fidelity Audio Compression with Improved RVQGAN
Rithesh Kumar
Prem Seetharaman
Alejandro Luebs
I. Kumar
Kundan Kumar
56
290
0
11 Jun 2023
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
Hubert Siuzdak
37
79
0
01 Jun 2023
What You Hear Is What You See: Audio Quality Metrics From Image Quality Metrics
Tashi Namgyal
Alexander Hepburn
Raúl Santos-Rodríguez
Valero Laparra
Jesús Malo
27
1
0
19 May 2023
Privacy in Speech Technology
Tomas Bäckström
32
4
0
09 May 2023