2306.09417
Cited By
Diff-TTSG: Denoising probabilistic integrated speech and gesture synthesis
15 June 2023
Shivam Mehta
Siyang Wang
Simon Alexanderson
Jonas Beskow
Éva Székely
G. Henter
DiffM
Papers citing
"Diff-TTSG: Denoising probabilistic integrated speech and gesture synthesis"
31 / 31 papers shown
Title
Evaluating gesture generation in a large-scale open challenge: The GENEA Challenge 2022
Taras Kucherenko
Pieter Wolfert
Youngwoo Yoon
Carla Viegas
Teodor Nikolov
Mihail Tsakov
G. Henter
55
24
0
15 Mar 2023
DiffMotion: Speech-Driven Gesture Synthesis Using Denoising Diffusion Model
Fan Zhang
Naye Ji
Fuxing Gao
Yongping Li
DiffM
VGen
64
28
0
24 Jan 2023
A Comprehensive Review of Data-Driven Co-Speech Gesture Generation
Simbarashe Nyatsanga
Taras Kucherenko
Chaitanya Ahuja
G. Henter
Michael Neff
SLR
65
92
0
13 Jan 2023
Robust Speech Recognition via Large-Scale Weak Supervision
Alec Radford
Jong Wook Kim
Tao Xu
Greg Brockman
C. McLeavey
Ilya Sutskever
OffRL
209
3,750
0
06 Dec 2022
Listen, Denoise, Action! Audio-Driven Motion Synthesis with Diffusion Models
Simon Alexanderson
Rajmund Nagy
Jonas Beskow
G. Henter
DiffM
VGen
70
172
0
17 Nov 2022
Grad-StyleSpeech: Any-speaker Adaptive Text-to-Speech Synthesis with Diffusion Models
Minki Kang
Dong Min
Sung Ju Hwang
DiffM
88
50
0
17 Nov 2022
OverFlow: Putting flows on top of neural transducers for better TTS
Shivam Mehta
Ambika Kirkland
Harm Lameris
Jonas Beskow
Éva Székely
G. Henter
AI4TS
78
13
0
13 Nov 2022
ZeroEGGS: Zero-shot Example-based Gesture Generation from Speech
Saeed Ghorbani
Ylva Ferstl
Daniel Holden
N. Troje
M. Carbonneau
91
82
0
15 Sep 2022
The GENEA Challenge 2022: A large evaluation of data-driven co-speech gesture generation
Youngwoo Yoon
Pieter Wolfert
Taras Kucherenko
Carla Viegas
Teodor Nikolov
Mihail Tsakov
G. Henter
VGen
62
81
0
22 Aug 2022
NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality
Xu Tan
Jiawei Chen
Haohe Liu
Jian Cong
Chen Zhang
...
Lei He
Frank Soong
Tao Qin
Sheng Zhao
Tie-Yan Liu
109
221
0
09 May 2022
Guided-TTS: A Diffusion Model for Text-to-Speech via Classifier Guidance
Heeseung Kim
Sungwon Kim
Sungroh Yoon
DiffM
BDL
84
112
0
23 Nov 2021
Neural HMMs are all you need (for high-quality attention-free TTS)
Shivam Mehta
Éva Székely
Jonas Beskow
G. Henter
64
18
0
30 Aug 2021
Integrated Speech and Gesture Synthesis
Siyang Wang
Simon Alexanderson
Joakim Gustafson
Jonas Beskow
G. Henter
Éva Székely
78
19
0
25 Aug 2021
Passing a Non-verbal Turing Test: Evaluating Gesture Animations Generated from Speech
M. Rebol
Christian Gütl
Krzysztof Pietroszek
SLR
47
24
0
01 Jul 2021
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
118
359
0
29 Jun 2021
WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis
Nanxin Chen
Yu Zhang
Heiga Zen
Ron J. Weiss
Mohammad Norouzi
Najim Dehak
William Chan
DiffM
56
88
0
17 Jun 2021
Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech
Vadim Popov
Ivan Vovk
Vladimir Gogoryan
Tasnima Sadekova
Mikhail Kudinov
DiffM
107
543
0
13 May 2021
Diff-TTS: A Denoising Diffusion Model for Text-to-Speech
Myeonghun Jeong
Hyeongju Kim
Sung Jun Cheon
Byoung Jin Choi
N. Kim
DiffM
65
197
0
03 Apr 2021
Generating coherent spontaneous speech and gesture from text
Simon Alexanderson
Éva Székely
G. Henter
Taras Kucherenko
Jonas Beskow
SLR
166
23
0
14 Jan 2021
Score-Based Generative Modeling through Stochastic Differential Equations
Yang Song
Jascha Narain Sohl-Dickstein
Diederik P. Kingma
Abhishek Kumar
Stefano Ermon
Ben Poole
DiffM
SyDa
353
6,586
0
26 Nov 2020
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Jungil Kong
Jaehyeon Kim
Jaekyoung Bae
179
1,952
0
12 Oct 2020
DiffWave: A Versatile Diffusion Model for Audio Synthesis
Zhifeng Kong
Ming-Yu Liu
Jiaji Huang
Kexin Zhao
Bryan Catanzaro
DiffM
BDL
161
1,468
0
21 Sep 2020
Speech Gesture Generation from the Trimodal Context of Text, Audio, and Speaker Identity
Youngwoo Yoon
Bok Cha
Joo-Haeng Lee
Minsu Jang
Jaeyeon Lee
Jaehong Kim
Geehyuk Lee
51
283
0
04 Sep 2020
Let's Face It: Probabilistic Multi-modal Interlocutor-aware Generation of Facial Gestures in Dyadic Settings
Patrik Jonell
Taras Kucherenko
G. Henter
Jonas Beskow
CVBM
71
61
0
11 Jun 2020
Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search
Jaehyeon Kim
Sungwon Kim
Jungil Kong
Sungroh Yoon
105
497
0
22 May 2020
Conformer: Convolution-augmented Transformer for Speech Recognition
Anmol Gulati
James Qin
Chung-Cheng Chiu
Niki Parmar
Yu Zhang
...
Wei Han
Shibo Wang
Zhengdong Zhang
Yonghui Wu
Ruoming Pang
229
3,160
0
16 May 2020
Gesticulator: A framework for semantically-aware speech-driven gesture generation
Taras Kucherenko
Patrik Jonell
S. V. Waveren
G. Henter
Simon Alexanderson
Iolanda Leite
Hedvig Kjellström
SLR
57
180
0
25 Jan 2020
Generative Modeling by Estimating Gradients of the Data Distribution
Yang Song
Stefano Ermon
SyDa
DiffM
258
3,961
0
12 Jul 2019
Speech waveform synthesis from MFCC sequences with generative adversarial networks
Lauri Juvela
Bajibabu Bollepalli
Xin Wang
Hirokazu Kameoka
Manu Airaksinen
Junichi Yamagishi
P. Alku
49
52
0
03 Apr 2018
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
Jonathan Shen
Ruoming Pang
Ron J. Weiss
M. Schuster
Navdeep Jaitly
...
Yuxuan Wang
RJ Skerry-Ryan
Rif A. Saurous
Yannis Agiomyrgiannakis
Yonghui Wu
85
2,704
0
16 Dec 2017
WaveNet: A Generative Model for Raw Audio
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
DiffM
406
7,421
0
12 Sep 2016