Emotional Speech-driven 3D Body Animation via Disentangled Latent Diffusion

7 December 2023
Kiran Chhatre
Radek Daněček
Nikos Athanasiou
Giorgio Becherini
Christopher Peters
Michael J. Black
Timo Bolkart
    DiffM

Papers citing "Emotional Speech-driven 3D Body Animation via Disentangled Latent Diffusion"

50 / 52 papers shown
Title
Supervising 3D Talking Head Avatars with Analysis-by-Audio-Synthesis
Radek Daněček
Carolin Schmitt
Senya Polikovsky
Michael J. Black
95
1
0
18 Apr 2025
ProbTalk3D: Non-Deterministic Emotion Controllable Speech-Driven 3D Facial Animation Synthesis Using VQ-VAE
Sichun Wu
Kazi Injamamul Haque
Zerrin Yumak
VGen
73
2
0
12 Sep 2024
DEGAS: Detailed Expressions on Full-Body Gaussian Avatars
Zhijing Shao
D. B. Wang
Qing-Yao Tian
Yao-Dong Yang
Hengyu Meng
Zeyu Cai
Bo Dong
Yu Zhang
Kang Zhang
Zhaoxiang Wang
3DGS
77
4
0
20 Aug 2024
MambaTalk: Efficient Holistic Gesture Synthesis with Selective State Space Models
Zunnan Xu
Yukang Lin
Haonan Han
Sicheng Yang
Ronghui Li
Yachao Zhang
Xiu Li
Mamba
102
25
0
14 Mar 2024
Diff-TTSG: Denoising probabilistic integrated speech and gesture synthesis
Shivam Mehta
Siyang Wang
Simon Alexanderson
Jonas Beskow
Éva Székely
G. Henter
DiffM
45
14
0
15 Jun 2023
EmotionGesture: Audio-Driven Diverse Emotional Co-Speech 3D Gesture Generation
Xingqun Qi
Chen Liu
Lincheng Li
Jie Hou
Haoran Xin
Xin Yu
SLR
69
30
0
30 May 2023
QPGesture: Quantization-Based and Phase-Guided Motion Matching for Natural Speech-Driven Gesture Generation
Sicheng Yang
Zhiyong Wu
Minglei Li
Zhensong Zhang
Lei Hao
Weihong Bao
Hao-Wen Zhuang
SLR
84
46
0
18 May 2023
EmoTalk: Speech-Driven Emotional Disentanglement for 3D Face Animation
Ziqiao Peng
Hao Wu
Zhenbo Song
Hao-Xuan Xu
Xiangyu Zhu
Jun He
Hongyan Liu
Zhaoxin Fan
CVBM
78
106
0
20 Mar 2023
A Comprehensive Review of Data-Driven Co-Speech Gesture Generation
Simbarashe Nyatsanga
Taras Kucherenko
Chaitanya Ahuja
G. Henter
Michael Neff
SLR
59
92
0
13 Jan 2023
Generating Holistic 3D Human Motion from Speech
Hongwei Yi
Hualin Liang
Yifei Liu
Qiong Cao
Yandong Wen
Timo Bolkart
Dacheng Tao
Michael J. Black
SLR
68
149
0
08 Dec 2022
Executing your Commands via Motion Diffusion in Latent Space
Xin Chen
Biao Jiang
Wen Liu
Zilong Huang
Bin-Bin Fu
Tao Chen
Jingyi Yu
Gang Yu
VGen
DiffM
94
360
0
08 Dec 2022
EDGE: Editable Dance Generation From Music
Jonathan Tseng
Rodrigo Castellon
C. Karen Liu
91
236
0
19 Nov 2022
Listen, Denoise, Action! Audio-Driven Motion Synthesis with Diffusion Models
Simon Alexanderson
Rajmund Nagy
Jonas Beskow
G. Henter
DiffM
VGen
70
172
0
17 Nov 2022
Rhythmic Gesticulator: Rhythm-Aware Co-Speech Gesture Synthesis with Hierarchical Neural Embeddings
Tenglong Ao
Qingzhe Gao
Yuke Lou
Baoquan Chen
Libin Liu
SLR
60
63
0
04 Oct 2022
ZeroEGGS: Zero-shot Example-based Gesture Generation from Speech
Saeed Ghorbani
Ylva Ferstl
Daniel Holden
N. Troje
M. Carbonneau
73
82
0
15 Sep 2022
TEACH: Temporal Action Composition for 3D Humans
Nikos Athanasiou
Mathis Petrovich
Michael J. Black
Gül Varol
130
147
0
09 Sep 2022
The GENEA Challenge 2022: A large evaluation of data-driven co-speech gesture generation
Youngwoo Yoon
Pieter Wolfert
Taras Kucherenko
Carla Viegas
Teodor Nikolov
Mihail Tsakov
G. Henter
VGen
57
81
0
22 Aug 2022
BRACE: The Breakdancing Competition Dataset for Dance Motion Synthesis
Davide Moltisanti
Jinyi Wu
Bo Dai
Chen Change Loy
DiffM
56
4
0
20 Jul 2022
GANimator: Neural Motion Synthesis from a Single Sequence
Peizhuo Li
Kfir Aberman
Zihan Zhang
Rana Hanocka
O. Sorkine-Hornung
GAN
39
33
0
05 May 2022
TEMOS: Generating diverse human motions from textual descriptions
Mathis Petrovich
Michael J. Black
Gül Varol
105
387
0
25 Apr 2022
BEAT: A Large-Scale Semantic and Emotional Multi-Modal Dataset for Conversational Gestures Synthesis
Haiyang Liu
Zihao Zhu
Naoya Iwamoto
Yichen Peng
Zhengqing Li
You Zhou
E. Bozkurt
Bo Zheng
SLR
CVBM
57
142
0
10 Mar 2022
Motron: Multimodal Probabilistic Human Motion Forecasting
Tim Salzmann
Marco Pavone
Markus Ryll
3DH
72
30
0
08 Mar 2022
Conditional Motion In-betweening
Jihoon Kim
Taehyun Byun
Seungyoung Shin
Jungdam Won
Sungjoon Choi
56
32
0
09 Feb 2022
High-Resolution Image Synthesis with Latent Diffusion Models
Robin Rombach
A. Blattmann
Dominik Lorenz
Patrick Esser
Björn Ommer
3DV
422
15,515
0
20 Dec 2021
Audio2Gestures: Generating Diverse Gestures from Speech Audio with Conditional Variational Autoencoders
Jing Li
Di Kang
Wenjie Pei
Xuefei Zhe
Ying Zhang
Zhenyu He
Linchao Bao
SLR
69
106
0
15 Aug 2021
Speech2AffectiveGestures: Synthesizing Co-Speech Gestures with Generative Adversarial Affective Expression Learning
Uttaran Bhattacharya
Elizabeth Childs
Nicholas Rewkowski
Tianyi Zhou
SLR
GAN
111
83
0
31 Jul 2021
Action-Conditioned 3D Human Motion Synthesis with Transformer VAE
Mathis Petrovich
Michael J. Black
Gül Varol
ViT
96
503
0
12 Apr 2021
AST: Audio Spectrogram Transformer
Yuan Gong
Yu-An Chung
James R. Glass
ViT
116
865
0
05 Apr 2021
Improved Denoising Diffusion Probabilistic Models
Alex Nichol
Prafulla Dhariwal
DiffM
337
3,686
0
18 Feb 2021
Learning Speech-driven 3D Conversational Gestures from Video
I. Habibie
Weipeng Xu
Dushyant Mehta
Lingjie Liu
Hans-Peter Seidel
Gerard Pons-Moll
Mohamed A. Elgharib
Christian Theobalt
SLR
CVBM
3DH
78
110
0
13 Feb 2021
Robust Motion In-betweening
Félix G. Harvey
Mike Yurick
Derek Nowrouzezahrai
C. Pal
VGen
73
259
0
09 Feb 2021
PSLA: Improving Audio Tagging with Pretraining, Sampling, Labeling, and Aggregation
Yuan Gong
Yu-An Chung
James R. Glass
VLM
166
147
0
02 Feb 2021
Text2Gestures: A Transformer-Based Network for Generating Emotive Body Gestures for Virtual Agents
Uttaran Bhattacharya
Nicholas Rewkowski
A. Banerjee
P. Guhan
Aniket Bera
Tianyi Zhou
LM&Ro
67
153
0
26 Jan 2021
Training data-efficient image transformers & distillation through attention
Hugo Touvron
Matthieu Cord
Matthijs Douze
Francisco Massa
Alexandre Sablayrolles
Hervé Jégou
ViT
384
6,768
0
23 Dec 2020
Denoising Diffusion Implicit Models
Jiaming Song
Chenlin Meng
Stefano Ermon
VLM
DiffM
278
7,384
0
06 Oct 2020
DiffWave: A Versatile Diffusion Model for Audio Synthesis
Zhifeng Kong
Ming-Yu Liu
Jiaji Huang
Kexin Zhao
Bryan Catanzaro
DiffM
BDL
155
1,457
0
21 Sep 2020
Speech Gesture Generation from the Trimodal Context of Text, Audio, and Speaker Identity
Youngwoo Yoon
Bok Cha
Joo-Haeng Lee
Minsu Jang
Jaeyeon Lee
Jaehong Kim
Geehyuk Lee
46
283
0
04 Sep 2020
Action2Motion: Conditioned Generation of 3D Human Motions
Chuan Guo
Wei Ji
Sen Wang
Shihao Zou
Qingyao Sun
Annan Deng
Minglun Gong
Li Cheng
68
419
0
30 Jul 2020
Rethinking CNN Models for Audio Classification
Kamalesh Palanisamy
Dipika Singhania
Angela Yao
SSL
67
145
0
22 Jul 2020
End-to-End Object Detection with Transformers
Nicolas Carion
Francisco Massa
Gabriel Synnaeve
Nicolas Usunier
Alexander Kirillov
Sergey Zagoruyko
ViT
3DV
PINN
402
13,048
0
26 May 2020
Generative Tweening: Long-term Inbetweening of 3D Human Motions
Yi Zhou
Jingwan Lu
Connelly Barnes
Jimei Yang
Sitao Xiang
Hao Li
GAN
3DH
61
44
0
18 May 2020
ESResNet: Environmental Sound Classification Based on Visual Domain Models
A. Guzhov
Federico Raue
Jörn Hees
Andreas Dengel
VLM
104
92
0
15 Apr 2020
Learning Individual Styles of Conversational Gesture
Shiry Ginosar
Amir Bar
Gefen Kohavi
Caroline Chan
Andrew Owens
Jitendra Malik
SLR
45
332
0
10 Jun 2019
MoGlow: Probabilistic and controllable motion synthesis using normalising flows
G. Henter
Simon Alexanderson
Jonas Beskow
63
98
0
16 May 2019
Expressive Body Capture: 3D Hands, Face, and Body from a Single Image
Georgios Pavlakos
Vasileios Choutas
N. Ghorbani
Timo Bolkart
Ahmed A. A. Osman
Dimitrios Tzionas
Michael J. Black
3DH
52
1,717
0
11 Apr 2019
Context-aware Human Motion Prediction
Enric Corona
Albert Pumarola
Guillem Alenyà
Francesc Moreno-Noguer
3DH
90
142
0
06 Apr 2019
On the Continuity of Rotation Representations in Neural Networks
Yi Zhou
Connelly Barnes
Jingwan Lu
Jimei Yang
Hao Li
3DH
73
1,289
0
17 Dec 2018
Robots Learn Social Skills: End-to-End Learning of Co-Speech Gesture Generation for Humanoid Robots
Youngwoo Yoon
Woo-Ri Ko
Minsu Jang
Jaeyeon Lee
Jaehong Kim
Geehyuk Lee
SLR
49
231
0
30 Oct 2018
HP-GAN: Probabilistic 3D human motion prediction via GAN
Emad Barsoum
J. Kender
Zicheng Liu
3DH
86
331
0
27 Nov 2017
Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization
Xun Huang
Serge J. Belongie
OOD
179
4,364
0
20 Mar 2017