ResearchTrend.AI
Taming Diffusion Models for Audio-Driven Co-Speech Gesture Generation

16 March 2023
Lingting Zhu
Xian Liu
Xuanyu Liu
Rui Qian
Ziwei Liu
Lequan Yu

Papers citing "Taming Diffusion Models for Audio-Driven Co-Speech Gesture Generation"

50 / 86 papers shown
Title
Text-driven Motion Generation: Overview, Challenges and Directions
Ali Rida Sahili
Najett Neji
Hedi Tabia
VGen
38
0
0
14 May 2025
Inter-Diffusion Generation Model of Speakers and Listeners for Effective Communication
Jinhe Huang
Yongkang Cheng
Yuming Hang
Gaoge Han
J. Li
Jing Zhang
Xingjian Gu
43
0
0
08 May 2025
Co$^{3}$Gesture: Towards Coherent Concurrent Co-speech 3D Gesture Generation with Interactive Diffusion
Xingqun Qi
Yatian Wang
Hengyuan Zhang
J. Pan
Wei Xue
Shanghang Zhang
Wenhan Luo
Qifeng Liu
Yike Guo
SLR
66
0
0
03 May 2025
GENMO: A GENeralist Model for Human MOtion
Jiefeng Li
Jinkun Cao
Haotian Zhang
Davis Rempe
Jan Kautz
Umar Iqbal
Ye Yuan
DiffM
VGen
56
1
0
02 May 2025
EchoMask: Speech-Queried Attention-based Mask Modeling for Holistic Co-Speech Motion Generation
Xiangyue Zhang
Jianfang Li
Jiaxu Zhang
Jianqiang Ren
Liefeng Bo
Zhigang Tu
30
0
0
12 Apr 2025
EasyGenNet: An Efficient Framework for Audio-Driven Gesture Video Generation Based on Diffusion Model
Renda Li
Xiaohua Qi
Q. Ling
Jun Yu
Ziyi Chen
Peng Chang
Mei Han
Jing Xiao
DiffM
VGen
48
0
0
11 Apr 2025
ReCoM: Realistic Co-Speech Motion Generation with Recurrent Embedded Transformer
Yong Xie
Yunlian Sun
Hongwen Zhang
Y. Liu
Jinhui Tang
VGen
98
0
0
27 Mar 2025
AudCast: Audio-Driven Human Video Generation by Cascaded Diffusion Transformers
Jiazhi Guan
Kaisiyuan Wang
Zhiliang Xu
Quanwei Yang
Yasheng Sun
...
Errui Ding
J. Wang
Youjian Zhao
Hang Zhou
Ziwei Liu
VGen
44
0
0
25 Mar 2025
DIDiffGes: Decoupled Semi-Implicit Diffusion Models for Real-time Gesture Generation from Speech
Yongkang Cheng
Shaoli Huang
Xuelin Chen
J. Ning
Biwei Huang
DiffM
55
1
0
21 Mar 2025
MAG: Multi-Modal Aligned Autoregressive Co-Speech Gesture Generation without Vector Quantization
Binjie Liu
Lina Liu
Sanyi Zhang
Songen Gu
Yihao Zhi
Tianyi Zhu
Lei Yang
Long Ye
SLR
76
0
0
18 Mar 2025
ChainHOI: Joint-based Kinematic Chain Modeling for Human-Object Interaction Generation
Ling-an Zeng
Guohong Huang
Yi-Lin Wei
Shengbo Gu
Yu-Ming Tang
Jingke Meng
Wei-Shi Zheng
59
2
0
17 Mar 2025
Cosh-DiT: Co-Speech Gesture Video Synthesis via Hybrid Audio-Visual Diffusion Transformers
Yasheng Sun
Zhiliang Xu
Hang Zhou
Jiazhi Guan
Quanwei Yang
...
Yingying Li
Haocheng Feng
J. Wang
Ziwei Liu
Koike Hideki
VGen
61
0
0
13 Mar 2025
Streaming Generation of Co-Speech Gestures via Accelerated Rolling Diffusion
Evgeniia Vu
Andrei Boiarov
Dmitry Vetrov
VGen
50
0
0
13 Mar 2025
HERO: Human Reaction Generation from Videos
Chengjun Yu
Wei-dong Zhai
Yuhang Yang
Yang Cao
Zheng-jun Zha
VGen
56
0
0
11 Mar 2025
ExGes: Expressive Human Motion Retrieval and Modulation for Audio-Driven Gesture Synthesis
Xukun Zhou
Fengxin Li
Ming Chen
Yan Zhou
Pengfei Wan
Di Zhang
Yeying Jin
Zhaoxin Fan
Hongyan Liu
Jun He
DiffM
VGen
51
0
0
09 Mar 2025
FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion
Ziyi Yang
Fanqi Wan
Longguang Zhong
Canbin Huang
Guosheng Liang
Xiaojun Quan
MoMe
92
1
0
06 Mar 2025
HOP: Heterogeneous Topology-based Multimodal Entanglement for Co-Speech Gesture Generation
Hongye Cheng
Tianyu Wang
Guangsi Shi
Zexing Zhao
Yanwei Fu
SLR
47
1
0
03 Mar 2025
BGM2Pose: Active 3D Human Pose Estimation with Non-Stationary Sounds
Yuto Shibata
Yusuke Oumi
Go Irie
Akisato Kimura
Yoshimitsu Aoki
Mariko Isogawa
29
0
0
01 Mar 2025
ARTalk: Speech-Driven 3D Head Animation via Autoregressive Model
Xuangeng Chu
Nabarun Goswami
Ziteng Cui
Hanqin Wang
Tatsuya Harada
DiffM
80
0
0
27 Feb 2025
OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models
Gaojie Lin
Jianwen Jiang
Jiaqi Yang
Zerong Zheng
Chao Liang
DiffM
VGen
185
11
0
03 Feb 2025
Joint Co-Speech Gesture and Expressive Talking Face Generation using Diffusion with Adapters
S. Hogue
Chenxu Zhang
Yapeng Tian
Xiaohu Guo
DiffM
76
0
0
18 Dec 2024
SyncViolinist: Music-Oriented Violin Motion Generation Based on Bowing and Fingering
Hiroki Nishizawa
Keitaro Tanaka
Asuka Hirata
Shugo Yamaguchi
Qi Feng
Masatoshi Hamanaka
Shigeo Morishima
67
0
0
11 Dec 2024
OpenHumanVid: A Large-Scale High-Quality Dataset for Enhancing Human-Centric Video Generation
Hui Li
Mingwang Xu
Yun Zhan
Shan Mu
Jiaye Li
...
Y. Chen
Tan Chen
Mao Ye
Jingdong Wang
Siyu Zhu
VGen
102
2
0
28 Nov 2024
A Survey of Recent Advances and Challenges in Deep Audio-Visual Correlation Learning
Luis Vilaca
Yi Yu
Paula Viana
75
0
0
24 Nov 2024
MoMu-Diffusion: On Learning Long-Term Motion-Music Synchronization and Correspondence
Fuming You
Minghui Fang
Li Tang
Rongjie Huang
Yongqi Wang
Zhou Zhao
18
2
0
04 Nov 2024
Stereo-Talker: Audio-driven 3D Human Synthesis with Prior-Guided Mixture-of-Experts
Xiang Deng
Youxin Pang
Xiaochen Zhao
Chao Xu
Lizhen Wang
Hongjiang Xiao
Shi Yan
Hongwen Zhang
Yebin Liu
DiffM
VGen
40
1
0
31 Oct 2024
Towards a GENEA Leaderboard -- an Extended, Living Benchmark for Evaluating and Advancing Conversational Motion Synthesis
Rajmund Nagy
Hendric Voss
Youngwoo Yoon
Taras Kucherenko
Teodor Nikolov
Thanh Hoang-Minh
R. Mcdonnell
Stefan Kopp
Michael Neff
G. Henter
31
1
0
08 Oct 2024
TANGO: Co-Speech Gesture Video Reenactment with Hierarchical Audio Motion Embedding and Diffusion Interpolation
Haiyang Liu
Xingchao Yang
Tomoya Akiyama
Yuantian Huang
Qiaoge Li
Shigeru Kuriyama
Takafumi Taketomi
VGen
SLR
22
7
0
05 Oct 2024
Enabling Synergistic Full-Body Control in Prompt-Based Co-Speech Motion Generation
Bohong Chen
Yumeng Li
Yao-Xiang Ding
Tianjia Shao
Kun Zhou
37
7
0
01 Oct 2024
Generation of Complex 3D Human Motion by Temporal and Spatial Composition of Diffusion Models
Lorenzo Mandelli
Stefano Berretti
DiffM
39
2
0
18 Sep 2024
2D or not 2D: How Does the Dimensionality of Gesture Representation Affect 3D Co-Speech Gesture Generation?
Teo Guichoux
Laure Soulier
Nicolas Obin
Catherine Pelachaud
SLR
32
0
0
16 Sep 2024
DiffTED: One-shot Audio-driven TED Talk Video Generation with Diffusion-based Co-speech Gestures
S. Hogue
Chenxu Zhang
Hamza Daruger
Yapeng Tian
Xiaohu Guo
VGen
43
10
0
11 Sep 2024
CyberHost: Taming Audio-driven Avatar Diffusion Model with Region Codebook Attention
Gaojie Lin
Jianwen Jiang
Chao Liang
Tianyun Zhong
Jiaqi Yang
Yanbo Zheng
VGen
DiffM
66
13
0
03 Sep 2024
Combo: Co-speech holistic 3D human motion generation and efficient customizable adaptation in harmony
Chao Xu
Mingze Sun
Zhi-Qi Cheng
Fei-Yue Wang
Yang Liu
Baigui Sun
Ruqi Huang
Alexander G. Hauptmann
VGen
45
2
0
18 Aug 2024
MDT-A2G: Exploring Masked Diffusion Transformers for Co-Speech Gesture Generation
Xiaofeng Mao
Zhengkai Jiang
Qilin Wang
Chencan Fu
Jiangning Zhang
Jiafu Wu
Yabiao Wang
Chengjie Wang
Wei Li
Mingmin Chi
80
4
0
06 Aug 2024
MambaGesture: Enhancing Co-Speech Gesture Generation with Mamba and Disentangled Multi-Modality Fusion
Chencan Fu
Yabiao Wang
Jiangning Zhang
Zhengkai Jiang
Xiaofeng Mao
Jiafu Wu
Weijian Cao
Chengjie Wang
Yanhao Ge
Yong Liu
Mamba
43
2
0
29 Jul 2024
Investigating the impact of 2D gesture representation on co-speech gesture generation
Teo Guichoux
Laure Soulier
Nicolas Obin
Catherine Pelachaud
SLR
19
0
0
21 Jun 2024
Holistic-Motion2D: Scalable Whole-body Human Motion Generation in 2D Space
Yuan Wang
Zhao Wang
Junhao Gong
Di Huang
Tong He
...
J. Jiao
Xuetao Feng
Qi Dou
Shixiang Tang
Dan Xu
46
3
0
17 Jun 2024
CoCoGesture: Toward Coherent Co-speech 3D Gesture Generation in the Wild
Xingqun Qi
Hengyuan Zhang
Yatian Wang
J. Pan
Chen Liu
...
Qixun Zhang
Shanghang Zhang
Wenhan Luo
Qifeng Liu
Qi-fei Liu
DiffM
SLR
110
5
0
27 May 2024
SIGGesture: Generalized Co-Speech Gesture Synthesis via Semantic Injection with Large-Scale Pre-Training Diffusion Models
Qingrong Cheng
Xu Li
Xinghui Fu
DiffM
31
2
0
22 May 2024
Fake it to make it: Using synthetic data to remedy the data shortage in joint multimodal speech-and-gesture synthesis
Shivam Mehta
Anna Deichler
Jim O'Regan
Birger Moëll
Jonas Beskow
G. Henter
Simon Alexanderson
46
4
0
30 Apr 2024
Bridge to Non-Barrier Communication: Gloss-Prompted Fine-grained Cued Speech Gesture Generation with Diffusion Model
Wen-Ling Lei
Li Liu
Jun Wang
DiffM
29
2
0
30 Apr 2024
in2IN: Leveraging individual Information to Generate Human INteractions
Pablo Ruiz-Ponce
Germán Barquero
Cristina Palmero
Sergio Escalera
Jose J. García Rodríguez
VGen
DiffM
51
7
0
15 Apr 2024
A Unified Editing Method for Co-Speech Gesture Generation via Diffusion Inversion
Zeyu Zhao
Nan Gao
Zhi Zeng
Guixuan Zhang
Jie Liu
Shuwu Zhang
DiffM
41
0
0
03 Apr 2024
Towards Variable and Coordinated Holistic Co-Speech Motion Generation
Yifei Liu
Qiong Cao
Yandong Wen
Huaiguang Jiang
Changxing Ding
SLR
63
13
0
30 Mar 2024
Beyond Talking -- Generating Holistic 3D Human Dyadic Motion for Communication
Mingze Sun
Chao Xu
Xinyu Jiang
Yang Liu
Baigui Sun
Ruqi Huang
49
3
0
28 Mar 2024
ConvoFusion: Multi-Modal Conversational Diffusion for Co-Speech Gesture Synthesis
Muhammad Hamza Mughal
Rishabh Dabral
I. Habibie
Lucia Donatelli
Marc Habermann
Christian Theobalt
SLR
38
15
0
26 Mar 2024
CoMo: Controllable Motion Generation through Language Guided Pose Code Editing
Yiming Huang
Weilin Wan
Yue Yang
Chris Callison-Burch
Mark Yatskar
Lingjie Liu
39
22
0
20 Mar 2024
Generative Enhancement for 3D Medical Images
Lingting Zhu
Noel Codella
Dongdong Chen
Zhenchao Jin
Lu Yuan
Lequan Yu
DiffM
MedIm
42
10
0
19 Mar 2024
Speech-driven Personalized Gesture Synthetics: Harnessing Automatic Fuzzy Feature Inference
Fan Zhang
Zhaohan Wang
Xin Lyu
Siyuan Zhao
Mengjian Li
...
Naye Ji
Hui Du
Fuxing Gao
Hao Wu
Shunman Li
VGen
43
3
0
16 Mar 2024