2405.16874
Cited By
v1
v2
v3 (latest)
CoCoGesture: Toward Coherent Co-speech 3D Gesture Generation in the Wild
27 May 2024
Xingqun Qi
Hengyuan Zhang
Yatian Wang
J. Pan
Chen Liu
Peng Li
Xiaowei Chi
Mengfei Li
Qixun Zhang
Shanghang Zhang
Wenhan Luo
Qifeng Liu
Qi-fei Liu
DiffM
SLR
Papers citing
"CoCoGesture: Toward Coherent Co-speech 3D Gesture Generation in the Wild"
50 / 54 papers shown
Title
PhysiInter: Integrating Physical Mapping for High-Fidelity Human Interaction Generation
Wei Yao
Yunlian Sun
Chang Liu
Hongwen Zhang
Jinhui Tang
26
0
0
09 Jun 2025
Co$^{3}$Gesture: Towards Coherent Concurrent Co-speech 3D Gesture Generation with Interactive Diffusion
Xingqun Qi
Yatian Wang
Hengyuan Zhang
J. Pan
Wei Xue
Shanghang Zhang
Wenhan Luo
Qifeng Liu
Yike Guo
SLR
131
0
0
03 May 2025
VividListener: Expressive and Controllable Listener Dynamics Modeling for Multi-Modal Responsive Interaction
Shiying Li
Xingqun Qi
Bingkun Yang
Chen Weile
Zezhao Tian
Muyi Sun
Qifeng Liu
Man Zhang
Zhenan Sun
122
0
0
30 Apr 2025
ReCoM: Realistic Co-Speech Motion Generation with Recurrent Embedded Transformer
Yong Xie
Yunlian Sun
Hongwen Zhang
Yebin Liu
Jinhui Tang
VGen
149
0
0
27 Mar 2025
SIGGesture: Generalized Co-Speech Gesture Synthesis via Semantic Injection with Large-Scale Pre-Training Diffusion Models
Qingrong Cheng
Xu Li
Xinghui Fu
DiffM
85
2
0
22 May 2024
Era3D: High-Resolution Multiview Diffusion using Efficient Row-wise Attention
Peng Li
Yuan Liu
Xiaoxiao Long
Feihu Zhang
Cheng Lin
...
Wenhan Luo
Ping Tan
Wenping Wang
Qi-fei Liu
Yi-Ting Guo
VGen
150
51
0
19 May 2024
Towards Variable and Coordinated Holistic Co-Speech Motion Generation
Yifei Liu
Qiong Cao
Yandong Wen
Huaiguang Jiang
Changxing Ding
SLR
122
17
0
30 Mar 2024
EMO: Emote Portrait Alive -- Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
Linrui Tian
Qi Wang
Bang Zhang
Liefeng Bo
DiffM
127
126
0
27 Feb 2024
DiffSHEG: A Diffusion-Based Approach for Real-Time Speech-driven Holistic 3D Expression and Gesture Generation
Junming Chen
Yunfei Liu
Jianan Wang
Ailing Zeng
Yu Li
Qifeng Chen
VGen
105
32
0
09 Jan 2024
OMG: Towards Open-vocabulary Motion Generation via Mixture of Controllers
Hanming Liang
Jiacheng Bao
Ruichi Zhang
Sihan Ren
Yuecheng Xu
Sibei Yang
Xin Chen
Jingyi Yu
Lan Xu
102
26
0
14 Dec 2023
Weakly-Supervised Emotion Transition Learning for Diverse 3D Co-speech Gesture Generation
Xingqun Qi
Jiahao Pan
Peng Li
Ruibin Yuan
Xiaowei Chi
...
Wenhan Luo
Wei Xue
Shanghang Zhang
Qi-fei Liu
Yi-Ting Guo
SLR
103
13
0
29 Nov 2023
From Sparse to Soft Mixtures of Experts
J. Puigcerver
C. Riquelme
Basil Mustafa
N. Houlsby
MoE
201
130
0
02 Aug 2023
Audio-Visual Segmentation by Exploring Cross-Modal Mutual Semantics
Chen Liu
Peike Li
Xingqun Qi
Hu Zhang
Lincheng Li
Dadong Wang
Xin Yu
VOS
91
34
0
31 Jul 2023
EmotionGesture: Audio-Driven Diverse Emotional Co-Speech 3D Gesture Generation
Xingqun Qi
Chen Liu
Lincheng Li
Jie Hou
Haoran Xin
Xin Yu
SLR
93
30
0
30 May 2023
Mixture-of-Experts Meets Instruction Tuning: A Winning Combination for Large Language Models
Sheng Shen
Le Hou
Yan-Quan Zhou
Nan Du
Shayne Longpre
...
Vincent Zhao
Hongkun Yu
Kurt Keutzer
Trevor Darrell
Denny Zhou
ALM
MoE
107
60
0
24 May 2023
FunASR: A Fundamental End-to-End Speech Recognition Toolkit
Zhifu Gao
Zerui Li
Jiaming Wang
Haoneng Luo
Xian Shi
...
Yabin Li
Lingyun Zuo
Zhihao Du
Zhangyu Xiao
Shiliang Zhang
91
67
0
18 May 2023
DiffuseStyleGesture: Stylized Audio-Driven Co-Speech Gesture Generation with Diffusion Models
Sicheng Yang
Zhiyong Wu
Minglei Li
Zhensong Zhang
Lei Hao
Weihong Bao
Ming Cheng
Long Xiao
76
71
0
08 May 2023
GestureDiffuCLIP: Gesture Diffusion Model with CLIP Latents
Tenglong Ao
Zeyi Zhang
Libin Liu
DiffM
VGen
144
152
0
26 Mar 2023
Taming Diffusion Models for Audio-Driven Co-Speech Gesture Generation
Lingting Zhu
Xian Liu
Xuanyu Liu
Rui Qian
Ziwei Liu
Lequan Yu
85
120
0
16 Mar 2023
Scaling Vision-Language Models with Sparse Mixture of Experts
Sheng Shen
Z. Yao
Chunyuan Li
Trevor Darrell
Kurt Keutzer
Yuxiong He
VLM
MoE
77
68
0
13 Mar 2023
Diverse 3D Hand Gesture Prediction from Body Dynamics by Bilateral Hand Disentanglement
Xingqun Qi
Chen Liu
Muyi Sun
Lincheng Li
Changjie Fan
Xin Yu
SLR
111
15
0
03 Mar 2023
WhisperX: Time-Accurate Speech Transcription of Long-Form Audio
Max Bain
Jaesung Huh
Tengda Han
Andrew Zisserman
151
243
0
01 Mar 2023
Adding Conditional Control to Text-to-Image Diffusion Models
Lvmin Zhang
Anyi Rao
Maneesh Agrawala
AI4CE
384
4,198
1
10 Feb 2023
Scalable Diffusion Models with Transformers
William S. Peebles
Saining Xie
GNN
178
2,441
0
19 Dec 2022
MoFusion: A Framework for Denoising-Diffusion-based Motion Synthesis
Rishabh Dabral
Muhammad Hamza Mughal
Vladislav Golyanik
Christian Theobalt
DiffM
VGen
111
183
0
08 Dec 2022
Generating Holistic 3D Human Motion from Speech
Hongwei Yi
Hualin Liang
Yifei Liu
Qiong Cao
Yandong Wen
Timo Bolkart
Dacheng Tao
Michael J. Black
SLR
105
151
0
08 Dec 2022
Executing your Commands via Motion Diffusion in Latent Space
Xin Chen
Biao Jiang
Wen Liu
Zilong Huang
Bin-Bin Fu
Tao Chen
Jingyi Yu
Gang Yu
VGen
DiffM
205
366
0
08 Dec 2022
Robust Speech Recognition via Large-Scale Weak Supervision
Alec Radford
Jong Wook Kim
Tao Xu
Greg Brockman
C. McLeavey
Ilya Sutskever
OffRL
230
3,780
0
06 Dec 2022
MegaBlocks: Efficient Sparse Training with Mixture-of-Experts
Trevor Gale
Deepak Narayanan
C. Young
Matei A. Zaharia
MoE
81
109
0
29 Nov 2022
Safe Real-World Autonomous Driving by Learning to Predict and Plan with a Mixture of Experts
S. Pini
C. Perone
Aayush Ahuja
Ana Ferreira
Moritz Niendorf
Sergey Zagoruyko
89
38
0
03 Nov 2022
Human Motion Diffusion Model
Guy Tevet
Sigal Raab
Brian Gordon
Yonatan Shafir
Daniel Cohen-Or
Amit H. Bermano
DiffM
VGen
287
771
0
29 Sep 2022
ZeroEGGS: Zero-shot Example-based Gesture Generation from Speech
Saeed Ghorbani
Ylva Ferstl
Daniel Holden
N. Troje
M. Carbonneau
123
83
0
15 Sep 2022
CLIFF: Carrying Location Information in Full Frames into Human Pose and Shape Estimation
Zhihao Li
Jianzhuang Liu
Zhensong Zhang
Songcen Xu
Youliang Yan
3DH
143
225
0
01 Aug 2022
PyMAF-X: Towards Well-aligned Full-body Model Regression from Monocular Images
Hongwen Zhang
Yating Tian
Yuxiang Zhang
Mengcheng Li
Liang An
Zhenan Sun
Yebin Liu
3DH
126
147
0
13 Jul 2022
Learning Hierarchical Cross-Modal Association for Co-Speech Gesture Generation
Xian Liu
Qianyi Wu
Hang Zhou
Yinghao Xu
Rui Qian
Xinyi Lin
Xiaowei Zhou
Wayne Wu
Bo Dai
Bolei Zhou
SLR
112
105
0
24 Mar 2022
BEAT: A Large-Scale Semantic and Emotional Multi-Modal Dataset for Conversational Gestures Synthesis
Haiyang Liu
Zihao Zhu
Naoya Iwamoto
Yichen Peng
Zhengqing Li
You Zhou
E. Bozkurt
Bo Zheng
SLR
CVBM
110
144
0
10 Mar 2022
SmoothNet: A Plug-and-Play Network for Refining Human Poses in Videos
Ailing Zeng
Lei Yang
Xu Ju
Jiefeng Li
Jianyi Wang
Qiang Xu
3DH
92
72
0
27 Dec 2021
High-Resolution Image Synthesis with Latent Diffusion Models
Robin Rombach
A. Blattmann
Dominik Lorenz
Patrick Esser
Bjorn Ommer
3DV
617
15,859
0
20 Dec 2021
Speech2AffectiveGestures: Synthesizing Co-Speech Gestures with Generative Adversarial Affective Expression Learning
Uttaran Bhattacharya
Elizabeth Childs
Nicholas Rewkowski
Tianyi Zhou
SLR
GAN
143
83
0
31 Jul 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
1.1K
30,111
0
26 Feb 2021
Improved Denoising Diffusion Probabilistic Models
Alex Nichol
Prafulla Dhariwal
DiffM
359
3,747
0
18 Feb 2021
Learning Speech-driven 3D Conversational Gestures from Video
I. Habibie
Weipeng Xu
Dushyant Mehta
Lingjie Liu
Hans-Peter Seidel
Gerard Pons-Moll
Mohamed A. Elgharib
Christian Theobalt
SLR
CVBM
3DH
94
111
0
13 Feb 2021
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
W. Fedus
Barret Zoph
Noam M. Shazeer
MoE
160
2,248
0
11 Jan 2021
Denoising Diffusion Implicit Models
Jiaming Song
Chenlin Meng
Stefano Ermon
VLM
DiffM
334
7,539
0
06 Oct 2020
Speech Gesture Generation from the Trimodal Context of Text, Audio, and Speaker Identity
Youngwoo Yoon
Bok Cha
Joo-Haeng Lee
Minsu Jang
Jaeyeon Lee
Jaehong Kim
Geehyuk Lee
79
284
0
04 Sep 2020
Style Transfer for Co-Speech Gesture Animation: A Multi-Speaker Conditional-Mixture Approach
Chaitanya Ahuja
Dong Won Lee
Y. Nakano
Louis-Philippe Morency
51
106
0
24 Jul 2020
In defence of metric learning for speaker recognition
Joon Son Chung
Jaesung Huh
Seongkyu Mun
Minjae Lee
Hee-Soo Heo
Soyeon Choe
Chiheon Ham
Sung-Ye Jung
Bong-Jin Lee
Icksang Han
77
438
0
26 Mar 2020
Expressive Body Capture: 3D Hands, Face, and Body from a Single Image
Georgios Pavlakos
Vasileios Choutas
N. Ghorbani
Timo Bolkart
Ahmed A. A. Osman
Dimitrios Tzionas
Michael J. Black
3DH
104
1,730
0
11 Apr 2019
3D Hand Shape and Pose from Images in the Wild
A. Boukhayma
Rodrigo de Bem
Philip Torr
3DH
97
356
0
09 Feb 2019
On the Continuity of Rotation Representations in Neural Networks
Yi Zhou
Connelly Barnes
Jingwan Lu
Jimei Yang
Hao Li
3DH
99
1,298
0
17 Dec 2018