2505.15197
Cited By
Intentional Gesture: Deliver Your Intentions with Gestures for Speech
21 May 2025
Pinxin Liu
Haiyang Liu
Luchuan Song
Chenliang Xu
SLR
Papers citing "Intentional Gesture: Deliver Your Intentions with Gestures for Speech"
18 / 18 papers shown
Title
MMPerspective: Do MLLMs Understand Perspective? A Comprehensive Benchmark for Perspective Perception, Reasoning, and Robustness
Yunlong Tang
Pinxin Liu
Mingqian Feng
Zhangyun Tan
Rui Mao
...
Hang Hua
Ali Vosoughi
Luchuan Song
Zeliang Zhang
Chenliang Xu
LRM
46
0
0
26 May 2025
Contextual Gesture: Co-Speech Gesture Video Generation through Context-aware Gesture Representation
Pinxin Liu
Pengfei Zhang
Hyeongwoo Kim
Pablo Garrido
Ari Shapiro
Kyle Olszewski
SLR
58
5
0
11 Feb 2025
Generative AI for Cel-Animation: A Survey
Yunlong Tang
Junjia Guo
Pinxin Liu
Zhiyuan Wang
Hang Hua
...
Jing Bi
Mingqian Feng
Xuzhao Li
Zeliang Zhang
Chenliang Xu
VGen
133
7
0
08 Jan 2025
Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
Sihyun Yu
Sangkyung Kwak
Huiwon Jang
Jongheon Jeong
Jonathan Huang
Jinwoo Shin
Saining Xie
OCL
157
101
0
09 Oct 2024
DiffSHEG: A Diffusion-Based Approach for Real-Time Speech-driven Holistic 3D Expression and Gesture Generation
Junming Chen
Yunfei Liu
Jianan Wang
Ailing Zeng
Yu Li
Qifeng Chen
VGen
70
32
0
09 Jan 2024
WhisperX: Time-Accurate Speech Transcription of Long-Form Audio
Max Bain
Jaesung Huh
Tengda Han
Andrew Zisserman
103
241
0
01 Mar 2023
Scalable Diffusion Models with Transformers
William S. Peebles
Saining Xie
GNN
118
2,418
0
19 Dec 2022
MAGVIT: Masked Generative Video Transformer
Lijun Yu
Yong Cheng
Kihyuk Sohn
José Lezama
Han Zhang
...
Alexander G. Hauptmann
Ming-Hsuan Yang
Yuan Hao
Irfan Essa
Lu Jiang
DiffM
VGen
77
248
0
10 Dec 2022
Generating Holistic 3D Human Motion from Speech
Hongwei Yi
Hualin Liang
Yifei Liu
Qiong Cao
Yandong Wen
Timo Bolkart
Dacheng Tao
Michael J. Black
SLR
75
149
0
08 Dec 2022
Learning Hierarchical Cross-Modal Association for Co-Speech Gesture Generation
Xian Liu
Qianyi Wu
Hang Zhou
Yinghao Xu
Rui Qian
Xinyi Lin
Xiaowei Zhou
Wayne Wu
Bo Dai
Bolei Zhou
SLR
84
105
0
24 Mar 2022
BEAT: A Large-Scale Semantic and Emotional Multi-Modal Dataset for Conversational Gestures Synthesis
Haiyang Liu
Zihao Zhu
Naoya Iwamoto
Yichen Peng
Zhengqing Li
You Zhou
E. Bozkurt
Bo Zheng
SLR
CVBM
63
142
0
10 Mar 2022
MaskGIT: Masked Generative Image Transformer
Huiwen Chang
Han Zhang
Lu Jiang
Ce Liu
William T. Freeman
ViT
153
695
0
08 Feb 2022
High-Resolution Image Synthesis with Latent Diffusion Models
Robin Rombach
A. Blattmann
Dominik Lorenz
Patrick Esser
Bjorn Ommer
3DV
485
15,734
0
20 Dec 2021
Audio2Gestures: Generating Diverse Gestures from Speech Audio with Conditional Variational Autoencoders
Jing Li
Di Kang
Wenjie Pei
Xuefei Zhe
Ying Zhang
Zhenyu He
Linchao Bao
SLR
75
106
0
15 Aug 2021
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
670
41,430
0
22 Oct 2020
Speech Gesture Generation from the Trimodal Context of Text, Audio, and Speaker Identity
Youngwoo Yoon
Bok Cha
Joo-Haeng Lee
Minsu Jang
Jaeyeon Lee
Jaehong Kim
Geehyuk Lee
51
283
0
04 Sep 2020
Neural Discrete Representation Learning
Aaron van den Oord
Oriol Vinyals
Koray Kavukcuoglu
BDL
SSL
OCL
230
5,071
0
02 Nov 2017
An Introduction to Convolutional Neural Networks
K. O’Shea
Ryan Nash
FaML
HAI
82
3,159
0
26 Nov 2015