ResearchTrend.AI
AVI-Talking: Learning Audio-Visual Instructions for Expressive 3D Talking Face Generation

25 February 2024
Yasheng Sun
Wenqing Chu
Hang Zhou
Kaisiyuan Wang
Hideki Koike

Papers citing "AVI-Talking: Learning Audio-Visual Instructions for Expressive 3D Talking Face Generation"

20 / 20 papers shown
Modular Conversational Agents for Surveys and Interviews
Jiangbo Yu
Jinhua Zhao
Luis Miranda-Moreno
Matthew Korp
128
0
0
22 Dec 2024
EmoTalk: Speech-Driven Emotional Disentanglement for 3D Face Animation
Ziqiao Peng
Hao Wu
Zhenbo Song
Hao-Xuan Xu
Xiangyu Zhu
Jun He
Hongyan Liu
Zhaoxin Fan
CVBM
63
104
0
20 Mar 2023
Masked Lip-Sync Prediction by Audio-Visual Contextual Exploitation in Transformers
Yasheng Sun
Hang Zhou
Kaisiyuan Wang
Qianyi Wu
Zhibin Hong
Jingtuo Liu
Errui Ding
Jingdong Wang
Ziwei Liu
Hideki Koike
46
34
0
09 Dec 2022
Talking Head Generation with Probabilistic Audio-to-Visual Diffusion Priors
Zhentao Yu
Zixin Yin
Deyu Zhou
Duomin Wang
Finn Wong
Baoyuan Wang
DiffM
52
37
0
07 Dec 2022
SPACE: Speech-driven Portrait Animation with Controllable Expression
Francesco Ferroni
Arun Mallya
Ting-Chun Wang
Rafael Valle
Xuan Li
VGen
49
47
0
17 Nov 2022
Human Motion Diffusion Model
Guy Tevet
Sigal Raab
Brian Gordon
Yonatan Shafir
Daniel Cohen-Or
Amit H. Bermano
DiffM
VGen
250
737
0
29 Sep 2022
Emotion-Controllable Generalized Talking Face Generation
Sanjana Sinha
S. Biswas
Ravindra Yadav
Brojeshwar Bhowmick
CVBM
36
51
0
02 May 2022
Flamingo: a Visual Language Model for Few-Shot Learning
Jean-Baptiste Alayrac
Jeff Donahue
Pauline Luc
Antoine Miech
Iain Barr
...
Mikolaj Binkowski
Ricardo Barreira
Oriol Vinyals
Andrew Zisserman
Karen Simonyan
MLLM
VLM
283
3,458
0
29 Apr 2022
EMOCA: Emotion Driven Monocular Face Capture and Animation
Radek Daněček
Michael J. Black
Timo Bolkart
CVBM
3DH
76
204
0
24 Apr 2022
Hierarchical Text-Conditional Image Generation with CLIP Latents
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
VLM
DiffM
282
6,768
0
13 Apr 2022
BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis
Max W. Y. Lam
Jun Wang
Dan Su
Dong Yu
DiffM
58
94
0
25 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
582
9,009
0
28 Jan 2022
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units
Wei-Ning Hsu
Benjamin Bolte
Yao-Hung Hubert Tsai
Kushal Lakhotia
Ruslan Salakhutdinov
Abdel-rahman Mohamed
SSL
127
2,879
0
14 Jun 2021
Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation
Hang Zhou
Yasheng Sun
Wayne Wu
Chen Change Loy
Xiaogang Wang
Ziwei Liu
CVBM
79
363
0
22 Apr 2021
MeshTalk: 3D Face Animation from Speech using Cross-Modality Disentanglement
Alexander Richard
Michael Zollhoefer
Yandong Wen
Fernando de la Torre
Yaser Sheikh
CVBM
56
199
0
16 Apr 2021
Talking-head Generation with Rhythmic Head Motion
Lele Chen
Guofeng Cui
Celong Liu
Zhong Li
Ziyi Kou
Yi Tian Xu
Chenliang Xu
41
182
0
16 Jul 2020
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations
Alexei Baevski
Henry Zhou
Abdel-rahman Mohamed
Michael Auli
SSL
179
5,734
0
20 Jun 2020
PyTorch: An Imperative Style, High-Performance Deep Learning Library
Adam Paszke
Sam Gross
Francisco Massa
Adam Lerer
James Bradbury
...
Sasank Chilamkurthy
Benoit Steiner
Lu Fang
Junjie Bai
Soumith Chintala
ODL
271
42,038
0
03 Dec 2019
Realistic Speech-Driven Facial Animation with GANs
Konstantinos Vougioukas
Stavros Petridis
Maja Pantic
111
291
0
14 Jun 2019
Describing like humans: on diversity in image captioning
Qingzhong Wang
Antoni B. Chan
43
99
0
28 Mar 2019