ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2008.10010
  4. Cited By
A Lip Sync Expert Is All You Need for Speech to Lip Generation In The
  Wild

A Lip Sync Expert Is All You Need for Speech to Lip Generation In The Wild

23 August 2020
Prajwal K R
Rudrabha Mukhopadhyay
Vinay P. Namboodiri
C. V. Jawahar
    EGVM
ArXivPDFHTML

Papers citing "A Lip Sync Expert Is All You Need for Speech to Lip Generation In The Wild"

50 / 410 papers shown
Title
CoGenAV: Versatile Audio-Visual Representation Learning via Contrastive-Generative Synchronization
CoGenAV: Versatile Audio-Visual Representation Learning via Contrastive-Generative Synchronization
Detao Bai
Zhiheng Ma
Xihan Wei
Liefeng Bo
138
0
0
06 May 2025
OT-Talk: Animating 3D Talking Head with Optimal Transportation
OT-Talk: Animating 3D Talking Head with Optimal Transportation
Xinmu Wang
Xiang Gao
Xiyun Song
Heather Yu
Zongfang Lin
Liang Peng
Xianfeng Gu
24
0
0
03 May 2025
FlowDubber: Movie Dubbing with LLM-based Semantic-aware Learning and Flow Matching based Voice Enhancing
FlowDubber: Movie Dubbing with LLM-based Semantic-aware Learning and Flow Matching based Voice Enhancing
Gaoxiang Cong
Liang-Sheng Li
Jiadong Pan
Zhedong Zhang
Amin Beheshti
Anton Van Den Hengel
Yuankai Qi
Qingming Huang
150
0
0
02 May 2025
KeySync: A Robust Approach for Leakage-free Lip Synchronization in High Resolution
KeySync: A Robust Approach for Leakage-free Lip Synchronization in High Resolution
Antoni Bigata
Rodrigo Mira
Stella Bounareli
Michał Stypułkowski
Konstantinos Vougioukas
Stavros Petridis
Maja Pantic
52
0
0
01 May 2025
IRL Dittos: Embodied Multimodal AI Agent Interactions in Open Spaces
IRL Dittos: Embodied Multimodal AI Agent Interactions in Open Spaces
Seonghee Lee
Denae Ford
John Tang
Sasa Junuzovic
Asta Roseway
Ed Cutrell
Kori Inkpen
37
0
0
30 Apr 2025
MagicPortrait: Temporally Consistent Face Reenactment with 3D Geometric Guidance
MagicPortrait: Temporally Consistent Face Reenactment with 3D Geometric Guidance
Mengting Wei
Yante Li
Tuomas Varanka
Yan Jiang
Guoying Zhao
DiffM
VGen
74
0
0
30 Apr 2025
AlignDiT: Multimodal Aligned Diffusion Transformer for Synchronized Speech Generation
AlignDiT: Multimodal Aligned Diffusion Transformer for Synchronized Speech Generation
J. Choi
Ji-Hoon Kim
Kim Sung-Bin
Tae-Hyun Oh
Joon Son Chung
DiffM
49
0
0
29 Apr 2025
Efficient Listener: Dyadic Facial Motion Synthesis via Action Diffusion
Efficient Listener: Dyadic Facial Motion Synthesis via Action Diffusion
Zehua Wang
Alexandre Bruckert
P. Le Callet
Guangtao Zhai
VGen
32
0
0
29 Apr 2025
Audio-Driven Talking Face Video Generation with Joint Uncertainty Learning
Audio-Driven Talking Face Video Generation with Joint Uncertainty Learning
Yifan Xie
Fei Ma
Yi Bin
Ying He
Fei Richard Yu
57
0
0
26 Apr 2025
Disentangle Identity, Cooperate Emotion: Correlation-Aware Emotional Talking Portrait Generation
Disentangle Identity, Cooperate Emotion: Correlation-Aware Emotional Talking Portrait Generation
Weipeng Tan
Chuming Lin
Chengming Xu
F. Xu
Xiaobin Hu
Xiaozhong Ji
Junwei Zhu
Chengjie Wang
Yanwei Fu
49
0
0
25 Apr 2025
Contrastive Decoupled Representation Learning and Regularization for Speech-Preserving Facial Expression Manipulation
Contrastive Decoupled Representation Learning and Regularization for Speech-Preserving Facial Expression Manipulation
Tianshui Chen
Jianman Lin
Zhijing Yang
Chumei Qing
Yukai Shi
Liang Lin
47
2
0
08 Apr 2025
SE4Lip: Speech-Lip Encoder for Talking Head Synthesis to Solve Phoneme-Viseme Alignment Ambiguity
SE4Lip: Speech-Lip Encoder for Talking Head Synthesis to Solve Phoneme-Viseme Alignment Ambiguity
Yihuan Huang
Jiajun Liu
Yanzhen Ren
Wuyang Liu
Juhua Tang
24
0
0
08 Apr 2025
Exploiting Temporal Audio-Visual Correlation Embedding for Audio-Driven One-Shot Talking Head Animation
Exploiting Temporal Audio-Visual Correlation Embedding for Audio-Driven One-Shot Talking Head Animation
Zhihua Xu
Tianshui Chen
Zhijing Yang
Siyuan Peng
Keze Wang
Liang Lin
29
0
0
08 Apr 2025
FluentLip: A Phonemes-Based Two-stage Approach for Audio-Driven Lip Synthesis with Optical Flow Consistency
FluentLip: A Phonemes-Based Two-stage Approach for Audio-Driven Lip Synthesis with Optical Flow Consistency
Shiyan Liu
Rui Qu
Yan Jin
31
0
0
06 Apr 2025
DiTaiListener: Controllable High Fidelity Listener Video Generation with Diffusion
DiTaiListener: Controllable High Fidelity Listener Video Generation with Diffusion
Maksim Siniukov
Di Chang
Minh Tran
Hongkun Gong
Ashutosh Chaubey
Mohammad Soleymani
DiffM
VGen
23
0
0
05 Apr 2025
Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation
Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation
Fa-Ting Hong
Zunnan Xu
Zixiang Zhou
Zhiqiang Zhang
Xiu Li
Qin Lin
Qinglin Lu
D. Xu
DiffM
VGen
59
2
0
03 Apr 2025
OmniTalker: Real-Time Text-Driven Talking Head Generation with In-Context Audio-Visual Style Replication
OmniTalker: Real-Time Text-Driven Talking Head Generation with In-Context Audio-Visual Style Replication
Zhongjian Wang
Peng Zhang
Jinwei Qi
Guangyuan Wang Sheng Xu
Bang Zhang
Liefeng Bo
DiffM
VGen
38
0
0
03 Apr 2025
Detecting Lip-Syncing Deepfakes: Vision Temporal Transformer for Analyzing Mouth Inconsistencies
Detecting Lip-Syncing Deepfakes: Vision Temporal Transformer for Analyzing Mouth Inconsistencies
Soumyya Kanti Datta
Shan Jia
Siwei Lyu
44
0
0
02 Apr 2025
Monocular and Generalizable Gaussian Talking Head Animation
Monocular and Generalizable Gaussian Talking Head Animation
Shengjie Gong
Yiming Li
Jiapeng Tang
Dongming Hu
Shuangping Huang
Hao Chen
Tianshui Chen
Zhuoman Liu
3DGS
41
1
0
01 Apr 2025
MoCha: Towards Movie-Grade Talking Character Synthesis
MoCha: Towards Movie-Grade Talking Character Synthesis
Cong Wei
Bo Sun
Haoyu Ma
Ji Hou
F. Xu
...
Kunpeng Li
Tingbo Hou
Animesh Sinha
Peter Vajda
Wenhu Chen
VGen
149
0
0
30 Mar 2025
STSA: Spatial-Temporal Semantic Alignment for Visual Dubbing
STSA: Spatial-Temporal Semantic Alignment for Visual Dubbing
Zijun Ding
Mingdie Xiong
Congcong Zhu
Jingrun Chen
DiffM
61
0
0
29 Mar 2025
ChatAnyone: Stylized Real-time Portrait Video Generation with Hierarchical Motion Diffusion Model
ChatAnyone: Stylized Real-time Portrait Video Generation with Hierarchical Motion Diffusion Model
Jinwei Qi
Chaonan Ji
Sheng Xu
Peng Zhang
Bang Zhang
Liefeng Bo
DiffM
VGen
45
1
0
27 Mar 2025
Dual Audio-Centric Modality Coupling for Talking Head Generation
Dual Audio-Centric Modality Coupling for Talking Head Generation
Ao Fu
Ziqi Ni
Yi Zhou
37
1
0
26 Mar 2025
Perceptually Accurate 3D Talking Head Generation: New Definitions, Speech-Mesh Representation, and Evaluation Metrics
Perceptually Accurate 3D Talking Head Generation: New Definitions, Speech-Mesh Representation, and Evaluation Metrics
Lee Chae-Yeon
Oh Hyun-Bin
Han EunGi
Kim Sung-Bin
Suekyeong Nam
Tae-Hyun Oh
EGVM
3DH
87
0
1
26 Mar 2025
Video Motion Graphs
Video Motion Graphs
Haiyang Liu
Zhan Xu
Fa-Ting Hong
Hsin-Ping Huang
Yi Zhou
Yang Zhou
DiffM
VGen
90
0
0
26 Mar 2025
EmoHead: Emotional Talking Head via Manipulating Semantic Expression Parameters
EmoHead: Emotional Talking Head via Manipulating Semantic Expression Parameters
Xuli Shen
Hua Cai
Dingding Yu
Weilin Shen
Qing-Song Xu
Xiangyang Xue
37
0
0
25 Mar 2025
AudCast: Audio-Driven Human Video Generation by Cascaded Diffusion Transformers
AudCast: Audio-Driven Human Video Generation by Cascaded Diffusion Transformers
Jiazhi Guan
Kaisiyuan Wang
Zhiliang Xu
Quanwei Yang
Yasheng Sun
...
Errui Ding
Jiadong Wang
Youjian Zhao
Hang Zhou
Ziwei Liu
VGen
44
0
0
25 Mar 2025
MVPortrait: Text-Guided Motion and Emotion Control for Multi-view Vivid Portrait Animation
MVPortrait: Text-Guided Motion and Emotion Control for Multi-view Vivid Portrait Animation
Yukang Lin
Hokit Fung
Jianjin Xu
Zeping Ren
Adela S.M. Lau
Guosheng Yin
Xiu Li
VGen
47
5
0
25 Mar 2025
DisentTalk: Cross-lingual Talking Face Generation via Semantic Disentangled Diffusion Model
DisentTalk: Cross-lingual Talking Face Generation via Semantic Disentangled Diffusion Model
Kangwei Liu
Junwu Liu
Yun Cao
Jinlin Guo
Xiaowei Yi
DiffM
45
0
0
24 Mar 2025
Teller: Real-Time Streaming Audio-Driven Portrait Animation with Autoregressive Motion Generation
Teller: Real-Time Streaming Audio-Driven Portrait Animation with Autoregressive Motion Generation
Dingcheng Zhen
Shunshun Yin
Shiyang Qin
Hou Yi
Ziwei Zhang
Siyuan Liu
Gan Qi
Ming Tao
VGen
74
0
0
24 Mar 2025
Re-HOLD: Video Hand Object Interaction Reenactment via adaptive Layout-instructed Diffusion Model
Re-HOLD: Video Hand Object Interaction Reenactment via adaptive Layout-instructed Diffusion Model
Yingying Fan
Quanwei Yang
Kaisiyuan Wang
Hang Zhou
Yingying Li
Haocheng Feng
Errui Ding
Y. Wu
Jiadong Wang
DiffM
49
0
0
21 Mar 2025
TruthLens: Explainable DeepFake Detection for Face Manipulated and Fully Synthetic Data
TruthLens: Explainable DeepFake Detection for Face Manipulated and Fully Synthetic Data
Rohit Kundu
Athula Balachandran
A. Roy-Chowdhury
45
0
0
20 Mar 2025
UniSync: A Unified Framework for Audio-Visual Synchronization
UniSync: A Unified Framework for Audio-Visual Synchronization
Tao Feng
Yifan Xie
Xun Guan
Jiyuan Song
Z. Liu
Fei Ma
Fei Richard Yu
78
1
0
20 Mar 2025
3D Engine-ready Photorealistic Avatars via Dynamic Textures
3D Engine-ready Photorealistic Avatars via Dynamic Textures
Yifan Wang
Ivan Molodetskikh
Ondrej Texler
Dimitar Dinev
45
0
0
19 Mar 2025
PC-Talk: Precise Facial Animation Control for Audio-Driven Talking Face Generation
PC-Talk: Precise Facial Animation Control for Audio-Driven Talking Face Generation
Baiqin Wang
Xiangyu Zhu
Fan Shen
Hao-Xuan Xu
Zhen Lei
63
0
0
18 Mar 2025
SyncDiff: Diffusion-based Talking Head Synthesis with Bottlenecked Temporal Visual Prior for Improved Synchronization
SyncDiff: Diffusion-based Talking Head Synthesis with Bottlenecked Temporal Visual Prior for Improved Synchronization
Xulin Fan
Heting Gao
Ziyi Chen
Peng Chang
Mei Han
Mark Hasegawa-Johnson
DiffM
62
0
0
17 Mar 2025
Unlock Pose Diversity: Accurate and Efficient Implicit Keypoint-based Spatiotemporal Diffusion for Audio-driven Talking Portrait
Unlock Pose Diversity: Accurate and Efficient Implicit Keypoint-based Spatiotemporal Diffusion for Audio-driven Talking Portrait
Chaolong Yang
Kai Yao
Yuyao Yan
Chenru Jiang
Weiguang Zhao
Jie Sun
Guangliang Cheng
Yifei Zhang
Bin Dong
K. Huang
DiffM
69
0
0
17 Mar 2025
RASA: Replace Anyone, Say Anything -- A Training-Free Framework for Audio-Driven and Universal Portrait Video Editing
Tianrui Pan
Lin Liu
Jie Liu
Xinsong Zhang
J. Tang
Gangshan Wu
Q. Tian
DiffM
VGen
53
0
0
14 Mar 2025
MAVFlow: Preserving Paralinguistic Elements with Conditional Flow Matching for Zero-Shot AV2AV Multilingual Translation
Sungwoo Cho
J. Choi
Sungnyun Kim
Se-Young Yun
63
0
0
14 Mar 2025
Semantic Latent Motion for Portrait Video Generation
Qiyuan Zhang
Chenyu Wu
Wenzhang Sun
Huaize Liu
Donglin Di
Wei Chen
Changqing Zou
VGen
72
0
0
13 Mar 2025
Removing Averaging: Personalized Lip-Sync Driven Characters Based on Identity Adapter
Yanyu Zhu
Licheng Bai
Jintao Xu
Jiwei Tang
Wanshi Xu
38
0
0
09 Mar 2025
DiVISe: Direct Visual-Input Speech Synthesis Preserving Speaker Characteristics And Intelligibility
Yifan Liu
Yu Fang
Zhouhan Lin
42
0
0
07 Mar 2025
FREAK: Frequency-modulated High-fidelity and Real-time Audio-driven Talking Portrait Synthesis
FREAK: Frequency-modulated High-fidelity and Real-time Audio-driven Talking Portrait Synthesis
Ziqi Ni
Ao Fu
Yi Zhou
61
0
0
06 Mar 2025
Personalized Generation In Large Model Era: A Survey
Yiyan Xu
Jinghao Zhang
Alireza Salemi
Xinting Hu
Luu Anh Tuan
Fuli Feng
Hamed Zamani
Xiangnan He
Tat-Seng Chua
3DV
79
2
0
04 Mar 2025
KeyFace: Expressive Audio-Driven Facial Animation for Long Sequences via KeyFrame Interpolation
KeyFace: Expressive Audio-Driven Facial Animation for Long Sequences via KeyFrame Interpolation
Antoni Bigata
Michał Stypułkowski
Rodrigo Mira
Stella Bounareli
Konstantinos Vougioukas
Zoe Landgraf
Nikita Drobyshev
Maciej Ziȩba
Stavros Petridis
M. Pantic
DiffM
VGen
70
2
0
03 Mar 2025
InsTaG: Learning Personalized 3D Talking Head from Few-Second Video
InsTaG: Learning Personalized 3D Talking Head from Few-Second Video
Jiahe Li
Jiawei Zhang
Xiao Bai
Jin Zheng
J. Zhou
L. Gu
62
0
0
27 Feb 2025
Steganography Beyond Space-Time with Chain of Multimodal AI
Steganography Beyond Space-Time with Chain of Multimodal AI
Ching-Chun Chang
Isao Echizen
74
0
0
25 Feb 2025
Dimitra: Audio-driven Diffusion model for Expressive Talking Head Generation
Dimitra: Audio-driven Diffusion model for Expressive Talking Head Generation
Baptiste Chopin
Tashvik Dhamija
P. Balaji
Yaohui Wang
A. Dantcheva
DiffM
VGen
49
0
0
24 Feb 2025
A Survey on Bridging EEG Signals and Generative AI: From Image and Text to Beyond
A Survey on Bridging EEG Signals and Generative AI: From Image and Text to Beyond
Shreya Shukla
Jose Torres
Abhijit Mishra
Jacek Gwizdka
Shounak Roychowdhury
45
0
0
20 Feb 2025
SayAnything: Audio-Driven Lip Synchronization with Conditional Video Diffusion
SayAnything: Audio-Driven Lip Synchronization with Conditional Video Diffusion
Junxian Ma
Shiwen Wang
Jian Yang
Junyi Hu
Jian Liang
Guosheng Lin
Jingbo Chen
Kai Li
Yu Meng
DiffM
VGen
61
3
0
17 Feb 2025
123456789
Next