Speech2Face: Learning the Face Behind a Voice

23 May 2019
Tae-Hyun Oh
Tali Dekel
Changil Kim
Inbar Mosseri
William T. Freeman
Michael Rubinstein
Wojciech Matusik
    SSL
    CVBM

Papers citing "Speech2Face: Learning the Face Behind a Voice"

29 / 29 papers shown
Seeing Soundscapes: Audio-Visual Generation and Separation from Soundscapes Using Audio-Visual Separator
Minjae Kang
Martim Brandão
64
0
0
25 Apr 2025
Hear Your Face: Face-based voice conversion with F0 estimation
Jaejun Lee
Yoori Oh
Injune Hwang
Kyogu Lee
CVBM
29
2
0
19 Aug 2024
Fighting Malicious Media Data: A Survey on Tampering Detection and Deepfake Detection
Junke Wang
Zhenxin Li
Chao Zhang
Jingjing Chen
Zuxuan Wu
Larry S. Davis
Yueping Jiang
AAML
40
5
0
12 Dec 2022
Robust Sound-Guided Image Manipulation
Seung Hyun Lee
Gyeongrok Oh
Wonmin Byeon
Sang Ho Yoon
Jinkyu Kim
Sangpil Kim
DiffM
26
7
0
30 Aug 2022
Extreme-scale Talking-Face Video Upsampling with Audio-Visual Priors
Sindhu B. Hegde
Rudrabha Mukhopadhyay
Vinay P. Namboodiri
C. V. Jawahar
CVBM
16
1
0
17 Aug 2022
Finding Fallen Objects Via Asynchronous Audio-Visual Integration
Chuang Gan
Yi Gu
Siyuan Zhou
Jeremy Schwartz
S. Alter
James Traer
Dan Gutfreund
J. Tenenbaum
Josh H. McDermott
Antonio Torralba
57
19
0
07 Jul 2022
Show Me Your Face, And I'll Tell You How You Speak
Christen Millerdurai
L. A. Khaliq
Timon Ulrich
CVBM
68
0
0
28 Jun 2022
Text/Speech-Driven Full-Body Animation
Wenlin Zhuang
Jinwei Qi
Peng Zhang
Bang Zhang
Ping Tan
33
6
0
31 May 2022
RadioSES: mmWave-Based Audioradio Speech Enhancement and Separation System
M. Z. Ozturk
Chenshu Wu
Beibei Wang
Min Wu
K. Liu
27
20
0
14 Apr 2022
Residual-guided Personalized Speech Synthesis based on Face Image
Jianrong Wang
Zixuan Wang
Xiaosheng Hu
Xuewei Li
Qiang Fang
Li Liu
CVBM
27
16
0
01 Apr 2022
Sound-Guided Semantic Image Manipulation
Seung Hyun Lee
Wonseok Roh
Wonmin Byeon
Sang Ho Yoon
Chanyoung Kim
Jinkyu Kim
Sangpil Kim
DiffM
33
43
0
30 Nov 2021
Cross-Modal Virtual Sensing for Combustion Instability Monitoring
Tryambak Gangopadhyay
V. Ramanan
S. Chakravarthy
S. Sarkar
21
1
0
04 Oct 2021
FaVoA: Face-Voice Association Favours Ambiguous Speaker Detection
Hugo C. C. Carneiro
C. Weber
S. Wermter
CVBM
31
7
0
01 Sep 2021
Cross-modal Spectrum Transformation Network For Acoustic Scene classification
Yang Liu
A. Neophytou
Sunando Sengupta
Eric Sommerlade
21
9
0
13 Aug 2021
Speech2Video: Cross-Modal Distillation for Speech to Video Generation
Shijing Si
Jianzong Wang
Xiaoyang Qu
Ning Cheng
Wenqi Wei
Xinghua Zhu
Jing Xiao
VGen
29
15
0
10 Jul 2021
Honest-but-Curious Nets: Sensitive Attributes of Private Inputs Can Be Secretly Coded into the Classifiers' Outputs
Mohammad Malekzadeh
Anastasia Borovykh
Deniz Gündüz
MIACV
24
42
0
25 May 2021
A cappella: Audio-visual Singing Voice Separation
Juan F. Montesinos
V. S. Kadandale
G. Haro
40
16
0
20 Apr 2021
SoK: A Modularized Approach to Study the Security of Automatic Speech Recognition Systems
Yuxuan Chen
Jiangshan Zhang
Xuejing Yuan
Shengzhi Zhang
Kai Chen
Xiaofeng Wang
Shanqing Guo
AAML
37
15
0
19 Mar 2021
VisualVoice: Audio-Visual Speech Separation with Cross-Modal Consistency
Ruohan Gao
Kristen Grauman
CVBM
196
199
0
08 Jan 2021
Multimodal Target Speech Separation with Voice and Face References
Leyuan Qu
C. Weber
S. Wermter
CVBM
19
19
0
17 May 2020
FaceFilter: Audio-visual speech separation using still images
Soo-Whan Chung
Soyeon Choe
Joon Son Chung
Hong-Goo Kang
CVBM
21
66
0
14 May 2020
S2IGAN: Speech-to-Image Generation via Adversarial Learning
Xinsheng Wang
Tingting Qiao
Jihua Zhu
Alan Hanjalic
O. Scharenborg
VLM
GAN
32
16
0
14 May 2020
APB2Face: Audio-guided face reenactment with auxiliary pose and blink signals
Jiangning Zhang
L. Liu
Zhucun Xue
Yong Liu
CVBM
28
16
0
30 Apr 2020
Direct Speech-to-image Translation
Jiguo Li
Xinfeng Zhang
Chuanmin Jia
Jizheng Xu
Li Zhang
Y. Wang
Siwei Ma
Wen Gao
36
29
0
07 Apr 2020
Vocoder-Based Speech Synthesis from Silent Videos
Daniel Michelsanti
Olga Slizovskaia
G. Haro
Emilia Gómez
Zheng-Hua Tan
Jesper Jensen
31
31
0
06 Apr 2020
Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose
Ran Yi
Zipeng Ye
Juyong Zhang
Hujun Bao
Yong-jin Liu
CVBM
27
122
0
24 Feb 2020
Deep Audio-Visual Learning: A Survey
Hao Zhu
Mandi Luo
Rui Wang
A. Zheng
Ran He
31
156
0
14 Jan 2020
Learning to Have an Ear for Face Super-Resolution
Givi Meishvili
Simon Jenni
Paolo Favaro
SupR
CVBM
33
23
0
27 Sep 2019
Lip Reading Sentences in the Wild
Joon Son Chung
A. Senior
Oriol Vinyals
Andrew Zisserman
185
784
0
16 Nov 2016