2005.06968
Cited By
S2IGAN: Speech-to-Image Generation via Adversarial Learning
14 May 2020
Xinsheng Wang
Tingting Qiao
Jihua Zhu
Alan Hanjalic
O. Scharenborg
VLM
GAN
Papers citing
"S2IGAN: Speech-to-Image Generation via Adversarial Learning"
8 / 8 papers shown
SViQA: A Unified Speech-Vision Multimodal Model for Textless Visual Question Answering
Bingxin Li
30
0
0
01 Apr 2025
TMT: Tri-Modal Translation between Speech, Image, and Text by Processing Different Modalities as Different Languages
Minsu Kim
Jee-weon Jung
Hyeongseop Rha
Soumi Maiti
Siddhant Arora
Xuankai Chang
Shinji Watanabe
Y. Ro
28
7
0
25 Feb 2024
Speech-to-SQL: Towards Speech-driven SQL Query Generation From Natural Language Question
Yuanfeng Song
Raymond Chi-Wing Wong
Xuefang Zhao
Di Jiang
36
13
0
04 Jan 2022
OPT: Omni-Perception Pre-Trainer for Cross-Modal Understanding and Generation
Jing Liu
Xinxin Zhu
Fei Liu
Longteng Guo
Zijia Zhao
...
Weining Wang
Hanqing Lu
Shiyu Zhou
Jiajun Zhang
Jinqiao Wang
31
37
0
01 Jul 2021
AttrLostGAN: Attribute Controlled Image Synthesis from Reconfigurable Layout and Style
Stanislav Frolov
Avneesh Sharma
Jörn Hees
Tushar Karayil
Federico Raue
Andreas Dengel
20
15
0
25 Mar 2021
Adversarial Text-to-Image Synthesis: A Review
Stanislav Frolov
Tobias Hinz
Federico Raue
Jörn Hees
Andreas Dengel
EGVM
22
175
0
25 Jan 2021
Show and Speak: Directly Synthesize Spoken Description of Images
Xinsheng Wang
Siyuan Feng
Jihua Zhu
M. Hasegawa-Johnson
O. Scharenborg
20
4
0
23 Oct 2020
Learning Deep Representations of Fine-grained Visual Descriptions
Scott E. Reed
Zeynep Akata
Bernt Schiele
Honglak Lee
OCL
VLM
170
840
0
17 May 2016