Fusion-S2iGan: An Efficient and Effective Single-Stage Framework for Speech-to-Image Generation

17 May 2023
Zhenxing Zhang
Lambert Schomaker

Papers citing "Fusion-S2iGan: An Efficient and Effective Single-Stage Framework for Speech-to-Image Generation"

17 / 17 papers shown
DiverGAN: An Efficient and Effective Single-Stage Framework for Diverse Text-to-Image Generation
Zhenxing Zhang
Lambert Schomaker
DiffM
31
24
0
17 Nov 2021
Contextual Transformer Networks for Visual Recognition
Yehao Li
Ting Yao
Yingwei Pan
Tao Mei
ViT
68
478
0
26 Jul 2021
Polarized Self-Attention: Towards High-quality Pixel-wise Regression
Huajun Liu
Fuqiang Liu
Xinyi Fan
Dong Huang
121
217
0
02 Jul 2021
Phone-Level Prosody Modelling with GMM-Based MDN for Diverse and Controllable Speech Synthesis
Chenpeng Du
K. Yu
128
20
0
27 May 2021
Comprehensive Image Captioning via Scene Graph Decomposition
Yiwu Zhong
Liwei Wang
Jianshu Chen
Dong Yu
Yin Li
115
125
0
23 Jul 2020
Controllable Text-to-Image Generation
Bowen Li
Xiaojuan Qi
Thomas Lukasiewicz
Philip Torr
GAN
84
354
0
16 Sep 2019
Wav2Pix: Speech-conditioned Face Generation using Generative Adversarial Networks
A. Duarte
Francisco Roldan
Miquel Tubau
Janna Escur
Santiago Pascual
Amaia Salvador
Eva Mohedano
Kevin McGuinness
Jordi Torres
Xavier Giró-i-Nieto
GAN
CVBM
55
79
0
25 Mar 2019
MirrorGAN: Learning Text-to-image Generation by Redescription
Tingting Qiao
Jing Zhang
Duanqing Xu
Dacheng Tao
VLM
GAN
61
539
0
14 Mar 2019
BAM: Bottleneck Attention Module
Jongchan Park
Sanghyun Woo
Joon-Young Lee
In So Kweon
67
1,039
0
17 Jul 2018
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
Jonathan Shen
Ruoming Pang
Ron J. Weiss
M. Schuster
Navdeep Jaitly
...
Yuxuan Wang
RJ Skerry-Ryan
Rif A. Saurous
Yannis Agiomyrgiannakis
Yonghui Wu
77
2,694
0
16 Dec 2017
AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks
Tao Xu
Pengchuan Zhang
Qiuyuan Huang
Han Zhang
Zhe Gan
Xiaolei Huang
Xiaodong He
GAN
ViT
105
1,714
0
28 Nov 2017
StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks
Han Zhang
Tao Xu
Hongsheng Li
Shaoting Zhang
Xiaogang Wang
Xiaolei Huang
Dimitris N. Metaxas
GAN
77
1,057
0
19 Oct 2017
Improved Techniques for Training GANs
Tim Salimans
Ian Goodfellow
Wojciech Zaremba
Vicki Cheung
Alec Radford
Xi Chen
GAN
454
9,027
0
10 Jun 2016
Rethinking the Inception Architecture for Computer Vision
Christian Szegedy
Vincent Vanhoucke
Sergey Ioffe
Jonathon Shlens
Z. Wojna
3DV
BDL
688
27,303
0
02 Dec 2015
Deep Multimodal Semantic Embeddings for Speech and Images
David Harwath
James R. Glass
55
157
0
11 Nov 2015
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
1.4K
39,472
0
01 Sep 2014
Efficient Estimation of Word Representations in Vector Space
Tomas Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
623
31,469
0
16 Jan 2013