ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2310.03502
  4. Cited By
Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and
  Latent Diffusion

Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion

5 October 2023
Anton Razzhigaev
Arseniy Shakhmatov
Anastasia Maltseva
V.Ya. Arkhipkin
Igor Pavlov
Ilya Ryabov
Angelina Kuts
Alexander Panchenko
Andrey Kuznetsov
Denis Dimitrov
ArXivPDFHTML

Papers citing "Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion"

23 / 23 papers shown
Title
CRAFT: Cultural Russian-Oriented Dataset Adaptation for Focused Text-to-Image Generation
CRAFT: Cultural Russian-Oriented Dataset Adaptation for Focused Text-to-Image Generation
Viacheslav Vasilev
V. Arkhipkin
Julia Agafonova
Tatiana Nikulina
Evelina Mironova
Alisa Shichanina
Nikolai Gerasimenko
Mikhail Shoytov
Denis Dimitrov
46
0
0
07 May 2025
Defining and Quantifying Creative Behavior in Popular Image Generators
Defining and Quantifying Creative Behavior in Popular Image Generators
Aditi Ramaswamy
Hana Chockler
Melane Navaratnarajah
31
0
0
07 May 2025
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
Xuzhi Zhang
Jintao Guo
Shanshan Zhao
Minghao Fu
Lunhao Duan
Guo-Hua Wang
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
DiffM
74
0
0
05 May 2025
Decoding Vision Transformers: the Diffusion Steering Lens
Decoding Vision Transformers: the Diffusion Steering Lens
Ryota Takatsuki
Sonia Joseph
Ippei Fujisawa
Ryota Kanai
DiffM
30
0
0
18 Apr 2025
Teaching Humans Subtle Differences with DIFFusion
Teaching Humans Subtle Differences with DIFFusion
Mia Chiquier
Orr Avrech
Yossi Gandelsman
Berthy T. Feng
Katherine L. Bouman
Carl Vondrick
DiffM
51
0
0
10 Apr 2025
SCAM: A Real-World Typographic Robustness Evaluation for Multimodal Foundation Models
SCAM: A Real-World Typographic Robustness Evaluation for Multimodal Foundation Models
Justus Westerhoff
Erblina Purellku
Jakob Hackstein
Jonas Loos
Leo Pinetzki
Lorenz Hufe
AAML
28
0
0
07 Apr 2025
ControlText: Unlocking Controllable Fonts in Multilingual Text Rendering without Font Annotations
ControlText: Unlocking Controllable Fonts in Multilingual Text Rendering without Font Annotations
Bowen Jiang
Yuan Yuan
Xinyi Bai
Zhuoqun Hao
Alyson Yin
Yaojie Hu
Wenyu Liao
Lyle Ungar
Camillo J Taylor
DiffM
51
1
0
16 Feb 2025
PreciseCam: Precise Camera Control for Text-to-Image Generation
PreciseCam: Precise Camera Control for Text-to-Image Generation
Edurne Bernal-Berdun
Ana Serrano
B. Masiá
Matheus Gadelha
Yannick Hold-Geoffroy
Xin Sun
Diego F. F. Gutierrez
DiffM
VGen
47
0
0
22 Jan 2025
Improving Long-Text Alignment for Text-to-Image Diffusion Models
Improving Long-Text Alignment for Text-to-Image Diffusion Models
Luping Liu
Chao Du
Tianyu Pang
Zehan Wang
Chongxuan Li
Dong Xu
VLM
53
4
0
15 Oct 2024
CustomContrast: A Multilevel Contrastive Perspective For Subject-Driven Text-to-Image Customization
CustomContrast: A Multilevel Contrastive Perspective For Subject-Driven Text-to-Image Customization
Nan Chen
Mengqi Huang
Zhuowei Chen
Yang Zheng
Lei Zhang
Zhendong Mao
DiffM
49
5
0
09 Sep 2024
MS-Diffusion: Multi-subject Zero-shot Image Personalization with Layout Guidance
MS-Diffusion: Multi-subject Zero-shot Image Personalization with Layout Guidance
X. Wang
Siming Fu
Qihan Huang
Wanggui He
Hao Jiang
DiffM
48
41
0
11 Jun 2024
Margin-aware Preference Optimization for Aligning Diffusion Models
  without Reference
Margin-aware Preference Optimization for Aligning Diffusion Models without Reference
Jiwoo Hong
Sayak Paul
Noah Lee
Kashif Rasul
James Thorne
Jongheon Jeong
43
13
0
10 Jun 2024
Evaluating Durability: Benchmark Insights into Multimodal Watermarking
Evaluating Durability: Benchmark Insights into Multimodal Watermarking
Jielin Qiu
William Jongwon Han
Xuandong Zhao
Shangbang Long
Christos Faloutsos
Lei Li
65
1
0
06 Jun 2024
Training-free Subject-Enhanced Attention Guidance for Compositional
  Text-to-image Generation
Training-free Subject-Enhanced Attention Guidance for Compositional Text-to-image Generation
Shengyuan Liu
Bo Wang
Ye Ma
Te Yang
Xipeng Cao
Quan Chen
Han Li
Di Dong
Peng Jiang
EGVM
44
2
0
11 May 2024
Navigating the Landscape of Large Language Models: A Comprehensive
  Review and Analysis of Paradigms and Fine-Tuning Strategies
Navigating the Landscape of Large Language Models: A Comprehensive Review and Analysis of Paradigms and Fine-Tuning Strategies
Benjue Weng
LM&MA
46
7
0
13 Apr 2024
Kiki or Bouba? Sound Symbolism in Vision-and-Language Models
Kiki or Bouba? Sound Symbolism in Vision-and-Language Models
Morris Alper
Hadar Averbuch-Elor
43
10
0
25 Oct 2023
Differential Diffusion: Giving Each Pixel Its Strength
Differential Diffusion: Giving Each Pixel Its Strength
E. Levin
Ohad Fried
DiffM
37
20
0
01 Jun 2023
Make-It-3D: High-Fidelity 3D Creation from A Single Image with Diffusion
  Prior
Make-It-3D: High-Fidelity 3D Creation from A Single Image with Diffusion Prior
Junshu Tang
Tengfei Wang
Bo Zhang
Ting Zhang
Ran Yi
Lizhuang Ma
Dong Chen
DiffM
192
307
0
24 Mar 2023
VideoFusion: Decomposed Diffusion Models for High-Quality Video
  Generation
VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation
Zhengxiong Luo
Dayou Chen
Yingya Zhang
Yan Huang
Liangsheng Wang
Yujun Shen
Deli Zhao
Jinren Zhou
Tien-Ping Tan
DiffM
VGen
132
215
0
15 Mar 2023
MagicMix: Semantic Mixing with Diffusion Models
MagicMix: Semantic Mixing with Diffusion Models
Jun Hao Liew
Hanshu Yan
Daquan Zhou
Jiashi Feng
DiffM
184
60
0
28 Oct 2022
MoVQ: Modulating Quantized Vectors for High-Fidelity Image Generation
MoVQ: Modulating Quantized Vectors for High-Fidelity Image Generation
Chuanxia Zheng
L. Vuong
Jianfei Cai
Dinh Q. Phung
MQ
71
72
0
19 Sep 2022
Palette: Image-to-Image Diffusion Models
Palette: Image-to-Image Diffusion Models
Chitwan Saharia
William Chan
Huiwen Chang
Chris A. Lee
Jonathan Ho
Tim Salimans
David J. Fleet
Mohammad Norouzi
DiffM
VLM
342
1,591
0
10 Nov 2021
Zero-Shot Text-to-Image Generation
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
255
4,781
0
24 Feb 2021
1