ResearchTrend.AI
Hierarchical Text-Conditional Image Generation with CLIP Latents


13 April 2022
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
    VLM
    DiffM

Papers citing "Hierarchical Text-Conditional Image Generation with CLIP Latents"

50 / 4,756 papers shown
MM-LDM: Multi-Modal Latent Diffusion Model for Sounding Video Generation
Mingzhen Sun
Weining Wang
Yanyuan Qiao
Jiahui Sun
Zihan Qin
Longteng Guo
Xinxin Zhu
Jing Liu
DiffM
VGen
33
3
0
02 Oct 2024
Fake It Until You Break It: On the Adversarial Robustness of AI-generated Image Detectors
Sina Mavali
Jonas Ricker
David Pape
Yash Sharma
Asja Fischer
Lea Schönherr
AAML
44
3
0
02 Oct 2024
KnobGen: Controlling the Sophistication of Artwork in Sketch-Based Diffusion Models
Pouyan Navard
Amin Karimi Monsefi
Mengxi Zhou
Wei-Lun Chao
Alper Yilmaz
R. Ramnath
DiffM
54
2
0
02 Oct 2024
EC-DIT: Scaling Diffusion Transformers with Adaptive Expert-Choice Routing
Haotian Sun
Tao Lei
Bowen Zhang
Yanghao Li
Haoshuo Huang
Ruoming Pang
Bo Dai
Nan Du
DiffM
MoE
91
5
0
02 Oct 2024
Removing Distributional Discrepancies in Captions Improves Image-Text Alignment
Yuheng Li
Haotian Liu
Mu Cai
Yijun Li
Eli Shechtman
Zhe Lin
Yong Jae Lee
Krishna Kumar Singh
VLM
246
1
0
01 Oct 2024
VideoCLIP-XL: Advancing Long Description Understanding for Video CLIP Models
Jiapeng Wang
Chengyu Wang
Kunzhe Huang
Jun Huang
Lianwen Jin
CLIP
VLM
45
3
0
01 Oct 2024
Contrastive Abstraction for Reinforcement Learning
Vihang Patil
M. Hofmarcher
Elisabeth Rumetshofer
Sepp Hochreiter
OffRL
SSL
37
2
0
01 Oct 2024
Scene Graph Disentanglement and Composition for Generalizable Complex Image Generation
Yunnan Wang
Ziqiang Li
Zequn Zhang
Wenyao Zhang
Baao Xie
Xihui Liu
Wenjun Zeng
Xin Jin
CoGe
DiffM
28
2
0
01 Oct 2024
CusConcept: Customized Visual Concept Decomposition with Diffusion Models
Zhi Xu
Shaozhe Hao
Kai Han
DiffM
30
4
0
01 Oct 2024
A Cat Is A Cat (Not A Dog!): Unraveling Information Mix-ups in Text-to-Image Encoders through Causal Analysis and Embedding Optimization
Chieh-Yun Chen
Chiang Tseng
Li-Wu Tsao
Hong-Han Shuai
30
6
0
01 Oct 2024
RadGazeGen: Radiomics and Gaze-guided Medical Image Generation using Diffusion Models
Moinak Bhattacharya
Gagandeep Singh
Shubham Jain
Prateek Prasanna
MedIm
DiffM
39
1
0
01 Oct 2024
RoCoTex: A Robust Method for Consistent Texture Synthesis with Diffusion Models
Jangyeong Kim
Donggoo Kang
Junyoung Choi
Jeonga Wi
Junho Gwon
Jiun Bae
Dumim Yoon
Junghyun Han
DiffM
39
1
0
30 Sep 2024
Magnet: We Never Know How Text-to-Image Diffusion Models Work, Until We Learn How Vision-Language Models Function
Chenyi Zhuang
Ying Hu
Pan Gao
DiffM
VLM
55
11
0
30 Sep 2024
Image Copy Detection for Diffusion Models
Wenhao Wang
Yifan Sun
Zhentao Tan
Yi Yang
35
1
0
30 Sep 2024
Illustrious: an Open Advanced Illustration Model
Sang Hyun Park
Jun Young Koh
Junha Lee
Joy Song
Dongha Kim
Hoyeon Moon
Hyunju Lee
Min Song
VLM
46
1
0
30 Sep 2024
Replace Anyone in Videos
Xiang Wang
Shiwei Zhang
Haonan Qiu
Ruihang Chu
Zekun Li
Yang Zhang
Changxin Gao
Yuehuan Wang
Chunhua Shen
Nong Sang
VGen
DiffM
71
1
0
30 Sep 2024
Text-driven Human Motion Generation with Motion Masked Diffusion Model
Xingyu Chen
DiffM
VGen
45
2
0
29 Sep 2024
Multimodal Misinformation Detection by Learning from Synthetic Data with Multimodal LLMs
Fengzhu Zeng
Wenqian Li
Wei Gao
Yan Pang
53
2
0
29 Sep 2024
Storynizor: Consistent Story Generation via Inter-Frame Synchronized and Shuffled ID Injection
Yuhang Ma
Wenting Xu
Chaoyi Zhao
Keqiang Sun
Qinfeng Jin
Zeng Zhao
Changjie Fan
Zhipeng Hu
VGen
DiffM
37
1
0
29 Sep 2024
Conditional Image Synthesis with Diffusion Models: A Survey
Zheyuan Zhan
Defang Chen
Jian-Ping Mei
Zhenghe Zhao
Jiawei Chen
Chun Chen
Siwei Lyu
Can Wang
VLM
53
5
0
28 Sep 2024
CLIP-MoE: Towards Building Mixture of Experts for CLIP with Diversified Multiplet Upcycling
Jihai Zhang
Xiaoye Qu
Tong Zhu
Yu Cheng
49
7
0
28 Sep 2024
Generalizing Consistency Policy to Visual RL with Prioritized Proximal Experience Regularization
Haoran Li
Zhennan Jiang
Yuhui Chen
Dongbin Zhao
OffRL
32
2
0
28 Sep 2024
SciDoc2Diagrammer-MAF: Towards Generation of Scientific Diagrams from Documents guided by Multi-Aspect Feedback Refinement
Ishani Mondal
Zongxia Li
Yufang Hou
Anandhavelu Natarajan
Aparna Garimella
Jordan Boyd-Graber
41
3
0
28 Sep 2024
Multimodal Pragmatic Jailbreak on Text-to-image Models
Tong Liu
Zhixin Lai
Gengyuan Zhang
Philip Torr
Vera Demberg
Volker Tresp
Jindong Gu
40
5
0
27 Sep 2024
Looking through the mind's eye via multimodal encoder-decoder networks
Arman Afrasiyabi
E. L. Busch
Rahul Singh
Dhananjay Bhaskar
Laurent Caplette
Nicholas Turk-Browne
Smita Krishnaswamy
37
0
0
27 Sep 2024
Fusion is all you need: Face Fusion for Customized Identity-Preserving Image Synthesis
Salaheldin Mohamed
Dong Han
Yong Li
23
1
0
27 Sep 2024
Detecting Dataset Abuse in Fine-Tuning Stable Diffusion Models for Text-to-Image Synthesis
Songrui Wang
Yubo Zhu
Wei Tong
Sheng Zhong
WIGM
35
0
0
27 Sep 2024
Emu3: Next-Token Prediction is All You Need
Xinlong Wang
Xiaosong Zhang
Zhengxiong Luo
Quan-Sen Sun
Yufeng Cui
...
Xi Yang
Jingjing Liu
Yonghua Lin
Tiejun Huang
Zhongyuan Wang
MLLM
47
166
0
27 Sep 2024
Enhancing Explainability in Multimodal Large Language Models Using Ontological Context
Jihen Amara
B. König-Ries
Sheeba Samuel
29
1
0
27 Sep 2024
TensorSocket: Shared Data Loading for Deep Learning Training
Ties Robroek
Neil Kim Nielsen
Pınar Tözün
31
2
0
27 Sep 2024
Gradient-free Decoder Inversion in Latent Diffusion Models
Seongmin Hong
Suh Yoon Jeon
Kyeonghyun Lee
Ernest K. Ryu
S. Chun
36
0
0
27 Sep 2024
GenesisTex2: Stable, Consistent and High-Quality Text-to-Texture Generation
Jiawei Lu
Yingpeng Zhang
Zengjun Zhao
He Wang
Kun Zhou
Tianjia Shao
61
4
0
27 Sep 2024
Improving Agent Behaviors with RL Fine-tuning for Autonomous Driving
Zhenghao Peng
Wenjie Luo
Yiren Lu
Tianyi Shen
Cole Gulino
Ari Seff
Justin Fu
34
6
0
26 Sep 2024
Trustworthy Text-to-Image Diffusion Models: A Timely and Focused Survey
Yi Zhang
Zhen Chen
Chih-Hong Cheng
Wenjie Ruan
Xiaowei Huang
Dezong Zhao
David Flynn
Siddartha Khastgir
Xingyu Zhao
MedIm
53
4
0
26 Sep 2024
Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction
Jing He
Haodong Li
Wei Yin
Yixun Liang
Leheng Li
Kaiqiang Zhou
Hongbo Zhang
Bingbing Liu
Ying-Cong Chen
DiffM
VLM
57
40
0
26 Sep 2024
Stable Video Portraits
Mirela Ostrek
Justus Thies
VGen
DiffM
35
1
0
26 Sep 2024
FreeEdit: Mask-free Reference-based Image Editing with Multi-modal Instruction
Runze He
Kai Ma
Linjiang Huang
Shaofei Huang
Jialin Gao
Xiaoming Wei
Jiao Dai
Jizhong Han
Si Liu
DiffM
52
8
0
26 Sep 2024
AnyLogo: Symbiotic Subject-Driven Diffusion System with Gemini Status
Jinghao Zhang
Wen Qian
Hao Luo
Fan Wang
Feng Zhao
DiffM
32
0
0
26 Sep 2024
Flexiffusion: Segment-wise Neural Architecture Search for Flexible Denoising Schedule
Hongtao Huang
Xiaojun Chang
Lina Yao
38
0
0
26 Sep 2024
Pixel-Space Post-Training of Latent Diffusion Models
Christina Zhang
Simran Motwani
Matthew Yu
Ji Hou
Felix Juefei-Xu
Sam S. Tsai
Peter Vajda
Zijian He
Jialiang Wang
39
2
0
26 Sep 2024
JoyType: A Robust Design for Multilingual Visual Text Creation
Chao Li
Chen Jiang
Xiaolong Liu
Jun Zhao
Guoxin Wang
DiffM
59
6
0
26 Sep 2024
Copying style, Extracting value: Illustrators' Perception of AI Style Transfer and its Impact on Creative Labor
Julien Porquet
Sitong Wang
Lydia B. Chilton
42
2
0
25 Sep 2024
DreamWaltz-G: Expressive 3D Gaussian Avatars from Skeleton-Guided 2D Diffusion
Yukun Huang
Jianan Wang
Ailing Zeng
Zheng-Jun Zha
Lei Zhang
Xihui Liu
3DGS
47
5
0
25 Sep 2024
Prompt Sliders for Fine-Grained Control, Editing and Erasing of Concepts in Diffusion Models
Deepak Sridhar
Nuno Vasconcelos
DiffM
36
0
0
25 Sep 2024
PAGE: A Modern Measure of Emotion Perception for Teamwork and Management Research
Ben Weidmann
Yixian Xu
17
0
0
24 Sep 2024
MonoFormer: One Transformer for Both Diffusion and Autoregression
Chuyang Zhao
Yuxing Song
Wenhao Wang
Haocheng Feng
Errui Ding
Yifan Sun
Xinyan Xiao
Jingdong Wang
DiffM
39
18
0
24 Sep 2024
Unimotion: Unifying 3D Human Motion Synthesis and Understanding
Chuqiao Li
Julian Chibane
Yannan He
Naama Pearl
Andreas Geiger
Gerard Pons-Moll
VGen
49
8
0
24 Sep 2024
Zero-Shot Detection of AI-Generated Images
D. Cozzolino
Giovanni Poggi
Matthias Nießner
L. Verdoliva
58
11
0
24 Sep 2024
Training Data Attribution: Was Your Model Secretly Trained On Data Created By Mine?
Likun Zhang
Hao Wu
Lefei Zhang
Fengyuan Xu
Jin Cao
Fenghua Li
Ben Niu
TDI
28
1
0
24 Sep 2024
Multi-Modal Generative AI: Multi-modal LLM, Diffusion and Beyond
Hong Chen
Xin Wang
Yuwei Zhou
Bin Huang
Yipeng Zhang
Wei Feng
Houlun Chen
Zeyang Zhang
Siao Tang
Wenwu Zhu
DiffM
55
7
0
23 Sep 2024