Hierarchical Text-Conditional Image Generation with CLIP Latents


13 April 2022
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
    VLM
    DiffM

Papers citing "Hierarchical Text-Conditional Image Generation with CLIP Latents"

50 / 4,775 papers shown
Data-Centric AI in the Age of Large Language Models
Xinyi Xu
Zhaoxuan Wu
Rui Qiao
Arun Verma
Yao Shu
...
Xiaoqiang Lin
Wenyang Hu
Zhongxiang Dai
Pang Wei Koh
Bryan Kian Hsiang Low
ALM
63
3
0
20 Jun 2024
EnTruth: Enhancing the Traceability of Unauthorized Dataset Usage in Text-to-image Diffusion Models with Minimal and Robust Alterations
Jie Ren
Yingqian Cui
Chen Chen
Vikash Sehwag
Yue Xing
Jiliang Tang
Lingjuan Lyu
WIGM
42
1
0
20 Jun 2024
Advancing Fine-Grained Classification by Structure and Subject Preserving Augmentation
Eyal Michaeli
Ohad Fried
62
1
0
20 Jun 2024
GenAI-Bench: Evaluating and Improving Compositional Text-to-Visual Generation
Baiqi Li
Zhiqiu Lin
Deepak Pathak
Jiayao Li
Yixin Fei
...
Tiffany Ling
Xide Xia
Pengchuan Zhang
Graham Neubig
Deva Ramanan
EGVM
59
29
0
19 Jun 2024
StableSemantics: A Synthetic Language-Vision Dataset of Semantic Representations in Naturalistic Images
Rushikesh Zawar
Shaurya Dewan
Andrew F. Luo
Margaret M. Henderson
Michael J. Tarr
Leila Wehbe
VGen
CoGe
49
1
0
19 Jun 2024
Leveraging Large Language Models for Patient Engagement: The Power of Conversational AI in Digital Health
Bo Wen
R. Norel
Julia Liu
Thaddeus Stappenbeck
F. Zulkernine
Huamin Chen
AI4MH
LM&MA
45
2
0
19 Jun 2024
4K4DGen: Panoramic 4D Generation at 4K Resolution
Renjie Li
Panwang Pan
Bangbang Yang
Dejia Xu
Shijie Zhou
Xuanyang Zhang
Zeming Li
A. Kadambi
Zhangyang Wang
Zhiwen Fan
VGen
68
17
0
19 Jun 2024
Neural Residual Diffusion Models for Deep Scalable Vision Generation
Zhiyuan Ma
Liangliang Zhao
Biqing Qi
Bowen Zhou
DiffM
90
2
0
19 Jun 2024
Conditional score-based diffusion models for solving inverse problems in mechanics
Agnimitra Dasgupta
Harisankar Ramaswamy
Javier Murgoitio-Esandi
Ken Foo
Runze Li
Qifa Zhou
Brendan Kennedy
Assad A. Oberai
DiffM
MedIm
52
2
0
19 Jun 2024
Evaluating the design space of diffusion-based generative models
Yuqing Wang
Ye He
Molei Tao
DiffM
56
5
0
18 Jun 2024
Training Diffusion Models with Federated Learning
Matthijs de Goede
Bart Cox
Jérémie Decouchant
FedML
54
9
0
18 Jun 2024
Variational Distillation of Diffusion Policies into Mixture of Experts
Hongyi Zhou
Denis Blessing
Ge Li
Onur Celik
Xiaogang Jia
Gerhard Neumann
Rudolf Lioutikov
DiffM
57
3
0
18 Jun 2024
Effective Generation of Feasible Solutions for Integer Programming via Guided Diffusion
Hao Zeng
Jiaqi Wang
Avirup Das
Junying He
Kunpeng Han
Haoyuan Hu
Mingfei Sun
55
1
0
18 Jun 2024
COT Flow: Learning Optimal-Transport Image Sampling and Editing by Contrastive Pairs
Xinrui Zu
Qian Tao
OT
DiffM
38
0
0
17 Jun 2024
Learning Molecular Representation in a Cell
Gang Liu
Srijit Seal
John Arevalo
Zhenwen Liang
Anne E Carpenter
Meng Jiang
Shantanu Singh
53
3
0
17 Jun 2024
ARTIST: Improving the Generation of Text-rich Images by Disentanglement
Jianyi Zhang
Yufan Zhou
Jiuxiang Gu
Curtis Wigington
Tong Yu
Yiran Chen
Tong Sun
Ruiyi Zhang
79
0
0
17 Jun 2024
Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models
Bingqi Ma
Zhuofan Zong
Guanglu Song
Hongsheng Li
Yu Liu
47
21
0
17 Jun 2024
Latent Denoising Diffusion GAN: Faster sampling, Higher image quality
Luan Thanh Trinh
T. Hamagami
DiffM
47
5
0
17 Jun 2024
Holistic-Motion2D: Scalable Whole-body Human Motion Generation in 2D Space
Yuan Wang
Zhao Wang
Junhao Gong
Di Huang
Tong He
...
J. Jiao
Xuetao Feng
Qi Dou
Shixiang Tang
Dan Xu
55
3
0
17 Jun 2024
SUGARCREPE++ Dataset: Vision-Language Model Sensitivity to Semantic and Lexical Alterations
Sri Harsha Dumpala
Aman Jaiswal
Chandramouli Shama Sastry
E. Milios
Sageev Oore
Hassan Sajjad
CoGe
47
10
0
17 Jun 2024
Adversarial Perturbations Cannot Reliably Protect Artists From Generative AI
Robert Honig
Javier Rando
Nicholas Carlini
Florian Tramèr
WIGM
AAML
65
17
0
17 Jun 2024
MFC-Bench: Benchmarking Multimodal Fact-Checking with Large Vision-Language Models
Shengkang Wang
Hongzhan Lin
Ziyang Luo
Zhen Ye
Guang Chen
Jing Ma
76
3
0
17 Jun 2024
Not All Prompts Are Made Equal: Prompt-based Pruning of Text-to-Image Diffusion Models
Alireza Ganjdanesh
Reza Shirkavand
Shangqian Gao
Heng Huang
DiffM
VLM
63
4
0
17 Jun 2024
Leveraging Foundation Models for Multi-modal Federated Learning with Incomplete Modality
Liwei Che
Jiaqi Wang
Xinyue Liu
Fenglong Ma
33
3
0
16 Jun 2024
AUTOHALLUSION: Automatic Generation of Hallucination Benchmarks for Vision-Language Models
Xiyang Wu
Tianrui Guan
Dianqi Li
Shuaiyi Huang
Xiaoyu Liu
...
Abhinav Shrivastava
Furong Huang
Jordan L. Boyd-Graber
Dinesh Manocha
HILM
LRM
VLM
MLLM
45
14
0
16 Jun 2024
Self-Supervised Vision Transformer for Enhanced Virtual Clothes Try-On
Lingxiao Lu
Shengyi Wu
Haoxuan Sun
Junhong Gou
Jianlou Si
Chen Qian
Jianfu Zhang
Liqing Zhang
ViT
DiffM
47
0
0
15 Jun 2024
Consistency-diversity-realism Pareto fronts of conditional image generative models
Pietro Astolfi
Marlene Careil
Melissa Hall
Oscar Manas
Matthew Muckley
Jakob Verbeek
Adriana Romero Soriano
M. Drozdzal
59
10
0
14 Jun 2024
Group and Shuffle: Efficient Structured Orthogonal Parametrization
Mikhail Gorbunov
Nikolay Yudin
Vera Soboleva
Aibek Alanov
Alexey Naumov
Maxim Rakhuba
52
1
0
14 Jun 2024
PID: Prompt-Independent Data Protection Against Latent Diffusion Models
Ang Li
Yichuan Mo
Mingjie Li
Yisen Wang
AAML
51
2
0
14 Jun 2024
Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models
Qihao Liu
Zhanpeng Zeng
Ju He
Qihang Yu
Xiaohui Shen
Liang-Chieh Chen
58
21
0
13 Jun 2024
Rethinking Score Distillation as a Bridge Between Image Distributions
David McAllister
Songwei Ge
Jia-Bin Huang
David W. Jacobs
Alexei A. Efros
Aleksander Hołyński
Angjoo Kanazawa
DiffM
66
14
0
13 Jun 2024
Real-Time Deepfake Detection in the Real-World
Bar Cavia
Eliahu Horwitz
Tal Reiss
Yedid Hoshen
74
6
0
13 Jun 2024
SimGen: Simulator-conditioned Driving Scene Generation
Yunsong Zhou
Michael Simon
Zhenghao Peng
Sicheng Mo
Hongzi Zhu
Minyi Guo
Bolei Zhou
VGen
58
11
0
13 Jun 2024
CMC-Bench: Towards a New Paradigm of Visual Signal Compression
Chunyi Li
Xiele Wu
H. Wu
Donghui Feng
Zicheng Zhang
Guo Lu
Xiongkuo Min
Xiaohong Liu
Guangtao Zhai
Weisi Lin
VLM
51
5
0
13 Jun 2024
Toffee: Efficient Million-Scale Dataset Construction for Subject-Driven Text-to-Image Generation
Yufan Zhou
Ruiyi Zhang
Kaizhi Zheng
Nanxuan Zhao
Jiuxiang Gu
Zichao Wang
Xin Eric Wang
Tong Sun
DiffM
35
2
0
13 Jun 2024
Neural Assets: 3D-Aware Multi-Object Scene Synthesis with Image Diffusion Models
Ziyi Wu
Yulia Rubanova
Rishabh Kabra
Drew A. Hudson
Igor Gilitschenski
Yusuf Aytar
Sjoerd van Steenkiste
Kelsey R. Allen
Thomas Kipf
VGen
DiffM
62
9
0
13 Jun 2024
Language-driven Grasp Detection
An Dinh Vuong
Minh Nhat Vu
Baoru Huang
Nghia Nguyen
Hieu Le
T. Vo
Anh Nguyen
VLM
51
19
0
13 Jun 2024
MirrorCheck: Efficient Adversarial Defense for Vision-Language Models
Samar Fares
Klea Ziu
Toluwani Aremu
Nikita Durasov
Martin Takáč
Pascal Fua
Karthik Nandakumar
Ivan Laptev
VLM
AAML
45
4
0
13 Jun 2024
EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts
Yucheng Han
Rui Wang
Chi Zhang
Juntao Hu
Pei Cheng
Bin-Bin Fu
Hanwang Zhang
77
6
0
13 Jun 2024
COVE: Unleashing the Diffusion Feature Correspondence for Consistent Video Editing
Jiangshan Wang
Yue Ma
Jiayi Guo
Yicheng Xiao
Gao Huang
Xiu Li
DiffM
49
19
0
13 Jun 2024
Generating Speakers by Prompting Listener Impressions for Pre-trained Multi-Speaker Text-to-Speech Systems
Zhengyang Chen
Xuechen Liu
Erica Cooper
Junichi Yamagishi
Yanmin Qian
59
1
0
13 Jun 2024
FouRA: Fourier Low Rank Adaptation
Shubhankar Borse
Shreya Kadambi
N. Pandey
Kartikeya Bhardwaj
Viswanath Ganapathy
Sweta Priyadarshi
Risheek Garrepalli
Rafael Esteves
Munawar Hayat
Fatih Porikli
47
7
0
13 Jun 2024
RL-JACK: Reinforcement Learning-powered Black-box Jailbreaking Attack against LLMs
Xuan Chen
Yuzhou Nie
Lu Yan
Yunshu Mao
Wenbo Guo
Xiangyu Zhang
36
7
0
13 Jun 2024
Batch-Instructed Gradient for Prompt Evolution: Systematic Prompt Optimization for Enhanced Text-to-Image Synthesis
Xinrui Yang
Zhuohan Wang
Anthony Hu
EGVM
64
0
0
13 Jun 2024
MMFakeBench: A Mixed-Source Multimodal Misinformation Detection Benchmark for LVLMs
Xuannan Liu
Zekun Li
Peipei Li
Shuhan Xia
Xing Cui
Linzhi Huang
Huaibo Huang
Weihong Deng
Zhaofeng He
66
19
0
13 Jun 2024
FakeInversion: Learning to Detect Images from Unseen Text-to-Image Models by Inverting Stable Diffusion
George Cazenavette
Avneesh Sud
Thomas Leung
Ben Usman
DiffM
39
14
0
12 Jun 2024
DiTFastAttn: Attention Compression for Diffusion Transformer Models
Zhihang Yuan
Pu Lu
Hanling Zhang
Xuefei Ning
Linfeng Zhang
Tianchen Zhao
Shengen Yan
Guohao Dai
Yu Wang
55
24
0
12 Jun 2024
What If We Recaption Billions of Web Images with LLaMA-3?
Xianhang Li
Haoqin Tu
Mude Hui
Zeyu Wang
Bingchen Zhao
...
Jieru Mei
Qing Liu
Huangjie Zheng
Yuyin Zhou
Cihang Xie
VLM
MLLM
51
36
0
12 Jun 2024
PAL: Pluralistic Alignment Framework for Learning from Heterogeneous Preferences
Daiwei Chen
Yi Chen
Aniket Rege
Ramya Korlakai Vinayak
61
18
0
12 Jun 2024
From a Social Cognitive Perspective: Context-aware Visual Social Relationship Recognition
Shiwei Wu
Chao Zhang
Joya Chen
Tong Xu
Likang Wu
Yao Hu
Enhong Chen
37
0
0
12 Jun 2024