ResearchTrend.AI
arXiv: 2204.06125
Hierarchical Text-Conditional Image Generation with CLIP Latents

13 April 2022
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
    VLM
    DiffM

Papers citing "Hierarchical Text-Conditional Image Generation with CLIP Latents"

50 / 4,756 papers shown
Ichigo: Mixed-Modal Early-Fusion Realtime Voice Assistant
Alan Dao
Dinh Bach Vu
Huy Hoang Ha
AuLLM
VLM
73
3
0
20 Oct 2024
Group Diffusion Transformers are Unsupervised Multitask Learners
Lianghua Huang
Wei Wang
Zhi-Fan Wu
Huanzhang Dou
Yupeng Shi
Yutong Feng
C. Liang
Yu Liu
Jingren Zhou
VLM
52
12
0
19 Oct 2024
"Confrontation or Acceptance": Understanding Novice Visual Artists' Perception towards AI-assisted Art Creation
Shuning Zhang
Shixuan Li
33
1
0
19 Oct 2024
Adversarial Score identity Distillation: Rapidly Surpassing the Teacher in One Step
Mingyuan Zhou
Huangjie Zheng
Yi Gu
Zhendong Wang
Hai Huang
DiffM
58
7
0
19 Oct 2024
Truncated Consistency Models
Sangyun Lee
Yilun Xu
Tomas Geffner
Giulia Fanti
Karsten Kreis
Arash Vahdat
Weili Nie
59
3
0
18 Oct 2024
Assistive AI for Augmenting Human Decision-making
Natabara Máté Gyöngyössy
Bernát Török
Csilla Farkas
Laura Lucaj
Attila Menyhárd
Krisztina Menyhárd-Balázs
András Simonyi
Patrick van der Smagt
Zsolt Ződi
András Lőrincz
41
0
0
18 Oct 2024
HiCo: Hierarchical Controllable Diffusion Model for Layout-to-image Generation
Bo Cheng
Yuhang Ma
Liebucha Wu
Shanyuan Liu
Ao Ma
Xiaoyu Wu
Dawei Leng
Yuhui Yin
DiffM
35
8
0
18 Oct 2024
ERDDCI: Exact Reversible Diffusion via Dual-Chain Inversion for High-Quality Image Editing
Jimin Dai
Wenjie Qu
Shuo Chen
Jian Yang
Lei Luo
DiffM
31
0
0
18 Oct 2024
Text-to-Image Representativity Fairness Evaluation Framework
Asma Z. Yamani
Malak Baslyman
26
0
0
18 Oct 2024
Skill Generalization with Verbs
Rachel Ma
Lyndon Lam
Benjamin A. Spiegel
Aditya Ganeshan
Roma Patel
Ben Abbatematteo
D. Paulius
Stefanie Tellex
George Konidaris
LM&Ro
72
2
0
18 Oct 2024
Assessing Open-world Forgetting in Generative Image Model Customization
Héctor Laria
Alex Gomez-Villa
Imad Eddine Marouf
Bogdan Raducanu
VLM
DiffM
42
0
0
18 Oct 2024
Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens
Lijie Fan
Tianhong Li
Siyang Qin
Yuanzhen Li
Chen Sun
Michael Rubinstein
Deqing Sun
Kaiming He
Yonglong Tian
VLM
DiffM
53
43
0
17 Oct 2024
PUMA: Empowering Unified MLLM with Multi-granular Visual Generation
Rongyao Fang
Chengqi Duan
Kun Wang
Hao Li
H. Tian
Xingyu Zeng
Rui Zhao
Jifeng Dai
Hongsheng Li
Xihui Liu
MLLM
41
11
0
17 Oct 2024
Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation
Chengyue Wu
Xiaokang Chen
Z. F. Wu
Yiyang Ma
Xingchao Liu
...
Wen Liu
Zhenda Xie
Xingkai Yu
Chong Ruan
Ping Luo
AI4TS
65
82
0
17 Oct 2024
Improved Convergence Rate for Diffusion Probabilistic Models
Gen Li
Yuchen Jiao
DiffM
49
6
0
17 Oct 2024
Diffusion Curriculum: Synthetic-to-Real Generative Curriculum Learning via Image-Guided Diffusion
Yijun Liang
Shweta Bhardwaj
Dinesh Manocha
49
0
0
17 Oct 2024
VL-GLUE: A Suite of Fundamental yet Challenging Visuo-Linguistic Reasoning Tasks
Shailaja Keyur Sampat
Mutsumi Nakamura
Shankar Kailas
Kartik Aggarwal
Mandy Zhou
Yezhou Yang
Chitta Baral
MLLM
CoGe
ReLM
VLM
LRM
37
0
0
17 Oct 2024
DriveDreamer4D: World Models Are Effective Data Machines for 4D Driving Scene Representation
Guosheng Zhao
Chaojun Ni
Xiaofeng Wang
Zheng Zhu
Xinming Zhang
...
Xinze Chen
Boyuan Wang
Youyi Zhang
Wenjun Mei
Xingang Wang
VGen
83
24
0
17 Oct 2024
GraspDiffusion: Synthesizing Realistic Whole-body Hand-Object Interaction
Patrick Kwon
Hanbyul Joo
36
3
0
17 Oct 2024
Diffusing States and Matching Scores: A New Framework for Imitation Learning
Runzhe Wu
Yiding Chen
Gokul Swamy
Kianté Brantley
Wen Sun
DiffM
53
3
0
17 Oct 2024
MagicTailor: Component-Controllable Personalization in Text-to-Image Diffusion Models
Donghao Zhou
Jiancheng Huang
J. Bai
Jiaze Wang
Hao Chen
Guangyong Chen
Xiaowei Hu
Pheng Ann Heng
50
5
0
17 Oct 2024
DreamCraft3D++: Efficient Hierarchical 3D Generation with Multi-Plane Reconstruction Model
Jingxiang Sun
Cheng Peng
Ruizhi Shao
Y. Guo
Xiaochen Zhao
Yangguang Li
Yanpei Cao
Bo Zhang
Yebin Liu
50
2
0
16 Oct 2024
Embedding an Ethical Mind: Aligning Text-to-Image Synthesis via Lightweight Value Optimization
Xingqi Wang
Xiaoyuan Yi
Xing Xie
Jia Jia
26
1
0
16 Oct 2024
Constrained Posterior Sampling: Time Series Generation with Hard Constraints
Sai Shankar Narasimhan
Shubhankar Agarwal
Litu Rout
Sanjay Shakkottai
Sandeep Chinchali
DiffM
AI4TS
38
0
0
16 Oct 2024
SDI-Paste: Synthetic Dynamic Instance Copy-Paste for Video Instance Segmentation
Sahir Shrestha
Weihao Li
Gao Zhu
Nick Barnes
DiffM
38
0
0
16 Oct 2024
Imagine2Servo: Intelligent Visual Servoing with Diffusion-Driven Goal Generation for Robotic Tasks
Pranjali Pathre
Gunjan Gupta
M. N. Qureshi
Mandyam Brunda
Samarth Brahmbhatt
K. M. Krishna
VGen
34
0
0
16 Oct 2024
FaceChain-FACT: Face Adapter with Decoupled Training for Identity-preserved Personalization
Cheng Yu
Haoyu Xie
Lei Shang
Yong-Jin Liu
Jun Dan
Liefeng Bo
Baigui Sun
28
2
0
16 Oct 2024
TV-3DG: Mastering Text-to-3D Customized Generation with Visual Prompt
Jiahui Yang
Donglin Di
Baorui Ma
Xun Yang
Yongjia Ma
...
Wei Chen
Jianxun Cui
Zhou Xue
Meng Wang
Yebin Liu
DiffM
53
1
0
16 Oct 2024
TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration
Yiwei Guo
Shaobin Zhuang
Kunchang Li
Yu Qiao
Yali Wang
VLM
CLIP
45
0
0
16 Oct 2024
Flex: End-to-End Text-Instructed Visual Navigation from Foundation Model Features
Makram Chahine
Alex Quach
Alaa Maalouf
Tsun-Hsuan Wang
Daniela Rus
36
0
0
16 Oct 2024
SF-Speech: Straightened Flow for Zero-Shot Voice Clone
Xuyuan Li
Zengqiang Shang
Hua Hua
Peiyang Shi
Chen Yang
Li Wang
Pengyuan Zhang
63
2
0
16 Oct 2024
KITTEN: A Knowledge-Intensive Evaluation of Image Generation on Visual Entities
Hsin-Ping Huang
Xinyu Wang
Yonatan Bitton
Hagai Taitelbaum
Gaurav Singh Tomar
...
Xuhui Jia
Kelvin Chan
Hexiang Hu
Yu-Chuan Su
Ming-Hsuan Yang
EGVM
75
4
0
15 Oct 2024
Simultaneous Diffusion Sampling for Conditional LiDAR Generation
Ryan Faulkner
Luke Haub
Simon Ratcliffe
Anh-Dzung Doan
Ian Reid
Tat-Jun Chin
35
0
0
15 Oct 2024
Enhancing Unimodal Latent Representations in Multimodal VAEs through Iterative Amortized Inference
Yuta Oshima
Masahiro Suzuki
Y. Matsuo
38
0
0
15 Oct 2024
DRACO: A Denoising-Reconstruction Autoencoder for Cryo-EM
Yingjun Shen
Haizhao Dai
Qihe Chen
Yan Zeng
Jiakai Zhang
Yuan Pei
Jingyi Yu
26
0
0
15 Oct 2024
DIAR: Diffusion-model-guided Implicit Q-learning with Adaptive Revaluation
Jaehyun Park
Yunho Kim
Sejin Kim
Byung-Jun Lee
Sundong Kim
OffRL
39
1
0
15 Oct 2024
Diffusion-Based Offline RL for Improved Decision-Making in Augmented ARC Task
Yunho Kim
Jaehyun Park
Heejun Kim
Sejin Kim
Byung-Jun Lee
Sundong Kim
OffRL
40
1
0
15 Oct 2024
Learning Diffusion Model from Noisy Measurement using Principled Expectation-Maximization Method
Weimin Bai
Weiheng Tang
E. Ye
Siyi Chen
Wenzheng Chen
H. Sun
DiffM
23
1
0
15 Oct 2024
A Simple Approach to Unifying Diffusion-based Conditional Generation
Xirui Li
Charles Herrmann
Kelvin C.K. Chan
Yinxiao Li
Deqing Sun
Chao Ma
Ming-Hsuan Yang
DiffM
VLM
51
1
0
15 Oct 2024
Ctrl-U: Robust Conditional Image Generation via Uncertainty-aware Reward Modeling
Guiyu Zhang
Huan-ang Gao
Zijian Jiang
Hao Zhao
Zhedong Zheng
EGVM
57
6
0
15 Oct 2024
Improving Long-Text Alignment for Text-to-Image Diffusion Models
Luping Liu
Chao Du
Tianyu Pang
Zehan Wang
Chongxuan Li
Dong Xu
VLM
55
5
0
15 Oct 2024
Enhancing Robustness in Deep Reinforcement Learning: A Lyapunov Exponent Approach
Rory Young
Nicolas Pugeault
AAML
65
0
0
14 Oct 2024
Saliency Guided Optimization of Diffusion Latents
Xiwen Wang
Jizhe Zhou
Xuekang Zhu
Cheng Li
Mao Li
EGVM
28
0
0
14 Oct 2024
MagicEraser: Erasing Any Objects via Semantics-Aware Control
Fan Li
Zixiao Zhang
Yi Huang
Jianzhuang Liu
Renjing Pei
Bin Shao
Songcen Xu
DiffM
44
7
0
14 Oct 2024
TextCtrl: Diffusion-based Scene Text Editing with Prior Guidance Control
Weichao Zeng
Yan Shu
Zhenhang Li
Dongbao Yang
Yu Zhou
DiffM
29
7
0
14 Oct 2024
Learning to Customize Text-to-Image Diffusion In Diverse Context
Taewook Kim
Wei Chen
Qiang Qiu
DiffM
43
2
0
14 Oct 2024
EBDM: Exemplar-guided Image Translation with Brownian-bridge Diffusion Models
Eungbean Lee
Somi Jeong
Kwanghoon Sohn
DiffM
35
1
0
13 Oct 2024
Generating Intermediate Representations for Compositional Text-To-Image Generation
Ran Galun
Sagie Benaim
25
0
0
13 Oct 2024
Large Model for Small Data: Foundation Model for Cross-Modal RF Human Activity Recognition
Yuxuan Weng
Guoquan Wu
Tianyue Zheng
Yanbing Yang
Jun Luo
40
5
0
13 Oct 2024
Provable Convergence and Limitations of Geometric Tempering for Langevin Dynamics
Omar Chehab
Anna Korba
Austin Stromme
Adrien Vacher
40
3
0
13 Oct 2024