2209.14491
Cited By
Re-Imagen: Retrieval-Augmented Text-to-Image Generator
29 September 2022
Wenhu Chen
Hexiang Hu
Chitwan Saharia
William W. Cohen
VLM
Papers citing
"Re-Imagen: Retrieval-Augmented Text-to-Image Generator"
50 / 141 papers shown
Title
Fashion-RAG: Multimodal Fashion Image Editing via Retrieval-Augmented Generation
Fulvio Sanguigni
Davide Morelli
Marcella Cornia
Rita Cucchiara
DiffM
35
0
0
18 Apr 2025
Personalized Text-to-Image Generation with Auto-Regressive Models
Kaiyue Sun
Xian Liu
Yao Teng
Xihui Liu
33
0
0
17 Apr 2025
ICAS: IP Adapter and ControlNet-based Attention Structure for Multi-Subject Style Transfer Optimization
Fuwei Liu
DiffM
36
0
0
17 Apr 2025
VDocRAG: Retrieval-Augmented Generation over Visually-Rich Documents
Ryota Tanaka
Taichi Iki
Taku Hasegawa
Kyosuke Nishida
Kuniko Saito
Jun Suzuki
VLM
47
0
0
14 Apr 2025
Flux Already Knows -- Activating Subject-Driven Image Generation without Training
Hao Kang
Stathi Fotiadis
Liming Jiang
Qing Yan
Yumin Jia
Zichuan Liu
Min Jin Chong
Xin Lu
35
0
0
12 Apr 2025
Leveraging LLMs for Multimodal Retrieval-Augmented Radiology Report Generation via Key Phrase Extraction
Kyoyun Choi
Byungmu Yoon
Soobum Kim
Jonggwon Park
33
0
0
10 Apr 2025
Transfer between Modalities with MetaQueries
Xichen Pan
Satya Narayan Shukla
Aashu Singh
Zhuokai Zhao
Shlok Kumar Mishra
...
Jiuhai Chen
Kunpeng Li
F. Xu
Ji Hou
Saining Xie
DiffM
41
6
0
08 Apr 2025
Multi-party Collaborative Attention Control for Image Customization
Han Yang
Chuanguang Yang
Qiuli Wang
Zhulin An
Weilun Feng
Libo Huang
Y. Xu
DiffM
30
0
0
02 Apr 2025
Less-to-More Generalization: Unlocking More Controllability by In-Context Generation
Shaojin Wu
Mengqi Huang
Wenxu Wu
Yufeng Cheng
Fei Ding
Qian He
DiffM
50
4
0
02 Apr 2025
TF-TI2I: Training-Free Text-and-Image-to-Image Generation via Multi-Modal Implicit-Context Learning in Text-to-Image Models
Teng-Fang Hsiao
Bo-Kai Ruan
Yi-Lun Wu
Tzu-Ling Lin
Hong-Han Shuai
VLM
48
0
0
19 Mar 2025
MES-RAG: Bringing Multi-modal, Entity-Storage, and Secure Enhancements to RAG
Pingyu Wu
Daiheng Gao
Jing Tang
Huimin Chen
Wenbo Zhou
W. Zhang
Nenghai Yu
42
0
0
17 Mar 2025
Reflect-DiT: Inference-Time Scaling for Text-to-Image Diffusion Transformers via In-Context Reflection
Shufan Li
Konstantinos Kallidromitis
Akash Gokul
Arsh Koneru
Yusuke Kato
Kazuki Kozuka
Aditya Grover
VLM
58
1
0
15 Mar 2025
OmniMamba: Efficient and Unified Multimodal Understanding and Generation via State Space Models
Jialv Zou
Bencheng Liao
Qian Zhang
Wenyu Liu
Xinggang Wang
Mamba
MLLM
82
1
0
11 Mar 2025
Maximizing Signal in Human-Model Preference Alignment
Kelsey Kraus
Margaret Kroll
ALM
50
0
0
06 Mar 2025
WeGen: A Unified Model for Interactive Multimodal Generation as We Chat
Zhipeng Huang
Shaobin Zhuang
Canmiao Fu
Binxin Yang
Ying Zhang
Chong Sun
Zhizheng Zhang
Yali Wang
Chen Li
Zheng-Jun Zha
DiffM
69
1
0
03 Mar 2025
RANGE: Retrieval Augmented Neural Fields for Multi-Resolution Geo-Embeddings
A. Dhakal
S. Sastry
Subash Khanal
Adeel Ahmad
Eric Xing
Nathan Jacobs
50
0
0
27 Feb 2025
Ask in Any Modality: A Comprehensive Survey on Multimodal Retrieval-Augmented Generation
Mohammad Mahdi Abootorabi
Amirhosein Zobeiri
Mahdi Dehghani
Mohammadali Mohammadkhani
Bardia Mohammadi
Omid Ghahroodi
M. Baghshah
Ehsaneddin Asgari
RALM
98
4
0
12 Feb 2025
Mass-Editing Memory with Attention in Transformers: A cross-lingual exploration of knowledge
Daniel Tamayo
Aitor Gonzalez-Agirre
Javier Hernando
Marta Villegas
KELM
85
3
0
04 Feb 2025
RealCustom++: Representing Images as Real-Word for Real-Time Customization
Zhendong Mao
Mengqi Huang
Fei Ding
Mingcong Liu
Qian He
Xiaojun Chang
DiffM
72
6
0
03 Jan 2025
RA-SGG: Retrieval-Augmented Scene Graph Generation Framework via Multi-Prototype Learning
Kanghoon Yoon
Kibum Kim
Jaehyung Jeon
Yeonjun In
Donghyun Kim
Chanyoung Park
69
1
0
17 Dec 2024
Accelerating Retrieval-Augmented Generation
Derrick Quinn
Mohammad Nouri
Neel Patel
John Salihu
Alireza Salemi
Sukhan Lee
Hamed Zamani
Mohammad Alian
RALM
3DV
85
3
0
14 Dec 2024
UniReal: Universal Image Generation and Editing via Learning Real-world Dynamics
Xi Chen
Zhifei Zhang
He Zhang
Yuqian Zhou
S. Kim
...
Nanxuan Zhao
Yilin Wang
Hui Ding
Zhe Lin
Hengshuang Zhao
VGen
DiffM
121
21
0
10 Dec 2024
[MASK] is All You Need
Vincent Tao Hu
Bjorn Ommer
DiffM
135
2
0
09 Dec 2024
Uniform Attention Maps: Boosting Image Fidelity in Reconstruction and Editing
Wenyi Mo
Tianyu Zhang
Yalong Bai
Bing-Huang Su
Ji-Rong Wen
DiffM
71
0
0
29 Nov 2024
Finding NeMo: Negative-mined Mosaic Augmentation for Referring Image Segmentation
Seongsu Ha
Chaeyun Kim
Donghwa Kim
Junho Lee
Sangho Lee
Joonseok Lee
45
2
0
03 Nov 2024
Human-inspired Perspectives: A Survey on AI Long-term Memory
Zihong He
Weizhe Lin
Hao Zheng
Fan Zhang
Matt Jones
Laurence Aitchison
X. Xu
Miao Liu
Per Ola Kristensson
Junxiao Shen
77
2
0
01 Nov 2024
Offline Evaluation of Set-Based Text-to-Image Generation
Negar Arabzadeh
Fernando Diaz
Junfeng He
EGVM
27
0
0
22 Oct 2024
Retrieval Augmented Diffusion Model for Structure-informed Antibody Design and Optimization
Zichen Wang
Yaokun Ji
Jianing Tian
Shuangjia Zheng
DiffM
30
0
0
19 Oct 2024
KITTEN: A Knowledge-Intensive Evaluation of Image Generation on Visual Entities
Hsin-Ping Huang
X. Wang
Yonatan Bitton
Hagai Taitelbaum
Gaurav Singh Tomar
...
Xuhui Jia
Kelvin Chan
Hexiang Hu
Yu-Chuan Su
Ming Yang
EGVM
64
4
0
15 Oct 2024
Data Extrapolation for Text-to-image Generation on Small Datasets
Senmao Ye
Fei Liu
28
0
0
02 Oct 2024
Evaluating Image Hallucination in Text-to-Image Generation with Question-Answering
Youngsun Lim
Hojun Choi
Hyunjung Shim
HILM
EGVM
MLLM
35
0
0
19 Sep 2024
MoRAG -- Multi-Fusion Retrieval Augmented Generation for Human Motion
Kalakonda Sai Shashank
Shubh Maheshwari
Ravi Kiran Sarvadevabhatla
VGen
DiffM
22
1
0
18 Sep 2024
OmniGen: Unified Image Generation
Shitao Xiao
Yueze Wang
Junjie Zhou
Huaying Yuan
Xingrun Xing
Ruiran Yan
Shuting Wang
Tiejun Huang
Zheng Liu
DiffM
VLM
SyDa
50
62
0
17 Sep 2024
Exploring Foundation Models for Synthetic Medical Imaging: A Study on Chest X-Rays and Fine-Tuning Techniques
Davide Clode da Silva
Marina Musse Bernardes
Nathalia Giacomini Ceretta
Gabriel Vaz de Souza
Gabriel Fonseca Silva
Rafael Heitor Bordini
S. Musse
MedIm
LM&MA
23
0
0
06 Sep 2024
Alleviating Hallucination in Large Vision-Language Models with Active Retrieval Augmentation
Cephas Mpungu
Qiyuan Chen
Wei Wei
Jiashuo Sun
G. Mapp
VLM
RALM
LRM
22
16
0
01 Aug 2024
Retrieval-Enhanced Machine Learning: Synthesis and Opportunities
To Eun Kim
Alireza Salemi
Andrew Drozdov
Fernando Diaz
Hamed Zamani
48
7
0
17 Jul 2024
Subject-driven Text-to-Image Generation via Preference-based Reinforcement Learning
Yanting Miao
William Loh
Suraj Kothawade
Pascal Poupart
Abdullah Rashwan
Yeqing Li
EGVM
47
1
0
16 Jul 2024
Addressing Image Hallucination in Text-to-Image Generation through Factual Image Retrieval
Youngsun Lim
Hyunjung Shim
DiffM
HILM
MQ
35
3
0
15 Jul 2024
VIMI: Grounding Video Generation through Multi-modal Instruction
Yuwei Fang
Willi Menapace
Aliaksandr Siarohin
Tsai-Shien Chen
Kuan-Chien Wang
Ivan Skorokhodov
Graham Neubig
Sergey Tulyakov
VGen
58
2
0
08 Jul 2024
MIGC++: Advanced Multi-Instance Generation Controller for Image Synthesis
Dewei Zhou
Y. Li
Fan Ma
Zongxin Yang
Y. Yang
91
11
0
02 Jul 2024
RAVEN: Multitask Retrieval Augmented Vision-Language Learning
Varun Nagaraj Rao
Siddharth Choudhary
Aditya Deshpande
R. Satzoda
Srikar Appalaraju
RALM
VLM
50
4
0
27 Jun 2024
Light Up the Shadows: Enhance Long-Tailed Entity Grounding with Concept-Guided Vision-Language Models
Yikai Zhang
Qianyu He
Xintao Wang
Siyu Yuan
Jiaqing Liang
Yanghua Xiao
VLM
29
0
0
16 Jun 2024
Consistency-diversity-realism Pareto fronts of conditional image generative models
Pietro Astolfi
Marlene Careil
Melissa Hall
Oscar Manas
Matthew Muckley
Jakob Verbeek
Adriana Romero Soriano
M. Drozdzal
51
10
0
14 Jun 2024
ControlVAR: Exploring Controllable Visual Autoregressive Modeling
Xiang Li
Kai Qiu
Hao Chen
Jason Kuen
Zhe-nan Lin
Rita Singh
Bhiksha Raj
DiffM
43
21
0
14 Jun 2024
Toffee: Efficient Million-Scale Dataset Construction for Subject-Driven Text-to-Image Generation
Yufan Zhou
Ruiyi Zhang
Kaizhi Zheng
Nanxuan Zhao
Jiuxiang Gu
Zichao Wang
Xin Eric Wang
Tong Sun
DiffM
29
2
0
13 Jun 2024
Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos
Changan Chen
Puyuan Peng
Ami Baid
Zihui Xue
Wei-Ning Hsu
David F. Harwath
Kristen Grauman
VGen
37
7
0
13 Jun 2024
ClassDiffusion: More Aligned Personalization Tuning with Explicit Class Guidance
Jiannan Huang
Jun Hao Liew
Hanshu Yan
Yuyang Yin
Yao Zhao
Yunchao Wei
DiffM
90
6
0
27 May 2024
Towards Retrieval-Augmented Architectures for Image Captioning
Sara Sarto
Marcella Cornia
Lorenzo Baraldi
Alessandro Nicolosi
Rita Cucchiara
VLM
19
9
0
21 May 2024
Compositional Text-to-Image Generation with Dense Blob Representations
Weili Nie
Sifei Liu
Morteza Mardani
Chao Liu
Benjamin Eckart
Arash Vahdat
DiffM
78
17
0
14 May 2024
Training-free Subject-Enhanced Attention Guidance for Compositional Text-to-image Generation
Shengyuan Liu
Bo Wang
Ye Ma
Te Yang
Xipeng Cao
Quan Chen
Han Li
Di Dong
Peng Jiang
EGVM
36
2
0
11 May 2024