ResearchTrend.AI
TDRI: Two-Phase Dialogue Refinement and Co-Adaptation for Interactive Image Generation

22 March 2025
Yuheng Feng
Jianhui Wang
Kun Li
Sida Li
Tianyu Shi
Haoyue Han
Miao Zhang
Xueqian Wang
    DiffM

Papers citing "TDRI: Two-Phase Dialogue Refinement and Co-Adaptation for Interactive Image Generation"

50 / 66 papers shown
Title
Parameter-Efficient Fine-Tuning for Pre-Trained Vision Models: A Survey and Benchmark
Yi Xin
Jianjiang Yang
Haodi Zhou
Junlong Du
Qi Qin
...
Bin Fu
Xiaokang Yang
Guangtao Zhai
Ming-Hsuan Yang
Xiaohong Liu
VLM
174
86
0
01 Jul 2025
Optimized Path Planning for Logistics Robots Using Ant Colony Algorithm under Multiple Constraints
Haopeng Zhao
Zhichao Ma
Lipeng Liu
Yang Wang
Zheyu Zhang
Hao Liu
69
10
0
06 Apr 2025
PromptLNet: Region-Adaptive Aesthetic Enhancement via Prompt Guidance in Low-Light Enhancement Net
Jun Yin
Yangfan He
Miao Zhang
Pengyu Zeng
Tianyi Wang
Shuai Lu
Xueqian Wang
DiffM
145
7
0
11 Mar 2025
A Cascading Cooperative Multi-agent Framework for On-ramp Merging Control Integrating Large Language Models
Miao Zhang
Zhenlong Fang
Tianyi Wang
Qin Zhang
Shuai Lu
Junfeng Jiao
Tianyu Shi
AI4CE
120
5
0
11 Mar 2025
TSCnet: A Text-driven Semantic-level Controllable Framework for Customized Low-Light Image Enhancement
Miao Zhang
Jun Yin
Pengyu Zeng
Yiqing Shen
Shuai Lu
Xueqian Wang
DiffM
167
14
0
11 Mar 2025
Research on Enhancing Cloud Computing Network Security using Artificial Intelligence Algorithms
Yuqing Wang
Xiao Yang
75
5
0
25 Feb 2025
Design and implementation of a distributed security threat detection system integrating federated learning and multimodal LLM
Yuqing Wang
Xiao Yang
110
5
0
25 Feb 2025
Zero-Shot End-to-End Relation Extraction in Chinese: A Comparative Study of Gemini, LLaMA and ChatGPT
Shaoshuai Du
Yiyi Tao
Yixian Shen
Hang Zhang
Yanxin Shen
Xinyu Qiu
Chuanqi Shi
118
7
0
08 Feb 2025
AltGen: AI-Driven Alt Text Generation for Enhancing EPUB Accessibility
Yixian Shen
Hang Zhang
Yanxin Shen
Lun Wang
Chuanqi Shi
Shaoshuai Du
Yiyi Tao
103
8
0
03 Jan 2025
OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning
Ling Fu
Biao Yang
Zhebin Kuang
Jiajun Song
Yuzhe Li
...
Jingqun Tang
Wei Chen
Lianwen Jin
Yunxing Liu
Xiang Bai
115
22
0
31 Dec 2024
Enhancing Code LLMs with Reinforcement Learning in Code Generation: A Survey
Junqiao Wang
Zeng Zhang
Yangfan He
Yuyang Song
Tianyu Shi
...
Tang Jingqun
Guangwu Qian
Keqin Li
Qiuwu Chen
Lewei He
144
22
0
29 Dec 2024
EasyTime: Time Series Forecasting Made Easy
Xiangfei Qiu
Xiuwen Li
Ruiyang Pang
Zhicheng Pan
Xiaojun Wu
...
Chengcheng Yang
Chenjuan Guo
Aoying Zhou
Christian S. Jensen
Bin Yang
AI4TS
127
16
0
23 Dec 2024
Robustness of Large Language Models Against Adversarial Attacks
Yiyi Tao
Yixian Shen
Hang Zhang
Yanxin Shen
Lun Wang
Chuanqi Shi
Shaoshuai Du
AAML
110
9
0
22 Dec 2024
Attentive Eraser: Unleashing Diffusion Model's Object Removal Potential via Self-Attention Redirection Guidance
Wenhao Sun
Benlei Cui
Xue-Mei Dong
Jingqun Tang
DiffM
216
14
0
17 Dec 2024
MCTBench: Multimodal Cognition towards Text-Rich Visual Scenes Benchmark
Bin Shan
Xiang Fei
Wei Shi
An-Lan Wang
Guozhi Tang
Lei Liao
Jingqun Tang
Xiang Bai
Can Huang
VLM
89
7
0
15 Oct 2024
ParGo: Bridging Vision-Language with Partial and Global Views
An-Lan Wang
Bin Shan
Wei Shi
Kun-Yu Lin
Xiang Fei
Guozhi Tang
Lei Liao
Jingqun Tang
Can Huang
Wei-Shi Zheng
MLLM, VLM
178
17
0
23 Aug 2024
Harmonizing Visual Text Comprehension and Generation
Zhen Zhao
Jingqun Tang
Binghong Wu
Chunhui Lin
Shubo Wei
Hao Liu
Xin Tan
Zhizhong Zhang
Can Huang
Yuan Xie
VLM
105
26
0
23 Jul 2024
Autonomous Navigation of Unmanned Vehicle Through Deep Reinforcement Learning
Letian Xu
Jiabei Liu
Haopeng Zhao
Tianyao Zheng
Tongzhou Jiang
Lipeng Liu
96
14
0
18 Jul 2024
A Bounding Box is Worth One Token: Interleaving Layout and Text in a Large Language Model for Document Understanding
Jinghui Lu
Haiyang Yu
Yanjie Wang
Yongjie Ye
Jingqun Tang
...
Qi Liu
Hao Feng
Han Wang
Hao Liu
Can Huang
171
23
0
02 Jul 2024
TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy
Weichao Zhao
Hao Feng
Qi Liu
Jingqun Tang
Shubo Wei
...
Lei Liao
Yongjie Ye
Hao Liu
Houqiang Li
Can Huang
LMTD
96
24
0
03 Jun 2024
Comet: A Communication-efficient and Performant Approximation for Private Transformer Inference
Xiangrui Xu
Qiao Zhang
R. Ning
Chunsheng Xin
Hongyi Wu
78
6
0
24 May 2024
Towards Understanding the Working Mechanism of Text-to-Image Diffusion Model
Mingyang Yi
Aoxue Li
Yi Xin
Zhenguo Li
DiffM
130
13
0
24 May 2024
MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering
Jingqun Tang
Qi-dong Liu
Yongjie Ye
Jinghui Lu
Shubo Wei
...
Hao Liu
Xiang Bai
Can Huang
183
28
0
20 May 2024
DDPM-MoCo: Advancing Industrial Surface Defect Generation and Detection with Generative and Contrastive Learning
Yangfan He
Xinyan Wang
Tianyu Shi
103
6
0
09 May 2024
TextSquare: Scaling up Text-Centric Visual Instruction Tuning
Jingqun Tang
Chunhui Lin
Zhen Zhao
Shubo Wei
Binghong Wu
...
Yuliang Liu
Xiang Bai
Can Huang
LRM, VLM, MLLM
182
30
0
19 Apr 2024
WcDT: World-centric Diffusion Transformer for Traffic Scene Generation
Chen Yang
Aaron Xuxiang Tian
Dong Chen
Tianyu Shi
Arsalan Heydarian
Pei Liu
132
10
0
02 Apr 2024
CMAT: A Multi-Agent Collaboration Tuning Framework for Enhancing Small Language Models
Xuechen Liang
Meiling Tao
Yinghui Xia
Yiting Xie
Jun Wang
JingSong Yang
LLMAG
166
14
0
02 Apr 2024
TFB: Towards Comprehensive and Fair Benchmarking of Time Series Forecasting Methods
Xiangfei Qiu
Jilin Hu
Lekui Zhou
Xingjian Wu
Junyang Du
...
Chenjuan Guo
Aoying Zhou
Christian S. Jensen
Zhenli Sheng
Bin Yang
AI4TS
139
86
0
29 Mar 2024
GAgent: An Adaptive Rigid-Soft Gripping Agent with Vision Language Models for Complex Lighting Environments
Zhuowei Li
Miao Zhang
Xiaotian Lin
Meng Yin
Shuai Lu
Xueqian Wang
95
6
0
16 Mar 2024
CogView3: Finer and Faster Text-to-Image Generation via Relay Diffusion
Wendi Zheng
Jiayan Teng
Zhuoyi Yang
Weihan Wang
Jidong Chen
Xiaotao Gu
Yuxiao Dong
Ming Ding
Jie Tang
DiffM
101
41
0
08 Mar 2024
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
Patrick Esser
Sumith Kulal
A. Blattmann
Rahim Entezari
Jonas Muller
...
Zion English
Kyle Lacey
Alex Goodwin
Yannik Marek
Robin Rombach
DiffM
321
1,410
0
05 Mar 2024
Panacea: Pareto Alignment via Preference Adaptation for LLMs
Yifan Zhong
Chengdong Ma
Xiaoyuan Zhang
Ziran Yang
Haojun Chen
Qingfu Zhang
Siyuan Qi
Yaodong Yang
130
38
0
03 Feb 2024
Rich Human Feedback for Text-to-Image Generation
Youwei Liang
Junfeng He
Gang Li
Peizhao Li
Arseniy Klimovskiy
...
Yiwen Luo
Yang Li
Kai Kohlhoff
Deepak Ramachandran
Vidhya Navalpakkam
EGVM
83
86
0
15 Dec 2023
VMT-Adapter: Parameter-Efficient Transfer Learning for Multi-Task Dense Scene Understanding
Yi Xin
Junlong Du
Qiang Wang
Zhiwen Lin
Ke Yan
VPVLM
160
54
0
14 Dec 2023
MmAP : Multi-modal Alignment Prompt for Cross-domain Multi-task Learning
Yi Xin
Junlong Du
Qiang Wang
Ke Yan
Shouhong Ding
VLM
100
52
0
14 Dec 2023
Multi-modal In-Context Learning Makes an Ego-evolving Scene Text Recognizer
Zhen Zhao
Jingqun Tang
Chunhui Lin
Binghong Wu
Can Huang
Hao Liu
Xin Tan
Zhizhong Zhang
Yuan Xie
104
25
0
22 Nov 2023
DocPedia: Unleashing the Power of Large Multimodal Model in the Frequency Domain for Versatile Document Understanding
Hao Feng
Qi Liu
Hao Liu
Wen-gang Zhou
Houqiang Li
Can Huang
VLM
115
67
0
20 Nov 2023
Holistic Evaluation of Text-To-Image Models
Tony Lee
Michihiro Yasunaga
Chenlin Meng
Yifan Mai
Joon Sung Park
...
Jun-Yan Zhu
Fei-Fei Li
Jiajun Wu
Stefano Ermon
Percy Liang
236
139
0
07 Nov 2023
Text-to-Image Generation for Abstract Concepts
Jiayi Liao
Xu Chen
Qiang Fu
Lun Du
Xiangnan He
Xiang Wang
Shi Han
Dongmei Zhang
107
16
0
26 Sep 2023
ITI-GEN: Inclusive Text-to-Image Generation
Cheng Zhang
Xuanbai Chen
Siqi Chai
Chen Henry Wu
Dmitry Lagun
Thabo Beeler
Fernando de la Torre
VLM
122
58
0
11 Sep 2023
Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond
Jinze Bai
Shuai Bai
Shusheng Yang
Shijie Wang
Sinan Tan
Peng Wang
Junyang Lin
Chang Zhou
Jingren Zhou
MLLM, VLM, ObjD
189
945
0
24 Aug 2023
UniDoc: A Universal Large Multimodal Model for Simultaneous Text Detection, Recognition, Spotting and Understanding
Hao Feng
Zijian Wang
Jingqun Tang
Jinghui Lu
Wen-gang Zhou
Houqiang Li
Can Huang
MLLM, VLM
136
51
0
19 Aug 2023
Masked-Attention Diffusion Guidance for Spatially Controlling Text-to-Image Generation
Yuki Endo
68
8
0
11 Aug 2023
LayoutLLM-T2I: Eliciting Layout Guidance from LLM for Text-to-Image Generation
Leigang Qu
Shengqiong Wu
Hao Fei
Liqiang Nie
Tat-Seng Chua
LM&Ro, DiffM, MLLM
143
100
0
09 Aug 2023
Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron
Louis Martin
Kevin R. Stone
Peter Albert
Amjad Almahairi
...
Sharan Narang
Aurelien Rodriguez
Robert Stojnic
Sergey Edunov
Thomas Scialom
AI4MH, ALM
495
12,124
0
18 Jul 2023
Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis
Xiaoshi Wu
Yiming Hao
Keqiang Sun
Yixiong Chen
Feng Zhu
Rui Zhao
Hongsheng Li
138
316
0
15 Jun 2023
Visual Programming for Text-to-Image Generation and Evaluation
Jaemin Cho
Abhaysinh Zala
Joey Tianyi Zhou
MLLM
117
51
0
24 May 2023
SCRNet: a Retinex Structure-based Low-light Enhancement Model Guided by Spatial Consistency
Miaohui Zhang
Yiqing Shen
Shenghui Zhong
89
8
0
14 May 2023
Pick-a-Pic: An Open Dataset of User Preferences for Text-to-Image Generation
Yuval Kirstain
Adam Polyak
Uriel Singer
Shahbuland Matiana
Joe Penna
Omer Levy
EGVM
233
420
0
02 May 2023
ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation
Jiazheng Xu
Xiao Liu
Yuchen Wu
Yuxuan Tong
Qinkai Li
Ming Ding
Jie Tang
Yuxiao Dong
159
413
0
12 Apr 2023