ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2501.13926
  4. Cited By

Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step

23 January 2025
Ziyu Guo
Renrui Zhang
Chengzhuo Tong
Zhizheng Zhao
Peng Gao
Hongsheng Li
Pheng-Ann Heng
    MoE
    LRM
ArXivPDFHTML

Papers citing "Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step"

25 / 25 papers shown
Title
R2I-Bench: Benchmarking Reasoning-Driven Text-to-Image Generation
R2I-Bench: Benchmarking Reasoning-Driven Text-to-Image Generation
Kaijie Chen
Zihao Lin
Zhiyang Xu
Ying Shen
Yuguang Yao
Joy Rimchala
Jiaxin Zhang
Lifu Huang
EGVM
LRM
39
0
0
29 May 2025
VF-Eval: Evaluating Multimodal LLMs for Generating Feedback on AIGC Videos
VF-Eval: Evaluating Multimodal LLMs for Generating Feedback on AIGC Videos
Tingyu Song
Tongyan Hu
Guo Gan
Yilun Zhao
25
0
0
29 May 2025
Thinking with Generated Images
Thinking with Generated Images
Ethan Chern
Zhulin Hu
Steffi Chern
Siqi Kou
Jiadi Su
Yan Ma
Zhijie Deng
Pengfei Liu
LRM
12
0
0
28 May 2025
Reinforcement Fine-Tuning Powers Reasoning Capability of Multimodal Large Language Models
Reinforcement Fine-Tuning Powers Reasoning Capability of Multimodal Large Language Models
Haoyuan Sun
Jiaqi Wu
Bo Xia
Yifu Luo
Yifei Zhao
Kai Qin
Xufei Lv
Tiantian Zhang
Yongzhe Chang
Xueqian Wang
OffRL
LRM
113
0
0
24 May 2025
CrossLMM: Decoupling Long Video Sequences from LMMs via Dual Cross-Attention Mechanisms
CrossLMM: Decoupling Long Video Sequences from LMMs via Dual Cross-Attention Mechanisms
Shilin Yan
Jiaming Han
Joey Tsai
Hongwei Xue
Rongyao Fang
Lingyi Hong
Ziyu Guo
Ray Zhang
VLM
53
3
0
22 May 2025
Delving into RL for Image Generation with CoT: A Study on DPO vs. GRPO
Delving into RL for Image Generation with CoT: A Study on DPO vs. GRPO
Chengzhuo Tong
Ziyu Guo
Renrui Zhang
Wenyu Shan
Xinyu Wei
Zhenghao Xing
Hongsheng Li
Pheng-Ann Heng
EGVM
OffRL
LRM
59
0
0
22 May 2025
MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO
MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO
Yicheng Xiao
Lin Song
Yukang Chen
Yingmin Luo
Yuxin Chen
Yukang Gan
Wei Huang
Xiu Li
Xiaojuan Qi
Ying Shan
LRM
40
2
0
19 May 2025
Visual Planning: Let's Think Only with Images
Visual Planning: Let's Think Only with Images
Yi Xu
Chengzu Li
Han Zhou
Xingchen Wan
Caiqi Zhang
Anna Korhonen
Ivan Vulić
LM&Ro
LRM
87
0
0
16 May 2025
DanceGRPO: Unleashing GRPO on Visual Generation
DanceGRPO: Unleashing GRPO on Visual Generation
Zeyue Xue
Jie Wu
Yu Gao
Fangyuan Kong
Lingting Zhu
...
Zhiheng Liu
Wei Liu
Qiushan Guo
Weilin Huang
Ping Luo
EGVM
VGen
73
3
0
12 May 2025
Sailing AI by the Stars: A Survey of Learning from Rewards in Post-Training and Test-Time Scaling of Large Language Models
Sailing AI by the Stars: A Survey of Learning from Rewards in Post-Training and Test-Time Scaling of Large Language Models
Xiaobao Wu
LRM
114
3
0
05 May 2025
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT
D. Jiang
Ziyu Guo
Renrui Zhang
Zhuofan Zong
Hao Li
Le Zhuo
Shilin Yan
Pheng-Ann Heng
Haoyang Li
LRM
103
14
0
01 May 2025
From Reflection to Perfection: Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning
From Reflection to Perfection: Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning
Le Zhuo
Liangbing Zhao
Sayak Paul
Yue Liao
Renrui Zhang
Yi Xin
Peng Gao
Mohamed Elhoseiny
Haoyang Li
VLM
96
0
0
22 Apr 2025
Early Timestep Zero-Shot Candidate Selection for Instruction-Guided Image Editing
Early Timestep Zero-Shot Candidate Selection for Instruction-Guided Image Editing
Joowon Kim
Ziseok Lee
Donghyeon Cho
Sanghyun Jo
Y. Jung
Kyungsu Kim
Eunho Yang
DiffM
68
0
0
18 Apr 2025
$\texttt{Complex-Edit}$: CoT-Like Instruction Generation for Complexity-Controllable Image Editing Benchmark
Complex-Edit\texttt{Complex-Edit}Complex-Edit: CoT-Like Instruction Generation for Complexity-Controllable Image Editing Benchmark
S. Yang
Mude Hui
Bingchen Zhao
Yuyin Zhou
Nataniel Ruiz
Cihang Xie
CoGe
107
1
0
17 Apr 2025
Can Test-Time Scaling Improve World Foundation Model?
Can Test-Time Scaling Improve World Foundation Model?
Wenyan Cong
Hanqing Zhu
Peihao Wang
Bangya Liu
Dejia Xu
Kevin Wang
David Z. Pan
Yan Wang
Zhiwen Fan
Ziyi Wang
85
0
0
31 Mar 2025
StyleMotif: Multi-Modal Motion Stylization using Style-Content Cross Fusion
StyleMotif: Multi-Modal Motion Stylization using Style-Content Cross Fusion
Ziyu Guo
Young Yoon Lee
Joseph Liu
Yizhak Ben-Shabat
Victor Zordan
Mubbasir Kapadia
DiffM
VGen
85
0
0
27 Mar 2025
Video-T1: Test-Time Scaling for Video Generation
Video-T1: Test-Time Scaling for Video Generation
Fan Liu
Hanyang Wang
Yimo Cai
Kaiyan Zhang
Xiaohang Zhan
Yueqi Duan
DiffM
VGen
113
5
0
24 Mar 2025
Concept-as-Tree: Synthetic Data is All You Need for VLM Personalization
Concept-as-Tree: Synthetic Data is All You Need for VLM Personalization
Ruichuan An
Kai Zeng
Ming Lu
Sihan Yang
Renrui Zhang
Huitong Ji
Qizhe Zhang
Yihao Luo
Hao Liang
Wentao Zhang
80
0
0
17 Mar 2025
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey
Yansen Wang
Shengqiong Wu
Yize Zhang
William Yang Wang
Ziwei Liu
Jiebo Luo
Hao Fei
LRM
114
23
0
16 Mar 2025
Reflect-DiT: Inference-Time Scaling for Text-to-Image Diffusion Transformers via In-Context Reflection
Reflect-DiT: Inference-Time Scaling for Text-to-Image Diffusion Transformers via In-Context Reflection
Shufan Li
Konstantinos Kallidromitis
Akash Gokul
Arsh Koneru
Yusuke Kato
Kazuki Kozuka
Aditya Grover
VLM
86
1
0
15 Mar 2025
SciVerse: Unveiling the Knowledge Comprehension and Visual Reasoning of LMMs on Multi-modal Scientific Problems
Ziyu Guo
Ray Zhang
Hao Chen
Jialin Gao
Dongzhi Jiang
Jiaze Wang
Pheng-Ann Heng
77
4
0
13 Mar 2025
HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model
HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model
Jiaming Liu
Hao Chen
Pengju An
Zhuoyang Liu
Renrui Zhang
...
Chengkai Hou
Mengdi Zhao
KC alex Zhou
Pheng-Ann Heng
Shanghang Zhang
98
14
0
13 Mar 2025
ProtTeX: Structure-In-Context Reasoning and Editing of Proteins with Large Language Models
Zicheng Ma
Chuanliu Fan
Zhicong Wang
Zhenyu Chen
Xiaohan Lin
Yongqian Li
Shihao Feng
Jun Zhang
Ziqiang Cao
Y. Gao
63
0
0
11 Mar 2025
MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency
MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency
Dongzhi Jiang
Renrui Zhang
Ziyu Guo
Yanwei Li
Yu Qi
...
Shen Yan
Bo Zhang
Chaoyou Fu
Peng Gao
Hongsheng Li
MLLM
LRM
106
30
0
13 Feb 2025
PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions
PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions
Weifeng Lin
Xinyu Wei
Renrui Zhang
Le Zhuo
Shitian Zhao
...
Junlin Xie
Junlin Xie
Yu Qiao
Peng Gao
Hongsheng Li
MLLM
DiffM
105
13
0
23 Sep 2024
1