2411.18203
Cited By
v1
v2
v3
v4
v5 (latest)
Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning
27 November 2024
Di Zhang
Jingdi Lei
Junxian Li
Xunzhi Wang
Yong Liu
Zonglin Yang
Jiatong Li
Weida Wang
Steve Yang
Jianbo Wu
Peng Ye
Wanli Ouyang
Dongzhan Zhou
OffRL
LRM
Papers citing
"Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning"
50 / 63 papers shown
Title
Sherlock: Self-Correcting Reasoning in Vision-Language Models
Yi Ding
Ruqi Zhang
ReLM
LRM
VLM
101
0
0
28 May 2025
VLRMBench: A Comprehensive and Challenging Benchmark for Vision-Language Reward Models
Jiacheng Ruan
Wenzhen Yuan
Xian Gao
Ye Guo
Daoxin Zhang
Zhe Xu
Yao Hu
Ting Liu
Yuzhuo Fu
LRM
VLM
162
6
0
10 Mar 2025
SHAPE: Self-Improved Visual Preference Alignment by Iteratively Generating Holistic Winner
Kejia Chen
Jiawen Zhang
Jiacong Hu
Jiazhen Yang
Jian Lou
Zunlei Feng
Mingli Song
127
0
0
06 Mar 2025
HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation
L. Yang
Xinchen Zhang
Ye Tian
Chenming Shang
Minghao Xu
Wentao Zhang
Tengjiao Wang
142
4
0
17 Feb 2025
InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model
Yuhang Zang
Xiaoyi Dong
Pan Zhang
Yuhang Cao
Ziyu Liu
...
Haodong Duan
Wentao Zhang
Kai Chen
Dahua Lin
Jiaqi Wang
VLM
217
25
0
21 Jan 2025
URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics
Ruilin Luo
Zhuofan Zheng
Yifan Wang
Xinzhe Ni
Zicheng Lin
...
Yiyao Yu
C. Shi
Ruihang Chu
Jin Zeng
Yujiu Yang
LRM
174
25
0
08 Jan 2025
PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models
Mingyang Song
Zhaochen Su
Xiaoye Qu
Jiawei Zhou
Yu Cheng
LRM
136
40
0
06 Jan 2025
Rule Based Rewards for Language Model Safety
Tong Mu
Alec Helyar
Johannes Heidecke
Joshua Achiam
Andrea Vallone
Ian Kivlichan
Molly Lin
Alex Beutel
John Schulman
Lilian Weng
ALM
109
48
0
02 Nov 2024
GPT-4o System Card
OpenAI:
Aaron Hurst
Adam Lerer
Adam P. Goucher
...
Yuchen He
Yuchen Zhang
Yujia Jin
Yunxing Dai
Yury Malkov
MLLM
232
1,038
0
25 Oct 2024
Self-Correction is More than Refinement: A Learning Framework for Visual and Language Reasoning Tasks
Jiayi He
Hehai Lin
Q. Wang
Yi R. Fung
Chenhui Xu
ReLM
LRM
201
7
0
05 Oct 2024
LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning
Di Zhang
Jianbo Wu
Jingdi Lei
Tong Che
Jiatong Li
...
Shufei Zhang
Marco Pavone
Yuqiang Li
Wanli Ouyang
Dongzhan Zhou
LRM
84
61
0
03 Oct 2024
CAST: Cross-modal Alignment Similarity Test for Vision Language Models
Gautier Dagan
Olga Loginova
Anil Batra
CoGe
130
1
0
17 Sep 2024
MiniCPM-V: A GPT-4V Level MLLM on Your Phone
Yuan Yao
Tianyu Yu
Ao Zhang
Chongyi Wang
Junbo Cui
...
Xu Han
Guoyang Zeng
Dahai Li
Zhiyuan Liu
Maosong Sun
VLM
MLLM
122
478
0
03 Aug 2024
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
Pan Zhang
Xiaoyi Dong
Yuhang Zang
Yuhang Cao
Rui Qian
...
Kai Chen
Jifeng Dai
Yu Qiao
Dahua Lin
Jiaqi Wang
130
117
0
03 Jul 2024
LLM Critics Help Catch LLM Bugs
Nat McAleese
Rai Michael Pokorny
Juan Felipe Cerón Uribe
Evgenia Nitishinskaya
Maja Trebacz
Jan Leike
ALM
LRM
80
82
0
28 Jun 2024
VGA: Vision GUI Assistant -- Minimizing Hallucinations through Image-Centric Fine-Tuning
Ziyang Meng
Yu Dai
Zezheng Gong
Shaoxiong Guo
Minglong Tang
Tongquan Wei
VLM
75
3
0
20 Jun 2024
ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools
Team GLM:
Aohan Zeng
Bin Xu
Bowen Wang
...
Zhaoyu Wang
Zhen Yang
Zhengxiao Du
Zhenyu Hou
Zihan Wang
ALM
134
647
0
18 Jun 2024
SPA-VL: A Comprehensive Safety Preference Alignment Dataset for Vision Language Model
Yongting Zhang
Lu Chen
Guodong Zheng
Yifeng Gao
Rui Zheng
...
Yu Qiao
Xuanjing Huang
Feng Zhao
Tao Gui
Jing Shao
VLM
180
33
0
17 Jun 2024
TextGrad: Automatic "Differentiation" via Text
Mert Yuksekgonul
Federico Bianchi
Joseph Boen
Sheng Liu
Zhi Huang
Carlos Guestrin
James Zou
LLMAG
OOD
AI4CE
95
48
0
11 Jun 2024
Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B
Di Zhang
Xiaoshui Huang
Dongzhan Zhou
Yuqiang Li
Wanli Ouyang
LRM
102
78
0
11 Jun 2024
Don't Miss the Forest for the Trees: Attentional Vision Calibration for Large Vision Language Models
Sangmin Woo
Donguk Kim
Jaehyuk Jang
Yubin Choi
Changick Kim
96
15
0
28 May 2024
Enhancing Visual-Language Modality Alignment in Large Vision Language Models via Self-Improvement
Xiyao Wang
Jiuhai Chen
Zhaoyang Wang
Yuhang Zhou
Yiyang Zhou
...
Dinesh Manocha
Tom Goldstein
Parminder Bhatia
Furong Huang
Cao Xiao
179
38
0
24 May 2024
Calibrated Self-Rewarding Vision Language Models
Yiyang Zhou
Zhiyuan Fan
Dongjie Cheng
Sihan Yang
Zhaorun Chen
Chenhang Cui
Xiyao Wang
Yun Li
Linjun Zhang
Huaxiu Yao
VLM
128
34
0
23 May 2024
MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI
Kaining Ying
Fanqing Meng
Jin Wang
Zhiqiang Li
Han Lin
...
Yali Wang
Yuning Qiao
Ping Luo
Kaipeng Zhang
Wenqi Shao
84
99
0
24 Apr 2024
Are We on the Right Way for Evaluating Large Vision-Language Models?
Lin Chen
Jinsong Li
Xiao-wen Dong
Pan Zhang
Yuhang Zang
...
Haodong Duan
Jiaqi Wang
Yu Qiao
Dahua Lin
Feng Zhao
VLM
137
302
0
29 Mar 2024
Learning From Correctness Without Prompting Makes LLM Efficient Reasoner
Yuxuan Yao
Han Wu
Zhijiang Guo
Biyan Zhou
Jiahui Gao
Sichun Luo
Hanxu Hou
Xiaojin Fu
Linqi Song
LLMAG
LRM
123
10
0
28 Mar 2024
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
Renrui Zhang
Dongzhi Jiang
Yichi Zhang
Haokun Lin
Ziyu Guo
...
Aojun Zhou
Pan Lu
Kai-Wei Chang
Peng Gao
Hongsheng Li
86
253
0
21 Mar 2024
DeepSeek-VL: Towards Real-World Vision-Language Understanding
Haoyu Lu
Wen Liu
Bo Zhang
Bing-Li Wang
Kai Dong
...
Yaofeng Sun
Chengqi Deng
Hanwei Xu
Zhenda Xie
Chong Ruan
VLM
107
373
0
08 Mar 2024
Aligning Modalities in Vision Large Language Models via Preference Fine-tuning
Yiyang Zhou
Chenhang Cui
Rafael Rafailov
Chelsea Finn
Huaxiu Yao
VLM
MLLM
97
122
0
18 Feb 2024
V-STaR: Training Verifiers for Self-Taught Reasoners
Arian Hosseini
Xingdi Yuan
Nikolay Malkin
Rameswar Panda
Alessandro Sordoni
Rishabh Agarwal
ReLM
LRM
95
137
0
09 Feb 2024
InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model
Xiao-wen Dong
Pan Zhang
Yuhang Zang
Yuhang Cao
Bin Wang
...
Conghui He
Xingcheng Zhang
Yu Qiao
Dahua Lin
Jiaqi Wang
VLM
MLLM
152
268
0
29 Jan 2024
InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks
Zhe Chen
Jiannan Wu
Wenhai Wang
Weijie Su
Guo Chen
...
Bin Li
Ping Luo
Tong Lu
Yu Qiao
Jifeng Dai
VLM
MLLM
262
1,216
0
21 Dec 2023
Silkie: Preference Distillation for Large Visual Language Models
Lei Li
Zhihui Xie
Mukai Li
Shunian Chen
Peiyi Wang
Liang Chen
Yazheng Yang
Benyou Wang
Lingpeng Kong
MLLM
168
80
0
17 Dec 2023
OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation
Qidong Huang
Xiao-wen Dong
Pan Zhang
Bin Wang
Conghui He
Jiaqi Wang
Dahua Lin
Weiming Zhang
Neng H. Yu
MLLM
128
206
0
29 Nov 2023
ShareGPT4V: Improving Large Multi-Modal Models with Better Captions
Lin Chen
Jinsong Li
Xiao-wen Dong
Pan Zhang
Conghui He
Jiaqi Wang
Feng Zhao
Dahua Lin
MLLM
VLM
200
682
0
21 Nov 2023
LLMs cannot find reasoning errors, but can correct them given the error location
Gladys Tyen
Hassan Mansoor
Victor Carbune
Peter Chen
Tony Mak
LRM
128
79
0
14 Nov 2023
Improved Baselines with Visual Instruction Tuning
Haotian Liu
Chunyuan Li
Yuheng Li
Yong Jae Lee
VLM
MLLM
181
2,826
0
05 Oct 2023
MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts
Pan Lu
Hritik Bansal
Tony Xia
Jiacheng Liu
Chun-yue Li
Hannaneh Hajishirzi
Hao Cheng
Kai-Wei Chang
Michel Galley
Jianfeng Gao
LRM
MLLM
128
665
0
03 Oct 2023
The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision)
Zhengyuan Yang
Linjie Li
Kevin Qinghong Lin
Jianfeng Wang
Chung-Ching Lin
Nasim Shakouri Mahmoudabadi
Lijuan Wang
LM&MA
84
646
0
29 Sep 2023
Aligning Large Multimodal Models with Factually Augmented RLHF
Zhiqing Sun
Sheng Shen
Shengcao Cao
Haotian Liu
Chunyuan Li
...
Liangyan Gui
Yu-Xiong Wang
Yiming Yang
Kurt Keutzer
Trevor Darrell
VLM
127
395
0
25 Sep 2023
OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models
Anas Awadalla
Irena Gao
Josh Gardner
Jack Hessel
Yusuf Hanafy
...
Simon Kornblith
Pang Wei Koh
Gabriel Ilharco
Mitchell Wortsman
Ludwig Schmidt
MLLM
127
433
0
02 Aug 2023
SEED-Bench: Benchmarking Multimodal LLMs with Generative Comprehension
Bohao Li
Rui Wang
Guangzhi Wang
Yuying Ge
Yixiao Ge
Ying Shan
MLLM
ELM
126
572
0
30 Jul 2023
MMBench: Is Your Multi-modal Model an All-around Player?
Yuanzhan Liu
Haodong Duan
Yuanhan Zhang
Yue Liu
Songyang Zhang
...
Jiaqi Wang
Conghui He
Ziwei Liu
Kai-xiang Chen
Dahua Lin
133
1,059
0
12 Jul 2023
Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
Fuxiao Liu
Kevin Qinghong Lin
Linjie Li
Jianfeng Wang
Yaser Yacoob
Lijuan Wang
VLM
MLLM
140
286
0
26 Jun 2023
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Rafael Rafailov
Archit Sharma
E. Mitchell
Stefano Ermon
Christopher D. Manning
Chelsea Finn
ALM
389
4,169
0
29 May 2023
Tree of Thoughts: Deliberate Problem Solving with Large Language Models
Shunyu Yao
Dian Yu
Jeffrey Zhao
Izhak Shafran
Thomas Griffiths
Yuan Cao
Karthik Narasimhan
LM&Ro
LRM
AI4CE
180
2,037
0
17 May 2023
Evaluating Object Hallucination in Large Vision-Language Models
Yifan Li
Yifan Du
Kun Zhou
Jinpeng Wang
Wayne Xin Zhao
Ji-Rong Wen
MLLM
LRM
309
814
0
17 May 2023
MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models
Deyao Zhu
Jun Chen
Xiaoqian Shen
Xiang Li
Mohamed Elhoseiny
VLM
MLLM
165
2,072
0
20 Apr 2023
Visual Instruction Tuning
Haotian Liu
Chunyuan Li
Qingyang Wu
Yong Jae Lee
SyDa
VLM
MLLM
571
4,925
0
17 Apr 2023
REFINER: Reasoning Feedback on Intermediate Representations
Debjit Paul
Mete Ismayilzada
Maxime Peyrard
Beatriz Borges
Antoine Bosselut
Robert West
Boi Faltings
ReLM
LRM
126
182
0
04 Apr 2023