2402.11411
Aligning Modalities in Vision Large Language Models via Preference Fine-tuning
18 February 2024
Yiyang Zhou
Chenhang Cui
Rafael Rafailov
Chelsea Finn
Huaxiu Yao
VLM
MLLM
Github (85★)
Papers citing
"Aligning Modalities in Vision Large Language Models via Preference Fine-tuning"
50 / 85 papers shown
Title
VLN-R1: Vision-Language Navigation via Reinforcement Fine-Tuning
Zhangyang Qi
Zhixiong Zhang
Yizhou Yu
Jiaqi Wang
Hengshuang Zhao
LM&Ro
AI4TS
51
0
0
20 Jun 2025
Auto-Connect: Connectivity-Preserving RigFormer with Direct Preference Optimization
Jingfeng Guo
Jian Liu
Jinnan Chen
Shiwei Mao
Changrong Hu
...
Jing Xu
Qi Liu
Lixin Xu
Zhuo Chen
Chunchao Guo
38
0
0
13 Jun 2025
ViCrit: A Verifiable Reinforcement Learning Proxy Task for Visual Perception in VLMs
Xiyao Wang
Zhengyuan Yang
Chao Feng
Yongyuan Liang
Yuhang Zhou
...
Chung-Ching Lin
Kevin Lin
Linjie Li
Furong Huang
Lijuan Wang
OffRL
LRM
64
0
0
11 Jun 2025
Mitigating Hallucinations in Large Vision-Language Models via Entity-Centric Multimodal Preference Optimization
Jiulong Wu
Zhengliang Shi
Shuaiqiang Wang
J. Huang
Dawei Yin
Lingyong Yan
Min Cao
Min Zhang
MLLM
77
0
0
04 Jun 2025
Cycle Consistency as Reward: Learning Image-Text Alignment without Human Preferences
Hyojin Bahng
Caroline Chan
F. Durand
Phillip Isola
EGVM
36
0
0
02 Jun 2025
HSCR: Hierarchical Self-Contrastive Rewarding for Aligning Medical Vision Language Models
Songtao Jiang
Yan Zhang
Yeying Jin
Zhihang Tang
Y. Wu
Yang Feng
Jian Wu
Zuozhu Liu
52
1
0
01 Jun 2025
Point-RFT: Improving Multimodal Reasoning with Visually Grounded Reinforcement Finetuning
Minheng Ni
Zhengyuan Yang
Linjie Li
Chung-Ching Lin
Kevin Qinghong Lin
W. Zuo
Lijuan Wang
ReLM
LRM
87
1
0
26 May 2025
Improving Medical Reasoning with Curriculum-Aware Reinforcement Learning
Shaohao Rui
Kaitao Chen
Weijie Ma
Xiaosong Wang
OffRL
LRM
27
0
0
25 May 2025
GRE Suite: Geo-localization Inference via Fine-Tuned Vision-Language Models and Enhanced Reasoning Chains
C. Wang
Xiaoran Pan
Zihao Pan
Haofan Wang
Yiren Song
LRM
152
0
0
24 May 2025
Co-Reinforcement Learning for Unified Multimodal Understanding and Generation
Jingjing Jiang
Chongjie Si
Jun Luo
Hanwang Zhang
Chao Ma
196
0
0
23 May 2025
OViP: Online Vision-Language Preference Learning
Shujun Liu
Siyuan Wang
Zejun Li
Jianxiang Wang
Cheng Zeng
Zhongyu Wei
MLLM
VLM
76
0
0
21 May 2025
Mixture of Decoding: An Attention-Inspired Adaptive Decoding Strategy to Mitigate Hallucinations in Large Vision-Language Models
Xinlong Chen
Yuanxing Zhang
Qiang Liu
Junfei Wu
Fuzheng Zhang
Tieniu Tan
MLLM
134
0
0
17 May 2025
Critique Before Thinking: Mitigating Hallucination through Rationale-Augmented Instruction Tuning
Zexian Yang
Dian Li
Dayan Wu
Gang Liu
Weiping Wang
MLLM
LRM
102
0
0
12 May 2025
Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning
Yibin Wang
Zhimin Li
Yuhang Zang
Chunyu Wang
Qinglin Lu
Cheng Jin
Jinqiao Wang
LRM
147
11
0
06 May 2025
R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning
Yi-Fan Zhang
Xingyu Lu
X. Hu
Chaoyou Fu
Bin Wen
...
Jianfei Chen
Fan Yang
Zheng Zhang
Yan Li
Liang Wang
OffRL
LRM
135
6
0
05 May 2025
Sailing by the Stars: A Survey on Reward Models and Learning Strategies for Learning from Rewards
Xiaobao Wu
LRM
240
5
0
05 May 2025
A Comprehensive Analysis for Visual Object Hallucination in Large Vision-Language Models
Liqiang Jing
Guiming Hardy Chen
Ehsan Aghazadeh
Xin Eric Wang
Xinya Du
139
0
0
04 May 2025
Anyprefer: An Agentic Framework for Preference Data Synthesis
Yiyang Zhou
Zhaoxiang Wang
Tianle Wang
Shangyu Xing
Peng Xia
...
Chetan Bansal
Weitong Zhang
Ying Wei
Joey Tianyi Zhou
Huaxiu Yao
157
2
0
27 Apr 2025
AdaViP: Aligning Multi-modal LLMs via Adaptive Vision-enhanced Preference Optimization
Jinda Lu
Jinghan Li
Yuan Gao
Junkang Wu
Jiancan Wu
Xiang Wang
Xiangnan He
424
1
0
22 Apr 2025
VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models
Haojian Huang
Haodong Chen
Shengqiong Wu
Meng Luo
Jinlan Fu
Xinya Du
Hao Zhang
Hao Fei
AI4TS
474
2
0
17 Apr 2025
Self-alignment of Large Video Language Models with Refined Regularized Preference Optimization
Pritam Sarkar
Ali Etemad
98
0
0
16 Apr 2025
Decoupling Contrastive Decoding: Robust Hallucination Mitigation in Multimodal Large Language Models
Wei Chen
Xin Yan
Bin Wen
Fan Yang
Yan Li
Di Zhang
Long Chen
MLLM
189
0
0
09 Apr 2025
Enhancing Compositional Reasoning in Vision-Language Models with Synthetic Preference Data
Samarth Mishra
Kate Saenko
Venkatesh Saligrama
CoGe
LRM
75
0
0
07 Apr 2025
Enhancing Chart-to-Code Generation in Multimodal Large Language Models via Iterative Dual Preference Learning
Zhihan Zhang
Yixin Cao
Lizi Liao
103
2
0
03 Apr 2025
Instruction-Oriented Preference Alignment for Enhancing Multi-Modal Comprehension Capability of MLLMs
Zitian Wang
Yue Liao
Kang Rong
Fengyun Rao
Yibo Yang
Si Liu
118
0
0
26 Mar 2025
Debiasing Multimodal Large Language Models via Noise-Aware Preference Optimization
Zefeng Zhang
Hengzhu Tang
Shuaiyi Nie
Zhenyu Zhang
Yiming Ren
Zhenyang Li
Dawei Yin
Duohe Ma
Tingwen Liu
117
1
0
23 Mar 2025
Mitigating Object Hallucinations in MLLMs via Multi-Frequency Perturbations
Shuo Li
Jiajun Sun
Guodong Zheng
Xiaoran Fan
Yujiong Shen
...
Wenming Tan
Tao Ji
Tao Gui
Qi Zhang
Xuanjing Huang
AAML
VLM
195
1
0
19 Mar 2025
DeepMesh: Auto-Regressive Artist-mesh Creation with Reinforcement Learning
R. Zhao
Junliang Ye
Ziyi Wang
Guangce Liu
Yiwen Chen
Yikai Wang
Jun Zhu
AI4CE
95
4
0
19 Mar 2025
MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding
S. Han
Peng Xia
Ruiyi Zhang
Tong Sun
Yun Li
Hongtu Zhu
Huaxiu Yao
VLM
186
8
0
18 Mar 2025
Aligning Multimodal LLM with Human Preference: A Survey
Tao Yu
Yize Zhang
Chaoyou Fu
Junkang Wu
Jinda Lu
...
Qingsong Wen
Zheng Zhang
Yan Huang
Liang Wang
Tieniu Tan
441
4
0
18 Mar 2025
Painting with Words: Elevating Detailed Image Captioning with Benchmark and Alignment Learning
Qinghao Ye
Xianhan Zeng
Fu Li
Chong Li
Haoqi Fan
CoGe
116
5
0
10 Mar 2025
Utilizing Jailbreak Probability to Attack and Safeguard Multimodal LLMs
Wenzhuo Xu
Zhipeng Wei
Xiongtao Sun
Deyue Zhang
Dongdong Yang
Quanchen Zou
Xinming Zhang
AAML
92
0
0
10 Mar 2025
An Optimization Algorithm for Multimodal Data Alignment
Wei Zhang
Xinyu Wang
Lan Yu
S. Li
71
0
0
05 Mar 2025
Visual-RFT: Visual Reinforcement Fine-Tuning
Ziyu Liu
Zeyi Sun
Yuhang Zang
Xiaoyi Dong
Yuhang Cao
Haodong Duan
Dahua Lin
Jiaqi Wang
ObjD
VLM
LRM
157
129
0
03 Mar 2025
Octopus: Alleviating Hallucination via Dynamic Contrastive Decoding
Wei Suo
Lijun Zhang
Mengyang Sun
Lin Yuanbo Wu
Peng Wang
Yize Zhang
MLLM
VLM
108
3
0
01 Mar 2025
Symmetrical Visual Contrastive Optimization: Aligning Vision-Language Models with Minimal Contrastive Images
Shengguang Wu
Fan-Yun Sun
Kaiyue Wen
Nick Haber
VLM
169
3
0
19 Feb 2025
Re-Align: Aligning Vision Language Models via Retrieval-Augmented Direct Preference Optimization
Shuo Xing
Yuping Wang
Peiran Li
Ruizheng Bai
Yansen Wang
Chan-wei Hu
Chengxuan Qian
Huaxiu Yao
Zhengzhong Tu
187
8
0
18 Feb 2025
HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation
L. Yang
Xinchen Zhang
Ye Tian
Chenming Shang
Minghao Xu
Wentao Zhang
Tengjiao Wang
147
4
0
17 Feb 2025
InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model
Yuhang Zang
Xiaoyi Dong
Pan Zhang
Yuhang Cao
Ziyu Liu
...
Haodong Duan
Wentao Zhang
Kai Chen
Dahua Lin
Jiaqi Wang
VLM
259
25
0
21 Jan 2025
Multimodal Preference Data Synthetic Alignment with Reward Model
Robert Wijaya
Ngoc-Bao Nguyen
Ngai-Man Cheung
MLLM
SyDa
133
4
0
23 Dec 2024
Leveraging Retrieval-Augmented Tags for Large Vision-Language Understanding in Complex Scenes
Antonio Carlos Rivera
Anthony Moore
Steven Robinson
VLM
LRM
135
0
0
16 Dec 2024
MMedPO: Aligning Medical Vision-Language Models with Clinical-Aware Multimodal Preference Optimization
Kangyu Zhu
Peng Xia
Yun Li
Hongtu Zhu
Sheng Wang
Huaxiu Yao
210
3
0
09 Dec 2024
Explainable and Interpretable Multimodal Large Language Models: A Comprehensive Survey
Yunkai Dang
Kaichen Huang
Jiahao Huo
Yibo Yan
Shijie Huang
...
Kun Wang
Yong Liu
Jing Shao
Hui Xiong
Xuming Hu
LRM
170
22
0
03 Dec 2024
Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning
Di Zhang
Jingdi Lei
Junxian Li
Xunzhi Wang
Yong Liu
...
Steve Yang
Jianbo Wu
Peng Ye
Wanli Ouyang
Dongzhan Zhou
OffRL
LRM
192
8
0
27 Nov 2024
Efficient Self-Improvement in Multimodal Large Language Models: A Model-Level Judge-Free Approach
Shijian Deng
Wentian Zhao
Yu-Jhe Li
Kun Wan
Daniel Miranda
Ajinkya Kale
Yapeng Tian
LRM
169
6
0
26 Nov 2024
VL-RewardBench: A Challenging Benchmark for Vision-Language Generative Reward Models
Lei Li
Y. X. Wei
Zhihui Xie
Xuqing Yang
Yifan Song
...
Tianyu Liu
Sujian Li
Bill Yuchen Lin
Dianbo Sui
Qiang Liu
VLM
CoGe
196
32
0
26 Nov 2024
Systematic Reward Gap Optimization for Mitigating VLM Hallucinations
Lehan He
Zeren Chen
Zhelun Shi
Tianyu Yu
Jing Shao
Lu Sheng
MLLM
227
2
0
26 Nov 2024
Mitigating Hallucination in Multimodal Large Language Model via Hallucination-targeted Direct Preference Optimization
Yuhan Fu
Ruobing Xie
Xingwu Sun
Zhanhui Kang
Xirong Li
MLLM
94
5
0
15 Nov 2024
Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization
Weiyun Wang
Zhe Chen
Wenhai Wang
Yue Cao
Yangzhou Liu
...
Jinguo Zhu
X. Zhu
Lewei Lu
Yu Qiao
Jifeng Dai
LRM
145
93
1
15 Nov 2024
V-DPO: Mitigating Hallucination in Large Vision Language Models via Vision-Guided Direct Preference Optimization
Yuxi Xie
Guanzhen Li
Xiao Xu
Min-Yen Kan
MLLM
VLM
112
24
0
05 Nov 2024