ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2505.15966
  4. Cited By
Pixel Reasoner: Incentivizing Pixel-Space Reasoning with Curiosity-Driven Reinforcement Learning

Pixel Reasoner: Incentivizing Pixel-Space Reasoning with Curiosity-Driven Reinforcement Learning

21 May 2025
Alex Su
Haozhe Wang
Weiming Ren
Fangzhen Lin
Wenhu Chen
    MLLM
    OffRL
    LRM
    VLM
ArXivPDFHTML

Papers citing "Pixel Reasoner: Incentivizing Pixel-Space Reasoning with Curiosity-Driven Reinforcement Learning"

26 / 26 papers shown
Title
VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning
VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning
Haozhe Wang
Chao Qu
Zuming Huang
Wei Chu
Fangzhen Lin
Wenhu Chen
OffRL
ReLM
SyDa
LRM
VLM
125
26
0
10 Apr 2025
Video-R1: Reinforcing Video Reasoning in MLLMs
Video-R1: Reinforcing Video Reasoning in MLLMs
Kaituo Feng
Kaixiong Gong
Yangqiu Song
Zonghao Guo
Yibing Wang
Tianshuo Peng
Jian Wu
Xiaoying Zhang
Benyou Wang
Xiangyu Yue
AI4TS
SyDa
LRM
123
42
0
27 Mar 2025
Understanding R1-Zero-Like Training: A Critical Perspective
Understanding R1-Zero-Like Training: A Critical Perspective
Zichen Liu
Changyu Chen
Wenjun Li
Penghui Qi
Tianyu Pang
Chao Du
Wee Sun Lee
Min Lin
OffRL
LRM
188
137
0
26 Mar 2025
Gemma 3 Technical Report
Gemma 3 Technical Report
Gemma Team
Aishwarya B Kamath
Johan Ferret
Shreya Pathak
Nino Vieillard
...
Harshal Tushar Lehri
Hussein Hazimeh
Ian Ballantyne
Idan Szpektor
Ivan Nardini
VLM
164
100
0
25 Mar 2025
Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models
Wenxuan Huang
Bohan Jia
Zijie Zhai
Shaosheng Cao
Zheyu Ye
Fei Zhao
Zhe Xu
Yao Hu
Shaohui Lin
MU
OffRL
LRM
MLLM
ReLM
VLM
130
104
0
09 Mar 2025
Qwen2.5-VL Technical Report
Qwen2.5-VL Technical Report
S. Bai
Keqin Chen
Xuejing Liu
Jialin Wang
Wenbin Ge
...
Zesen Cheng
Hang Zhang
Zhibo Yang
Haiyang Xu
Junyang Lin
VLM
284
528
0
20 Feb 2025
MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale
MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale
Jarvis Guo
Tuney Zheng
Yuelin Bai
Bo Li
Yubo Wang
King Zhu
Yizhi Li
Graham Neubig
Wenhu Chen
Xiang Yue
LRM
137
36
0
06 Dec 2024
GPT-4o System Card
GPT-4o System Card
OpenAI OpenAI
:
Aaron Hurst
Adam Lerer
Adam P. Goucher
...
Yuchen He
Yuchen Zhang
Yujia Jin
Yunxing Dai
Yury Malkov
MLLM
184
895
0
25 Oct 2024
Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data
Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data
Shuhao Gu
Jialing Zhang
Siyuan Zhou
Kevin Yu
Zhaohu Xing
...
Yufeng Cui
Xinlong Wang
Yaoqi Liu
Fangxiang Feng
Guang Liu
SyDa
VLM
MLLM
78
26
0
24 Oct 2024
LLaVA-OneVision: Easy Visual Task Transfer
LLaVA-OneVision: Easy Visual Task Transfer
Bo Li
Yuanhan Zhang
Dong Guo
Renrui Zhang
Feng Li
Hao Zhang
Kaichen Zhang
Yanwei Li
Ziwei Liu
Chunyuan Li
MLLM
SyDa
VLM
105
769
0
06 Aug 2024
Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal
  Language Models
Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models
Yushi Hu
Weijia Shi
Xingyu Fu
Dan Roth
Mari Ostendorf
Luke Zettlemoyer
Noah A. Smith
Ranjay Krishna
LRM
79
64
0
13 Jun 2024
Instruction-Guided Visual Masking
Instruction-Guided Visual Masking
Jinliang Zheng
Jianxiong Li
Si Cheng
Yinan Zheng
Jiaming Li
Jihao Liu
Yu Liu
Jingjing Liu
Xianyuan Zhan
117
7
0
30 May 2024
STAR: A Benchmark for Situated Reasoning in Real-World Videos
STAR: A Benchmark for Situated Reasoning in Real-World Videos
Bo Wu
Shoubin Yu
Zhenfang Chen
Joshua B. Tenenbaum
Chuang Gan
108
192
0
15 May 2024
V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs
V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs
Penghao Wu
Saining Xie
LRM
75
147
0
21 Dec 2023
Visual Program Distillation: Distilling Tools and Programmatic Reasoning
  into Vision-Language Models
Visual Program Distillation: Distilling Tools and Programmatic Reasoning into Vision-Language Models
Yushi Hu
Otilia Stretcu
Chun-Ta Lu
Krishnamurthy Viswanathan
Kenji Hata
Enming Luo
Ranjay Krishna
Ariel Fuxman
VLM
LRM
MLLM
67
34
0
05 Dec 2023
MVBench: A Comprehensive Multi-modal Video Understanding Benchmark
MVBench: A Comprehensive Multi-modal Video Understanding Benchmark
Kunchang Li
Yali Wang
Yinan He
Yizhuo Li
Yi Wang
...
Jilan Xu
Guo Chen
Ping Luo
Limin Wang
Yu Qiao
VLM
MLLM
133
470
0
28 Nov 2023
MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning
  Benchmark for Expert AGI
MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI
Xiang Yue
Yuansheng Ni
Kai Zhang
Tianyu Zheng
Ruoqi Liu
...
Yibo Liu
Wenhao Huang
Huan Sun
Yu-Chuan Su
Wenhu Chen
OSLM
ELM
VLM
210
901
0
27 Nov 2023
MathVista: Evaluating Mathematical Reasoning of Foundation Models in
  Visual Contexts
MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts
Pan Lu
Hritik Bansal
Tony Xia
Jiacheng Liu
Chun-yue Li
Hannaneh Hajishirzi
Hao Cheng
Kai-Wei Chang
Michel Galley
Jianfeng Gao
LRM
MLLM
107
614
0
03 Oct 2023
Adversarial Constrained Bidding via Minimax Regret Optimization with
  Causality-Aware Reinforcement Learning
Adversarial Constrained Bidding via Minimax Regret Optimization with Causality-Aware Reinforcement Learning
Haozhe Jasper Wang
Chao Du
Panyan Fang
Li He
Liangji Wang
Bo Zheng
37
7
0
12 Jun 2023
Visual Instruction Tuning
Visual Instruction Tuning
Haotian Liu
Chunyuan Li
Qingyang Wu
Yong Jae Lee
SyDa
VLM
MLLM
529
4,725
0
17 Apr 2023
ROI-Constrained Bidding via Curriculum-Guided Bayesian Reinforcement
  Learning
ROI-Constrained Bidding via Curriculum-Guided Bayesian Reinforcement Learning
Haozhe Jasper Wang
Chao Du
Panyan Fang
Shuo Yuan
Xu-Jiang He
Liang Wang
Bo Zheng
38
12
0
10 Jun 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
799
9,351
0
28 Jan 2022
Learning Context-aware Task Reasoning for Efficient Meta-reinforcement
  Learning
Learning Context-aware Task Reasoning for Efficient Meta-reinforcement Learning
Haozhe Jasper Wang
Jiale Zhou
Xuming He
OffRL
LRM
39
15
0
03 Mar 2020
TallyQA: Answering Complex Counting Questions
TallyQA: Answering Complex Counting Questions
Manoj Acharya
Kushal Kafle
Christopher Kanan
57
122
0
29 Oct 2018
Constrained Policy Optimization
Constrained Policy Optimization
Joshua Achiam
David Held
Aviv Tamar
Pieter Abbeel
110
1,322
0
30 May 2017
Curiosity-driven Exploration by Self-supervised Prediction
Curiosity-driven Exploration by Self-supervised Prediction
Deepak Pathak
Pulkit Agrawal
Alexei A. Efros
Trevor Darrell
LRM
SSL
106
2,436
0
15 May 2017
1