2503.18470
Cited By
MetaSpatial: Reinforcing 3D Spatial Reasoning in VLMs for the Metaverse
24 March 2025
Zhenyu Pan
Han Liu
OffRL
LRM
Papers citing
"MetaSpatial: Reinforcing 3D Spatial Reasoning in VLMs for the Metaverse"
15 / 15 papers shown
Title
Unveiling the Compositional Ability Gap in Vision-Language Reasoning Model
Tianle Li
Jihai Zhang
Yongming Rao
Yu Cheng
CoGe
LRM
VLM
79
0
0
26 May 2025
Reinforcement Fine-Tuning Powers Reasoning Capability of Multimodal Large Language Models
Haoyuan Sun
Jiaqi Wu
Bo Xia
Yifu Luo
Yifei Zhao
Kai Qin
Xufei Lv
Tiantian Zhang
Yongzhe Chang
Xueqian Wang
OffRL
LRM
204
0
0
24 May 2025
Chain-of-Zoom: Extreme Super-Resolution via Scale Autoregression and Preference Alignment
Bryan S Kim
Jeongsol Kim
Jong Chul Ye
64
0
0
24 May 2025
Reinforced MLLM: A Survey on RL-Based Reasoning in Multimodal Large Language Models
Guanghao Zhou
Panjia Qiu
Chong Chen
Jiadong Wang
Zheming Yang
Jian Xu
Minghui Qiu
OffRL
LRM
184
8
0
30 Apr 2025
GUI-R1: A Generalist R1-Style Vision-Language Action Model For GUI Agents
Run Luo
Lu Wang
Wanwei He
Xiaobo Xia
LLMAG
174
35
0
14 Apr 2025
AutoSpatial: Visual-Language Reasoning for Social Robot Navigation through Efficient Spatial Reasoning Learning
Yangzhe Kong
Daeun Song
Jing Liang
Dinesh Manocha
Ziyu Yao
Xuesu Xiao
LRM
124
1
0
10 Mar 2025
Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models
Zhenyu Pan
Haozheng Luo
Manling Li
Han Liu
LRM
120
17
0
24 Feb 2025
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
DeepSeek-AI
Daya Guo
Dejian Yang
Haowei Zhang
Junxiao Song
...
Shiyu Wang
S. Yu
Shunfeng Zhou
Shuting Pan
S.S. Li
ReLM
VLM
OffRL
AI4TS
LRM
380
2,000
0
22 Jan 2025
Do Code LLMs Understand Design Patterns?
Zhenyu Pan
Xuefeng Song
Yunkun Wang
Rongyu Cao
Binhua Li
Yongqian Li
Han Liu
69
3
0
10 Jan 2025
Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding
Yunze Man
Shuhong Zheng
Zhipeng Bao
M. Hebert
Liang-Yan Gui
Yu-Xiong Wang
121
23
0
05 Sep 2024
I-Design: Personalized LLM Interior Designer
Ata Çelen
Guo Han
Konrad Schindler
Luc Van Gool
Iro Armeni
Anton Obukhov
Xi Wang
3DV
102
23
0
03 Apr 2024
Ctrl-Room: Controllable Text-to-3D Room Meshes Generation with Layout Constraints
Chuan Fang
Yuan Dong
Kunming Luo
Xiaotao Hu
Rakesh Shrestha
Ping Tan
DiffM
133
37
0
05 Oct 2023
LayoutGPT: Compositional Visual Planning and Generation with Large Language Models
Weixi Feng
Wanrong Zhu
Tsu-Jui Fu
Varun Jampani
Arjun Reddy Akula
Xuehai He
Sugato Basu
Xinze Wang
William Yang Wang
MLLM
86
179
0
24 May 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
432
4,656
0
30 Jan 2023
Flamingo: a Visual Language Model for Few-Shot Learning
Jean-Baptiste Alayrac
Jeff Donahue
Pauline Luc
Antoine Miech
Iain Barr
...
Mikolaj Binkowski
Ricardo Barreira
Oriol Vinyals
Andrew Zisserman
Karen Simonyan
MLLM
VLM
418
3,610
0
29 Apr 2022