MetaSpatial: Reinforcing 3D Spatial Reasoning in VLMs for the Metaverse

24 March 2025
Zhenyu Pan
Han Liu
OffRL LRM

Papers citing "MetaSpatial: Reinforcing 3D Spatial Reasoning in VLMs for the Metaverse"

15 / 15 papers shown
Unveiling the Compositional Ability Gap in Vision-Language Reasoning Model
Tianle Li
Jihai Zhang
Yongming Rao
Yu Cheng
CoGe LRM VLM
79
0
0
26 May 2025
Reinforcement Fine-Tuning Powers Reasoning Capability of Multimodal Large Language Models
Haoyuan Sun
Jiaqi Wu
Bo Xia
Yifu Luo
Yifei Zhao
Kai Qin
Xufei Lv
Tiantian Zhang
Yongzhe Chang
Xueqian Wang
OffRL LRM
204
0
0
24 May 2025
Chain-of-Zoom: Extreme Super-Resolution via Scale Autoregression and Preference Alignment
Bryan S Kim
Jeongsol Kim
Jong Chul Ye
67
0
0
24 May 2025
Reinforced MLLM: A Survey on RL-Based Reasoning in Multimodal Large Language Models
Guanghao Zhou
Panjia Qiu
Chong Chen
Jiadong Wang
Zheming Yang
Jian Xu
Minghui Qiu
OffRL LRM
187
8
0
30 Apr 2025
GUI-R1: A Generalist R1-Style Vision-Language Action Model For GUI Agents
Run Luo
Lu Wang
Wanwei He
Xiaobo Xia
LLMAG
179
35
0
14 Apr 2025
AutoSpatial: Visual-Language Reasoning for Social Robot Navigation through Efficient Spatial Reasoning Learning
Yangzhe Kong
Daeun Song
Jing Liang
Dinesh Manocha
Ziyu Yao
Xuesu Xiao
LRM
124
1
0
10 Mar 2025
Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models
Zhenyu Pan
Haozheng Luo
Manling Li
Han Liu
LRM
120
17
0
24 Feb 2025
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
DeepSeek-AI
Daya Guo
Dejian Yang
Haowei Zhang
Junxiao Song
...
Shiyu Wang
S. Yu
Shunfeng Zhou
Shuting Pan
S.S. Li
ReLM VLM OffRL AI4TS LRM
380
2,000
0
22 Jan 2025
Do Code LLMs Understand Design Patterns?
Zhenyu Pan
Xuefeng Song
Yunkun Wang
Rongyu Cao
Binhua Li
Yongqian Li
Han Liu
69
3
0
10 Jan 2025
Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding
Yunze Man
Shuhong Zheng
Zhipeng Bao
M. Hebert
Liang-Yan Gui
Yu-Xiong Wang
121
23
0
05 Sep 2024
I-Design: Personalized LLM Interior Designer
Ata Çelen
Guo Han
Konrad Schindler
Luc Van Gool
Iro Armeni
Anton Obukhov
Xi Wang
3DV
102
23
0
03 Apr 2024
Ctrl-Room: Controllable Text-to-3D Room Meshes Generation with Layout Constraints
Chuan Fang
Yuan Dong
Kunming Luo
Xiaotao Hu
Rakesh Shrestha
Ping Tan
DiffM
133
37
0
05 Oct 2023
LayoutGPT: Compositional Visual Planning and Generation with Large Language Models
Weixi Feng
Wanrong Zhu
Tsu-Jui Fu
Varun Jampani
Arjun Reddy Akula
Xuehai He
Sugato Basu
Xin Eric Wang
William Yang Wang
MLLM
86
179
0
24 May 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM MLLM
432
4,656
0
30 Jan 2023
Flamingo: a Visual Language Model for Few-Shot Learning
Jean-Baptiste Alayrac
Jeff Donahue
Pauline Luc
Antoine Miech
Iain Barr
...
Mikolaj Binkowski
Ricardo Barreira
Oriol Vinyals
Andrew Zisserman
Karen Simonyan
MLLM VLM
418
3,610
0
29 Apr 2022