2411.15714
Cited By
ROOT: VLM based System for Indoor Scene Understanding and Beyond
24 November 2024
Yonghui Wang
Shi-Yong Chen
Zhenxing Zhou
Siyi Li
Haoran Li
Wengang Zhou
Haoyang Li
VLM
Papers citing
"ROOT: VLM based System for Indoor Scene Understanding and Beyond"
14 / 14 papers shown
Title
Enhancing Screen Time Identification in Children with a Multi-View Vision Language Model and Screen Time Tracker
Xinlong Hou
Sen Shen
Xueshen Li
Xinran Gao
Ziyi Huang
Steven J. Holiday
Matthew R. Cribbet
Susan W. White
Edward Sazonov
Yu Gan
60
0
0
02 Oct 2024
No More Ambiguity in 360° Room Layout via Bi-Layout Estimation
Yu-Ju Tsai
Jin-Cheng Jhang
Jingjing Zheng
Wei Wang
Albert Y. C. Chen
Min Sun
Cheng-Hao Kuo
Ming-Hsuan Yang
3DV
56
4
0
15 Apr 2024
EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
Tai Wang
Xiaohan Mao
Chenming Zhu
Runsen Xu
Ruiyuan Lyu
...
Tianfan Xue
Xihui Liu
Cewu Lu
Dahua Lin
Jiangmiao Pang
LM&Ro
59
68
0
26 Dec 2023
PolyDiffuse: Polygonal Shape Reconstruction via Guided Set Diffusion Models
Jiacheng Chen
Ruizhi Deng
Yasutaka Furukawa
DiffM
75
24
0
02 Jun 2023
GPT-4 Technical Report
OpenAI
Josh Achiam
Steven Adler
Sandhini Agarwal
Lama Ahmad
...
Shengjia Zhao
Tianhao Zheng
Juntang Zhuang
William Zhuk
Barret Zoph
LLMAG
MLLM
531
13,788
0
15 Mar 2023
Semantic Abstraction: Open-World 3D Scene Understanding from 2D Vision-Language Models
Huy Ha
Shuran Song
LM&Ro
VLM
69
103
0
23 Jul 2022
Mutual Scene Synthesis for Mixed Reality Telepresence
Mohammad Keshavarzi
Michael Zollhoefer
An Yang
Patrick Peluse
Luisa Caldas
32
11
0
01 Apr 2022
Human-Aware Object Placement for Visual Environment Reconstruction
Hongwei Yi
C. Huang
Dimitrios Tzionas
Muhammed Kocabas
Mohamed Hassan
Siyu Tang
Justus Thies
Michael J. Black
3DH
54
63
0
07 Mar 2022
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
673
28,659
0
26 Feb 2021
Comprehensive Image Captioning via Scene Graph Decomposition
Yiwu Zhong
Liwei Wang
Jianshu Chen
Dong Yu
Yin Li
97
125
0
23 Jul 2020
Graph-Structured Referring Expression Reasoning in The Wild
Sibei Yang
Guanbin Li
Yizhou Yu
NAI
43
92
0
19 Apr 2020
Seamless Scene Segmentation
Lorenzo Porzi
Samuel Rota Buló
Aleksander Colovic
Peter Kontschieder
SSeg
112
209
0
03 May 2019
AI2-THOR: An Interactive 3D Environment for Visual AI
Eric Kolve
Roozbeh Mottaghi
Winson Han
Eli VanderBilt
Luca Weihs
...
Daniel Gordon
Yuke Zhu
Aniruddha Kembhavi
Abhinav Gupta
Ali Farhadi
LM&Ro
35
1,091
0
14 Dec 2017
The Cityscapes Dataset for Semantic Urban Scene Understanding
Marius Cordts
Mohamed Omran
Sebastian Ramos
Timo Rehfeld
Markus Enzweiler
Rodrigo Benenson
Uwe Franke
Stefan Roth
Bernt Schiele
669
11,540
0
06 Apr 2016