2312.14465
Cited By
FM-OV3D: Foundation Model-based Cross-modal Knowledge Blending for Open-Vocabulary 3D Detection
22 December 2023
Dongmei Zhang
Chang Li
Ray Zhang
Shenghao Xie
Wei Xue
Xiaodong Xie
Shanghang Zhang
VLM
Github (7★)
Papers citing
"FM-OV3D: Foundation Model-based Cross-modal Knowledge Blending for Open-Vocabulary 3D Detection"
20 / 20 papers shown
OpenAD: Open-World Autonomous Driving Benchmark for 3D Object Detection
Zhongyu Xia
Jishuo Li
Zhiwei Lin
Xinhao Wang
Yansen Wang
Ming-Hsuan Yang
VLM
127
3
0
26 Nov 2024
Embedded Visual Prompt Tuning
Wenqiang Zu
Shenghao Xie
Qing Zhao
Guoqi Li
Lei Ma
VLM
MedIm
121
10
0
01 Jul 2024
Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following
Ziyu Guo
Renrui Zhang
Xiangyang Zhu
Yiwen Tang
Xianzheng Ma
...
Ke Chen
Peng Gao
Xianzhi Li
Hongsheng Li
Pheng-Ann Heng
MLLM
82
145
0
01 Sep 2023
Personalize Segment Anything Model with One Shot
Renrui Zhang
Zhengkai Jiang
Ziyu Guo
Shilin Yan
Junting Pan
Xianzheng Ma
Hao Dong
Peng Gao
Hongsheng Li
MLLM
VLM
102
219
0
04 May 2023
When SAM Meets Medical Images: An Investigation of Segment Anything Model (SAM) on Multi-phase Liver Tumor Segmentation
Chuanfei Hu
Tianyi Xia
Shenghong Ju
Xinde Li
MedIm
VLM
63
75
0
17 Apr 2023
SAMM (Segment Any Medical Model): A 3D Slicer Integration to SAM
Yihao Liu
Jiaming Zhang
Zhangcong She
Amir Kheradmand
Mehran Armand
VLM
64
40
0
12 Apr 2023
Open-Vocabulary Point-Cloud Object Detection without 3D Annotation
Yuheng Lu
Chenfeng Xu
Xi Wei
Xiaodong Xie
Masayoshi Tomizuka
Kurt Keutzer
Shanghang Zhang
3DPC
79
56
0
03 Apr 2023
Unified Text Structuralization with Instruction-tuned Language Models
Xuanfan Ni
Piji Li
Huayang Li
77
13
0
27 Mar 2023
Learning 3D Representations from 2D Pre-trained Models via Image-to-Point Masked Autoencoders
Renrui Zhang
Liuhui Wang
Yu Qiao
Peng Gao
Hongsheng Li
3DPC
79
134
0
13 Dec 2022
PointCLIP V2: Prompting CLIP and GPT for Powerful 3D Open-world Learning
Xiangyang Zhu
Renrui Zhang
Bowei He
Ziyu Guo
Ziyao Zeng
Zipeng Qin
Shanghang Zhang
Peng Gao
VLM
70
145
0
21 Nov 2022
Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP
Feng Liang
Bichen Wu
Xiaoliang Dai
Kunpeng Li
Yinan Zhao
Hang Zhang
Peizhao Zhang
Peter Vajda
Diana Marculescu
CLIP
VLM
102
457
0
09 Oct 2022
Detecting Twenty-thousand Classes using Image-level Supervision
Xingyi Zhou
Rohit Girdhar
Armand Joulin
Phillip Krahenbuhl
Ishan Misra
CLIP
VLM
106
617
0
07 Jan 2022
An End-to-End Transformer Model for 3D Object Detection
Ishan Misra
Rohit Girdhar
Armand Joulin
3DPC
ViT
99
486
0
16 Sep 2021
Image2Point: 3D Point-Cloud Understanding with 2D Image Pretrained Models
Chenfeng Xu
Shijia Yang
Tomer Galanti
Bichen Wu
Xiangyu Yue
Bohan Zhai
Wei Zhan
Peter Vajda
Kurt Keutzer
Masayoshi Tomizuka
3DPC
46
54
0
08 Jun 2021
Open-Vocabulary Object Detection Using Captions
Alireza Zareian
Kevin Dela Rosa
Derek Hao Hu
Shih-Fu Chang
VLM
ObjD
130
433
0
20 Nov 2020
H3DNet: 3D Object Detection Using Hybrid Geometric Primitives
Zaiwei Zhang
Bo Sun
Haitao Yang
Qi-Xing Huang
3DPC
80
200
0
10 Jun 2020
LVIS: A Dataset for Large Vocabulary Instance Segmentation
Agrim Gupta
Piotr Dollár
Ross B. Girshick
ISeg
VLM
105
1,379
0
08 Aug 2019
Deep Hough Voting for 3D Object Detection in Point Clouds
C. Qi
Or Litany
Kaiming He
Leonidas Guibas
3DPC
108
1,290
0
21 Apr 2019
ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes
Angela Dai
Angel X. Chang
Manolis Savva
Maciej Halber
Thomas Funkhouser
Matthias Nießner
3DPC
3DV
492
4,081
0
14 Feb 2017
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
1.7K
39,595
0
01 Sep 2014