2403.08919
CLIP-BEVFormer: Enhancing Multi-View Image-Based BEV Detector with Ground Truth Flow
13 March 2024
Chenbin Pan
Burhaneddin Yaman
Senem Velipasalar
Liu Ren
Papers citing
"CLIP-BEVFormer: Enhancing Multi-View Image-Based BEV Detector with Ground Truth Flow"
13 / 13 papers shown
CLIP meets DINO for Tuning Zero-Shot Classifier using Unlabeled Image Collections
Mohamed Fazli Mohamed Imam
Rufael Fedaku Marew
Jameel Hassan
M. Fiaz
Alham Fikri Aji
Hisham Cholakkal
VLM
187
0
0
28 Nov 2024
Unveiling the Black Box: Independent Functional Module Evaluation for Bird's-Eye-View Perception Model
Ludan Zhang
Xiaokang Ding
Yuqi Dai
Lei He
Keqiang Li
27
0
0
18 Sep 2024
MaskBEV: Towards A Unified Framework for BEV Detection and Map Segmentation
Xiao Zhao
Xukun Zhang
Dingkang Yang
Mingyang Sun
Mingcheng Li
Shunli Wang
Lihua Zhang
MoE
42
1
0
17 Aug 2024
MonoDETRNext: Next-generation Accurate and Efficient Monocular 3D Object Detection Method
Pan Liao
Feng Yang
Di Wu
Liu Bo
34
0
0
24 May 2024
Feature Map Convergence Evaluation for Functional Module
Ludan Zhang
Chaoyi Chen
Lei He
Keqiang Li
35
2
0
07 May 2024
VAD: Vectorized Scene Representation for Efficient Autonomous Driving
Bo Jiang
Shaoyu Chen
Qing Xu
Bencheng Liao
Jiajie Chen
Helong Zhou
Qian Zhang
Wenyu Liu
Chang Huang
Xinggang Wang
110
194
0
21 Mar 2023
TransFuser: Imitation with Transformer-Based Sensor Fusion for Autonomous Driving
Kashyap Chitta
Aditya Prakash
Bernhard Jaeger
Zehao Yu
Katrin Renz
Andreas Geiger
ViT
104
295
0
31 May 2022
Tip-Adapter: Training-free CLIP-Adapter for Better Vision-Language Modeling
Renrui Zhang
Rongyao Fang
Wei Zhang
Peng Gao
Kunchang Li
Jifeng Dai
Yu Qiao
Hongsheng Li
VLM
192
385
0
06 Nov 2021
ActionCLIP: A New Paradigm for Video Action Recognition
Mengmeng Wang
Jiazheng Xing
Yong Liu
VLM
152
362
0
17 Sep 2021
Learning to Prompt for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM
CLIP
VLM
345
2,271
0
02 Sep 2021
FIERY: Future Instance Prediction in Bird's-Eye View from Surround Monocular Cameras
Anthony Hu
Zak Murez
Nikhil C. Mohan
Sofía Dudas
Jeffrey Hawke
Vijay Badrinarayanan
R. Cipolla
Alex Kendall
142
254
0
21 Apr 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
304
3,708
0
11 Feb 2021
Feature Pyramid Networks for Object Detection
Tsung-Yi Lin
Piotr Dollár
Ross B. Girshick
Kaiming He
Bharath Hariharan
Serge J. Belongie
ObjD
183
21,819
0
09 Dec 2016