ResearchTrend.AI

Structural Information Guided Multimodal Pre-training for Vehicle-centric Perception

15 December 2023
Tianlin Li
Wentao Wu
Chenglong Li
Zhicheng Zhao
Zhe Chen
Yukai Shi
Jin Tang

Papers citing "Structural Information Guided Multimodal Pre-training for Vehicle-centric Perception"

14 / 14 papers shown
CM3AE: A Unified RGB Frame and Event-Voxel/-Frame Pre-training Framework
Wentao Wu
Xinyu Wang
Chenglong Li
Bo Jiang
Jin Tang
Bin Luo
Qi Liu
62
0
0
17 Apr 2025
Wav2CLIP: Learning Robust Audio Representations From CLIP
Ho-Hsiang Wu
Prem Seetharaman
Kundan Kumar
J. P. Bello
CLIP
VLM
85
268
0
21 Oct 2021
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding
Hu Xu
Gargi Ghosh
Po-Yao (Bernie) Huang
Dmytro Okhonko
Armen Aghajanyan
Florian Metze
Luke Zettlemoyer
Christoph Feichtenhofer
CLIP
VLM
298
567
0
28 Sep 2021
All You Can Embed: Natural Language based Vehicle Retrieval with Spatio-Temporal Transformers
Carmelo Scribano
D. Sapienza
Giorgia Franchini
M. Verucchi
Marko Bertogna
45
4
0
18 Jun 2021
BEiT: BERT Pre-Training of Image Transformers
Hangbo Bao
Li Dong
Songhao Piao
Furu Wei
ViT
184
2,790
0
15 Jun 2021
SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers
Enze Xie
Wenhai Wang
Zhiding Yu
Anima Anandkumar
J. Álvarez
Ping Luo
ViT
170
4,934
0
31 May 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
329
4,873
0
24 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
400
3,778
0
11 Feb 2021
TransReID: Transformer-based Object Re-Identification
Shuting He
Hao Luo
Pichao Wang
F. Wang
Hao Li
Wei Jiang
ViT
238
814
0
08 Feb 2021
Dense Contrastive Learning for Self-Supervised Visual Pre-Training
Xinlong Wang
Rufeng Zhang
Chunhua Shen
Tao Kong
Lei Li
SSL
65
679
0
18 Nov 2020
Attribute-guided Feature Learning Network for Vehicle Re-identification
Huibing Wang
Jinjia Peng
Dongyan Chen
Guangqi Jiang
Tongtong Zhao
Xianping Fu
43
85
0
12 Jan 2020
Momentum Contrast for Unsupervised Visual Representation Learning
Kaiming He
Haoqi Fan
Yuxin Wu
Saining Xie
Ross B. Girshick
SSL
128
12,007
0
13 Nov 2019
K-BERT: Enabling Language Representation with Knowledge Graph
Weijie Liu
Peng Zhou
Zhe Zhao
Zhiruo Wang
Qi Ju
Haotang Deng
Ping Wang
282
785
0
17 Sep 2019
Bag of Tricks and A Strong Baseline for Deep Person Re-identification
Hao Luo
Youzhi Gu
Xingyu Liao
Shenqi Lai
Wei Jiang
BDL
3DPC
136
1,170
0
17 Mar 2019