ResearchTrend.AI
Beyond First Impressions: Integrating Joint Multi-modal Cues for Comprehensive 3D Representation

6 August 2023
Haowei Wang
Jiji Tang
Jiayi Ji
Xiaoshuai Sun
Rongsheng Zhang
Yiwei Ma
Minda Zhao
Lincheng Li
Zeng Zhao
Tangjie Lv
Rongrong Ji
    3DV
arXiv: 2308.02982

Papers citing "Beyond First Impressions: Integrating Joint Multi-modal Cues for Comprehensive 3D Representation"

16 / 16 papers shown
PiSA: A Self-Augmented Data Engine and Training Strategy for 3D Understanding with Large Models
Zilu Guo
Hongbin Lin
Zhihao Yuan
C. Zheng
Pengshuo Qiu
Dongzhi Jiang
Renrui Zhang
Chun-Mei Feng
Zhen Li
MLLM
3DV
129
1
0
13 Mar 2025
Image Captioning via Dynamic Path Customization
Yiwei Ma
Jiayi Ji
Xiaoshuai Sun
Yiyi Zhou
Xiaopeng Hong
Yongjian Wu
Rongrong Ji
45
0
0
01 Jun 2024
When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models
Xianzheng Ma
Yash Bhalgat
Brandon Smart
Shuai Chen
Xinghui Li
...
Matthias Nießner
Ian D Reid
Angel X. Chang
Iro Laina
V. Prisacariu
LRM
42
13
0
16 May 2024
GPT4Point: A Unified Framework for Point-Language Understanding and Generation
Zhangyang Qi
Ye Fang
Zeyi Sun
Xiaoyang Wu
Tong Wu
Jiaqi Wang
Dahua Lin
Hengshuang Zhao
MLLM
92
36
0
05 Dec 2023
MV-CLIP: Multi-View CLIP for Zero-shot 3D Shape Recognition
Dan Song
Xinwei Fu
Weizhi Nie
Wenhui Li
Lanjun Wang
You Yang
Anan Liu
VLM
41
6
0
30 Nov 2023
Sculpting Holistic 3D Representation in Contrastive Language-Image-3D Pre-training
Yipeng Gao
Zeyu Wang
Wei-Shi Zheng
Cihang Xie
Yuyin Zhou
3DPC
39
9
0
03 Nov 2023
JM3D & JM3D-LLM: Elevating 3D Understanding with Joint Multi-modal Cues
Jiayi Ji
Haowei Wang
Changli Wu
Yiwei Ma
Xiaoshuai Sun
Rongrong Ji
83
1
0
14 Oct 2023
PointLLM: Empowering Large Language Models to Understand Point Clouds
Runsen Xu
Xiaolong Wang
Tai Wang
Yilun Chen
Jiangmiao Pang
Dahua Lin
MLLM
61
157
0
31 Aug 2023
Let Images Give You More: Point Cloud Cross-Modal Training for Shape Analysis
Xu Yan
Heshen Zhan
Chaoda Zheng
Jiantao Gao
Ruimao Zhang
Shuguang Cui
Zhen Li
3DPC
59
33
0
09 Oct 2022
Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training
Renrui Zhang
Ziyu Guo
Rongyao Fang
Bingyan Zhao
Dong Wang
Yu Qiao
Hongsheng Li
Peng Gao
3DPC
184
247
0
28 May 2022
PointCLIP: Point Cloud Understanding by CLIP
Renrui Zhang
Ziyu Guo
Wei Zhang
Kunchang Li
Xupeng Miao
Tengjiao Wang
Yu Qiao
Peng Gao
Hongsheng Li
VLM
3DPC
177
440
0
04 Dec 2021
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding
Hu Xu
Gargi Ghosh
Po-Yao (Bernie) Huang
Dmytro Okhonko
Armen Aghajanyan
Florian Metze
Luke Zettlemoyer
Christoph Feichtenhofer
CLIP
VLM
264
562
0
28 Sep 2021
MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin
N. Houlsby
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
...
Andreas Steiner
Daniel Keysers
Jakob Uszkoreit
Mario Lucic
Alexey Dosovitskiy
315
2,623
0
04 May 2021
CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval
Huaishao Luo
Lei Ji
Ming Zhong
Yang Chen
Wen Lei
Nan Duan
Tianrui Li
CLIP
VLM
337
791
0
18 Apr 2021
DensePoint: Learning Densely Contextual Representation for Efficient Point Cloud Processing
Yongcheng Liu
Bin Fan
Gaofeng Meng
Jiwen Lu
Shiming Xiang
Chunhong Pan
3DPC
128
271
0
09 Sep 2019
PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation
C. Qi
Hao Su
Kaichun Mo
Leonidas Guibas
3DH
3DPC
3DV
PINN
261
14,158
0
02 Dec 2016