arXiv: 2306.07279
Scalable 3D Captioning with Pretrained Models

12 June 2023
Tiange Luo
C. Rockwell
Honglak Lee
Justin Johnson

Papers citing "Scalable 3D Captioning with Pretrained Models"

35 / 35 papers shown
Anymate: A Dataset and Baselines for Learning 3D Object Rigging
Yufan Deng
Yuhao Zhang
Chen Geng
Shangzhe Wu
Jiajun Wu
3DH
55
0
0
09 May 2025
PlaceIt3D: Language-Guided Object Placement in Real 3D Scenes
Ahmed Abdelreheem
Filippo Aleotti
Jamie Watson
Z. Qureshi
Abdelrahman Eldesokey
Peter Wonka
Gabriel J. Brostow
Sara Vicente
Guillermo Garcia-Hernando
DiffM
59
0
0
08 May 2025
STP4D: Spatio-Temporal-Prompt Consistent Modeling for Text-to-4D Gaussian Splatting
Yunze Deng
Haijun Xiong
Bin Feng
Xueliang Wang
Wei Liu
3DGS
47
0
0
25 Apr 2025
OctGPT: Octree-based Multiscale Autoregressive Models for 3D Shape Generation
Si-Tong Wei
Rui-Huan Wang
Chuan-Zhi Zhou
Baoquan Chen
Peng-Shuai Wang
39
2
0
14 Apr 2025
Unveiling the Mist over 3D Vision-Language Understanding: Object-centric Evaluation with Chain-of-Analysis
J. Huang
Baoxiong Jia
Yansen Wang
Ziyu Zhu
Xiongkun Linghu
Qing Li
Song-Chun Zhu
Siyuan Huang
87
3
0
28 Mar 2025
TAR3D: Creating High-Quality 3D Assets via Next-Part Prediction
Xuying Zhang
Yutong Liu
Yangguang Li
Renrui Zhang
Yong Liu
...
Wanli Ouyang
Zhiwei Xiong
Peng Gao
Qibin Hou
Ming-Ming Cheng
127
3
0
13 Mar 2025
GenVDM: Generating Vector Displacement Maps From a Single Image
Yuezhi Yang
Qimin Chen
Vladimir G. Kim
S. Chaudhuri
Qixing Huang
Z. Chen
3DGS
VGen
29
1
0
01 Mar 2025
UniGS: Unified Language-Image-3D Pretraining with Gaussian Splatting
Haoyuan Li
Yanpeng Zhou
Tao Tang
Jifei Song
Yihan Zeng
Michael C. Kampffmeyer
Hang Xu
Xiaodan Liang
3DGS
67
1
0
25 Feb 2025
UVGS: Reimagining Unstructured 3D Gaussian Splatting using UV Mapping
Aashish Rai
Dilin Wang
Mihir Jain
N. Sarafianos
Arthur Chen
Srinath Sridhar
Aayush Prakash
3DGS
74
1
0
03 Feb 2025
DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation
Chenguo Lin
Panwang Pan
Bangbang Yang
Zeming Li
Yadong Mu
3DGS
76
7
0
28 Jan 2025
OneLLM: One Framework to Align All Modalities with Language
Jiaming Han
Kaixiong Gong
Yiyuan Zhang
Jiaqi Wang
Kaipeng Zhang
Dahua Lin
Yu Qiao
Peng Gao
Xiangyu Yue
MLLM
106
109
0
10 Jan 2025
Taming Feed-forward Reconstruction Models as Latent Encoders for 3D Generative Models
Suttisak Wizadwongsa
Jinfan Zhou
Edward Li
Jeong Joon Park
3DV
70
0
0
31 Dec 2024
SAR3D: Autoregressive 3D Object Generation and Understanding via Multi-scale 3D VQVAE
Yongwei Chen
Yushi Lan
Shangchen Zhou
Tengfei Wang
Xingang Pan
102
5
0
25 Nov 2024
Organizing Unstructured Image Collections using Natural Language
Mingxuan Liu
Zhun Zhong
Jun Li
Gianni Franchi
Subhankar Roy
Elisa Ricci
VLM
44
3
0
07 Oct 2024
Diffusion Models in 3D Vision: A Survey
Zhen Wang
Dongyuan Li
Renhe Jiang
Tianyu He
Jiang Bian
Renhe Jiang
MedIm
70
4
0
07 Oct 2024
Atlas Gaussians Diffusion for 3D Generation
Haitao Yang
Yuan Dong
Hanwen Jiang
Dejia Xu
Georgios Pavlakos
Qixing Huang
3DGS
81
3
0
23 Aug 2024
Deep Geometric Moments Promote Shape Consistency in Text-to-3D Generation
Utkarsh Nath
Rajeev Goel
Eun Som Jeon
Changhoon Kim
Kyle Min
Yezhou Yang
Yingzhen Yang
Pavan Turaga
51
1
0
12 Aug 2024
Duoduo CLIP: Efficient 3D Understanding with Multi-View Images
Han-Hung Lee
Yiming Zhang
Angel X. Chang
3DPC
48
3
0
17 Jun 2024
VP-LLM: Text-Driven 3D Volume Completion with Large Language Models through Patchification
Jianmeng Liu
Yichen Liu
Yuyao Zhang
Zeyuan Meng
Yu-Wing Tai
Chi-Keung Tang
49
0
0
08 Jun 2024
DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data
Qihao Liu
Yi Zhang
Song Bai
Adam Kortylewski
Alan Yuille
42
9
0
06 Jun 2024
Controllable Text-to-3D Generation via Surface-Aligned Gaussian Splatting
Zhiqi Li
Yiming Chen
Lingzhe Zhao
Peidong Liu
DiffM
3DGS
61
17
0
15 Mar 2024
SPAD : Spatially Aware Multiview Diffusers
Yash Kant
Ziyi Wu
Michael Vasilkovsky
Guocheng Qian
Jian Ren
R. A. Guler
Guohao Li
Sergey Tulyakov
Igor Gilitschenski
Aliaksandr Siarohin
DiffM
24
35
0
07 Feb 2024
Open-Universe Indoor Scene Generation using LLM Program Synthesis and Uncurated Object Databases
Rio Aguina-Kang
Maxim Gumin
Do Heon Han
Stewart Morris
Seung Jean Yoo
Aditya Ganeshan
R. K. Jones
Qiuhong Anna Wei
Kailiang Fu
Daniel E. Ritchie
3DV
50
24
0
05 Feb 2024
Uni3DL: Unified Model for 3D and Language Understanding
Xiang Li
Jian Ding
Zhaoyang Chen
Mohamed Elhoseiny
38
3
0
05 Dec 2023
LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning
Sijin Chen
Xin Chen
C. Zhang
Mingsheng Li
Gang Yu
Hao Fei
Erik Cambria
Jiayuan Fan
Tao Chen
MLLM
29
82
0
30 Nov 2023
DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model
Yinghao Xu
Hao Tan
Fujun Luan
Sai Bi
Peng Wang
...
Zifan Shi
Kalyan Sunkavalli
Gordon Wetzstein
Zexiang Xu
Kai Zhang
43
153
0
15 Nov 2023
Instant3D: Fast Text-to-3D with Sparse-View Generation and Large Reconstruction Model
Jiahao Li
Hao Tan
Kai Zhang
Zexiang Xu
Fujun Luan
Yinghao Xu
Yicong Hong
Kalyan Sunkavalli
Greg Shakhnarovich
Sai Bi
59
254
0
10 Nov 2023
3D-Aware Visual Question Answering about Parts, Poses and Occlusions
Xingrui Wang
Wufei Ma
Zhuowan Li
Adam Kortylewski
Alan Yuille
CoGe
27
12
0
27 Oct 2023
Shap-E: Generating Conditional 3D Implicit Functions
Heewoo Jun
Alex Nichol
DiffM
203
310
0
03 May 2023
3DGen: Triplane Latent Diffusion for Textured Mesh Generation
Anchit Gupta
Wenhan Xiong
Yixin Nie
Ian Jones
Barlas Oğuz
DiffM
103
157
0
09 Mar 2023
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
Steven C. H. Hoi
MLLM
BDL
VLM
CLIP
392
4,154
0
28 Jan 2022
ABO: Dataset and Benchmarks for Real-World 3D Object Understanding
Jasmine Collins
Shubham Goel
Kenan Deng
Achleshwar Luthra
Leon L. Xu
...
T. F. Y. Vicente
T. Dideriksen
H. Arora
M. Guillaumin
Jitendra Malik
154
218
0
12 Oct 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
255
4,796
0
24 Feb 2021
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo
P. Sharma
Nan Ding
Radu Soricut
VLM
299
1,084
0
17 Feb 2021
Neural Baby Talk
Jiasen Lu
Jianwei Yang
Dhruv Batra
Devi Parikh
VLM
200
434
0
27 Mar 2018