2301.13335
Cited By
Multi-modal Large Language Model Enhanced Pseudo 3D Perception Framework for Visual Commonsense Reasoning
30 January 2023
Jian Zhu, Hanli Wang, Miaojing Shi
LRM
Papers citing
"Multi-modal Large Language Model Enhanced Pseudo 3D Perception Framework for Visual Commonsense Reasoning"
3 / 3 papers shown
Visual and textual prompts for enhancing emotion recognition in video
Zhifeng Wang, Qixuan Zhang, Peter Zhang, Wenjia Niu, Kaihao Zhang, Ramesh Sankaranarayana, Sabrina Caldwell, Tom Gedeon
47
0
0
24 Apr 2025
LLM-EvRep: Learning an LLM-Compatible Event Representation Using a Self-Supervised Framework
Zongyou Yu, Qiang Qu, Qian Zhang, Nan Zhang, Xiaoming Chen
96
3
0
21 Feb 2025
MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding
Yue Cao, Yangzhou Liu, Zhe Chen, Guangchen Shi, Wenhai Wang, Danhuai Zhao, Tong Lu
51
5
0
15 Oct 2024