ResearchTrend.AI

Multi-modal Large Language Model Enhanced Pseudo 3D Perception Framework for Visual Commonsense Reasoning

30 January 2023
Jian Zhu
Hanli Wang
Miaojing Shi
    LRM

Papers citing "Multi-modal Large Language Model Enhanced Pseudo 3D Perception Framework for Visual Commonsense Reasoning"

3 / 3 papers shown
Visual and textual prompts for enhancing emotion recognition in video
Zhifeng Wang
Qixuan Zhang
Peter Zhang
Wenjia Niu
Kaihao Zhang
Ramesh Sankaranarayana
Sabrina Caldwell
Tom Gedeon
47
0
0
24 Apr 2025
LLM-EvRep: Learning an LLM-Compatible Event Representation Using a Self-Supervised Framework
Zongyou Yu
Qiang Qu
Qian Zhang
Nan Zhang
Xiaoming Chen
96
3
0
21 Feb 2025
MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding
Yue Cao
Yangzhou Liu
Zhe Chen
Guangchen Shi
Wenhai Wang
Danhuai Zhao
Tong Lu
51
5
0
15 Oct 2024