ResearchTrend.AI
  • Papers
  • Communities
  • Organizations
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2501.02765
  4. Cited By

Visual Large Language Models for Generalized and Specialized Applications

6 January 2025
Yifan Li
Zhixin Lai
Wentao Bao
Zhen Tan
Anh Dao
Kewei Sui
Jiayi Shen
Dong Liu
Huan Liu
Yu Kong
    VLM
ArXiv (abs)PDFHTML

Papers citing "Visual Large Language Models for Generalized and Specialized Applications"

24 / 24 papers shown
Title
A Comprehensive Survey on Video Scene Parsing:Advances, Challenges, and Prospects
A Comprehensive Survey on Video Scene Parsing:Advances, Challenges, and Prospects
Guohuan Xie
Syed Ariff Syed Hesham
Wenya Guo
Bing Li
Ming-Ming Cheng
Guolei Sun
Yun-Hai Liu
39
0
0
16 Jun 2025
Chain-of-Thought for Autonomous Driving: A Comprehensive Survey and Future Prospects
Chain-of-Thought for Autonomous Driving: A Comprehensive Survey and Future Prospects
Yixin Cui
Haotian Lin
Shuo Yang
Yixiao Wang
Yanjun Huang
Hong Chen
LM&RoLRMELM
143
1
0
26 May 2025
Cloud-Based AI Systems: Leveraging Large Language Models for Intelligent Fault Detection and Autonomous Self-Healing
Cloud-Based AI Systems: Leveraging Large Language Models for Intelligent Fault Detection and Autonomous Self-Healing
Cheng Ji
Huaiying Luo
100
7
0
16 May 2025
A Grounded Memory System For Smart Personal Assistants
A Grounded Memory System For Smart Personal Assistants
Felix Ocker
J. Deigmöller
Pavel Smirnov
Julian Eggert
83
0
0
09 May 2025
Vision-Language-Action Models: Concepts, Progress, Applications and Challenges
Vision-Language-Action Models: Concepts, Progress, Applications and Challenges
Ranjan Sapkota
Yang Cao
Konstantinos I. Roumeliotis
Manoj Karkee
LM&Ro
424
6
0
07 May 2025
DyGEnc: Encoding a Sequence of Textual Scene Graphs to Reason and Answer Questions in Dynamic Scenes
DyGEnc: Encoding a Sequence of Textual Scene Graphs to Reason and Answer Questions in Dynamic Scenes
S. Linok
Vadim Semenov
Anastasia Trunova
Oleg Bulichev
Dmitry A. Yudin
119
0
0
06 May 2025
FaceInsight: A Multimodal Large Language Model for Face Perception
FaceInsight: A Multimodal Large Language Model for Face Perception
Jingzhi Li
Changjiang Luo
Ruoyu Chen
Hua Zhang
Wenqi Ren
Jianhou Gan
Xiaochun Cao
CVBMLRM
167
0
0
22 Apr 2025
Window Token Concatenation for Efficient Visual Large Language Models
Window Token Concatenation for Efficient Visual Large Language Models
Yifan Li
Wentao Bao
Botao Ye
Zhen Tan
Tianlong Chen
Huan Liu
Yu Kong
VLM
111
0
0
05 Apr 2025
Lost in Cultural Translation: Do LLMs Struggle with Math Across Cultural Contexts?
Lost in Cultural Translation: Do LLMs Struggle with Math Across Cultural Contexts?
Aabid Karim
Abdul Karim
Bhoomika Lohana
Matt Keon
Jaswinder Singh
A. Sattar
112
2
0
23 Mar 2025
Advancing Problem-Based Learning in Biomedical Engineering in the Era of Generative AI
Advancing Problem-Based Learning in Biomedical Engineering in the Era of Generative AI
Micky C. Nnamdi
J. Ben Tamo
Wenqi Shi
M. D. Wang
AI4CE
91
0
0
20 Mar 2025
A Review on Large Language Models for Visual Analytics
A Review on Large Language Models for Visual Analytics
Navya Sonal Agarwal
Sanjay Kumar Sonbhadra
125
2
0
19 Mar 2025
DriveLM: Driving with Graph Visual Question Answering
DriveLM: Driving with Graph Visual Question Answering
Chonghao Sima
Katrin Renz
Kashyap Chitta
Lawrence Yunliang Chen
Hanxue Zhang
Chengen Xie
Jens Beißwenger
Ping Luo
Andreas Geiger
Hongyang Li
339
228
0
17 Jan 2025
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks
Jiannan Wu
Muyan Zhong
Sen Xing
Zeqiang Lai
Zhaoyang Liu
...
Lewei Lu
Tong Lu
Ping Luo
Yu Qiao
Jifeng Dai
MLLMVLMLRM
376
69
0
03 Jan 2025
Improving Medical Large Vision-Language Models with Abnormal-Aware Feedback
Improving Medical Large Vision-Language Models with Abnormal-Aware Feedback
Yucheng Zhou
Lingran Song
Jianbing Shen
106
9
0
02 Jan 2025
Interpreting Object-level Foundation Models via Visual Precision Search
Interpreting Object-level Foundation Models via Visual Precision Search
Ruoyu Chen
Siyuan Liang
Jingzhi Li
Shiming Liu
Maosen Li
Zheng Huang
Qichuan Geng
Xiaochun Cao
FAtt
243
5
0
25 Nov 2024
AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agents
AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agents
Ke Yang
Yao Liu
Sapana Chaudhary
Rasool Fakoor
Pratik Chaudhari
George Karypis
Huzefa Rangwala
LLMAGLM&Ro
201
32
0
17 Oct 2024
Towards Interpreting Visual Information Processing in Vision-Language Models
Towards Interpreting Visual Information Processing in Vision-Language Models
Clement Neo
Luke Ong
Philip Torr
Mor Geva
David M. Krueger
Fazl Barez
167
20
0
09 Oct 2024
Aria: An Open Multimodal Native Mixture-of-Experts Model
Aria: An Open Multimodal Native Mixture-of-Experts Model
Dongxu Li
Yudong Liu
Haoning Wu
Yue Wang
Zhiqi Shen
...
Lihuan Zhang
Hanshu Yan
Guoyin Wang
Bei Chen
Junnan Li
MoE
172
75
0
08 Oct 2024
A Survey on Vision-Language-Action Models for Embodied AI
A Survey on Vision-Language-Action Models for Embodied AI
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
368
61
0
23 May 2024
Hallucination is Inevitable: An Innate Limitation of Large Language Models
Hallucination is Inevitable: An Innate Limitation of Large Language Models
Ziwei Xu
Sanjay Jain
Mohan S. Kankanhalli
HILMLRM
193
277
0
22 Jan 2024
LanguageMPC: Large Language Models as Decision Makers for Autonomous Driving
LanguageMPC: Large Language Models as Decision Makers for Autonomous Driving
Hao Sha
Yao Mu
Yuxuan Jiang
Li Chen
Chenfeng Xu
Ping Luo
Shengbo Eben Li
Masayoshi Tomizuka
Wei Zhan
Mingyu Ding
275
184
0
04 Oct 2023
Language Prompt for Autonomous Driving
Language Prompt for Autonomous Driving
Dongming Wu
Wencheng Han
Tiancai Wang
Yingfei Liu
Cheng-zhong Xu
Jianbing Shen
Jianbing Shen
VLM
140
89
0
08 Sep 2023
GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest
GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest
Shilong Zhang
Pei Sun
Shoufa Chen
Min Xiao
Wenqi Shao
Wenwei Zhang
Yu Liu
Kai-xiang Chen
Ping Luo
MLLMVLM
193
251
0
07 Jul 2023
Valley: Video Assistant with Large Language model Enhanced abilitY
Valley: Video Assistant with Large Language model Enhanced abilitY
Ruipu Luo
Ziwang Zhao
Min Yang
Junwei Dong
Da Li
Pengcheng Lu
Tao Wang
Linmei Hu
Ming-Hui Qiu
MLLM
154
211
0
12 Jun 2023
1