ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2312.10908
  4. Cited By
CLOVA: A Closed-Loop Visual Assistant with Tool Usage and Update

CLOVA: A Closed-Loop Visual Assistant with Tool Usage and Update

18 December 2023
Zhi Gao
Yuntao Du
Xintong Zhang
Xiaojian Ma
Wenjuan Han
Song-Chun Zhu
Qing Li
    LLMAG
    VLM
ArXivPDFHTML

Papers citing "CLOVA: A Closed-Loop Visual Assistant with Tool Usage and Update"

28 / 28 papers shown
Title
Iterative Tool Usage Exploration for Multimodal Agents via Step-wise Preference Tuning
Iterative Tool Usage Exploration for Multimodal Agents via Step-wise Preference Tuning
Pengxiang Li
Zhi Gao
Bofei Zhang
Yapeng Mi
Xiaojian Ma
...
Tao Yuan
Yuwei Wu
Yunde Jia
Song-Chun Zhu
Qing Li
LLMAG
70
0
0
30 Apr 2025
Multimodal Agricultural Agent Architecture (MA3): A New Paradigm for Intelligent Agricultural Decision-Making
Multimodal Agricultural Agent Architecture (MA3): A New Paradigm for Intelligent Agricultural Decision-Making
Zhuoning Xu
Jian Xu
M. Zhang
P. Wang
Chao Deng
Cheng-Lin Liu
28
0
0
07 Apr 2025
Building LLM Agents by Incorporating Insights from Computer Systems
Building LLM Agents by Incorporating Insights from Computer Systems
Yapeng Mi
Zhi Gao
Xiaojian Ma
Qing Li
LLMAG
44
0
0
06 Apr 2025
CoSTA∗\ast∗: Cost-Sensitive Toolpath Agent for Multi-turn Image Editing
Advait Gupta
NandaKiran Velaga
Dang Nguyen
Dinesh Manocha
DiffM
66
0
0
13 Mar 2025
Towards Top-Down Reasoning: An Explainable Multi-Agent Approach for Visual Question Answering
Towards Top-Down Reasoning: An Explainable Multi-Agent Approach for Visual Question Answering
Zeqing Wang
Wentao Wan
Qiqing Lao
Runmeng Chen
Minjie Lang
Keze Wang
Liang Lin
Liang Lin
LRM
103
3
0
17 Feb 2025
Visual Large Language Models for Generalized and Specialized Applications
Yifan Li
Zhixin Lai
Wentao Bao
Zhen Tan
Anh Dao
Kewei Sui
Jiayi Shen
Dong Liu
Huan Liu
Yu Kong
VLM
88
11
0
06 Jan 2025
CATP-LLM: Empowering Large Language Models for Cost-Aware Tool Planning
CATP-LLM: Empowering Large Language Models for Cost-Aware Tool Planning
Duo Wu
Yufei Guo
Yuan Meng
Yanning Zhang
Le Sun
Zhi Wang
189
0
0
25 Nov 2024
Generalist Virtual Agents: A Survey on Autonomous Agents Across Digital Platforms
Minghe Gao
Wendong Bu
Bingchen Miao
Yang Wu
Yunfei Li
Juncheng Billy Li
Siliang Tang
Qi Wu
Yueting Zhuang
Meng Wang
LM&Ro
45
3
0
17 Nov 2024
From Exploration to Mastery: Enabling LLMs to Master Tools via Self-Driven Interactions
From Exploration to Mastery: Enabling LLMs to Master Tools via Self-Driven Interactions
Changle Qu
Sunhao Dai
Xiaochi Wei
Hengyi Cai
Shuaiqiang Wang
Dawei Yin
Jun Xu
Ji-Rong Wen
60
9
0
10 Oct 2024
Collaborative Evolving Strategy for Automatic Data-Centric Development
Collaborative Evolving Strategy for Automatic Data-Centric Development
Xu Yang
Haotian Chen
Wenjun Feng
Haoxue Wang
Zeqi Ye
Xinjie Shen
Xiao Yang
Shizhao Sun
Weiqing Liu
Jiang Bian
32
2
0
26 Jul 2024
FIRE: A Dataset for Feedback Integration and Refinement Evaluation of
  Multimodal Models
FIRE: A Dataset for Feedback Integration and Refinement Evaluation of Multimodal Models
Pengxiang Li
Zhi Gao
Bofei Zhang
Tao Yuan
Yuwei Wu
Mehrtash Harandi
Yunde Jia
Song-Chun Zhu
Qing Li
VLM
MLLM
48
3
0
16 Jul 2024
UltraEdit: Instruction-based Fine-Grained Image Editing at Scale
UltraEdit: Instruction-based Fine-Grained Image Editing at Scale
Haozhe Zhao
Xiaojian Ma
Liang Chen
Shuzheng Si
Rujie Wu
Kaikai An
Peiyu Yu
Minjia Zhang
Qing Li
Baobao Chang
36
43
0
07 Jul 2024
Tool Learning with Large Language Models: A Survey
Tool Learning with Large Language Models: A Survey
Changle Qu
Sunhao Dai
Xiaochi Wei
Hengyi Cai
Shuaiqiang Wang
Dawei Yin
Jun Xu
Jirong Wen
LLMAG
34
80
0
28 May 2024
UDKAG: Augmenting Large Vision-Language Models with Up-to-Date Knowledge
UDKAG: Augmenting Large Vision-Language Models with Up-to-Date Knowledge
Chuanhao Li
Zhen Li
Chenchen Jing
Shuo Liu
Wenqi Shao
Yuwei Wu
Ping Luo
Yu Qiao
Kaipeng Zhang
ELM
31
3
0
23 May 2024
VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding
VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding
Yue Fan
Xiaojian Ma
Rujie Wu
Yuntao Du
Jiaqi Li
Zhi Gao
Qing Li
VLM
LLMAG
46
55
0
18 Mar 2024
Budget-Constrained Tool Learning with Planning
Budget-Constrained Tool Learning with Planning
Yuanhang Zheng
Peng Li
Mingshi Yan
Ji Zhang
Fei Huang
Yang Liu
37
3
0
25 Feb 2024
Large Multimodal Agents: A Survey
Large Multimodal Agents: A Survey
Junlin Xie
Zhihong Chen
Ruifei Zhang
Xiang Wan
Guanbin Li
LM&Ro
LLMAG
37
39
0
23 Feb 2024
MM-LLMs: Recent Advances in MultiModal Large Language Models
MM-LLMs: Recent Advances in MultiModal Large Language Models
Duzhen Zhang
Yahan Yu
Jiahua Dong
Chenxing Li
Dan Su
Chenhui Chu
Dong Yu
OffRL
LRM
52
179
0
24 Jan 2024
ControlLLM: Augment Language Models with Tools by Searching on Graphs
ControlLLM: Augment Language Models with Tools by Searching on Graphs
Zhaoyang Liu
Zeqiang Lai
Zhangwei Gao
Erfei Cui
Ziheng Li
...
Lewei Lu
Qifeng Chen
Yu Qiao
Jifeng Dai
Wenhai Wang
MLLM
126
30
0
26 Oct 2023
GPT-4 Doesn't Know It's Wrong: An Analysis of Iterative Prompting for
  Reasoning Problems
GPT-4 Doesn't Know It's Wrong: An Analysis of Iterative Prompting for Reasoning Problems
Kaya Stechly
Matthew Marquez
Subbarao Kambhampati
LRM
165
84
0
19 Oct 2023
Self-RAG: Learning to Retrieve, Generate, and Critique through
  Self-Reflection
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
Akari Asai
Zeqiu Wu
Yizhong Wang
Avirup Sil
Hannaneh Hajishirzi
RALM
165
631
0
17 Oct 2023
Generative Agents: Interactive Simulacra of Human Behavior
Generative Agents: Interactive Simulacra of Human Behavior
J. Park
Joseph C. O'Brien
Carrie J. Cai
Meredith Ringel Morris
Percy Liang
Michael S. Bernstein
LM&Ro
AI4CE
232
1,742
0
07 Apr 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image
  Encoders and Large Language Models
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
275
4,244
0
30 Jan 2023
MaPLe: Multi-modal Prompt Learning
MaPLe: Multi-modal Prompt Learning
Muhammad Uzair Khattak
H. Rasheed
Muhammad Maaz
Salman Khan
F. Khan
VPVLM
VLM
206
531
0
06 Oct 2022
GLM-130B: An Open Bilingual Pre-trained Model
GLM-130B: An Open Bilingual Pre-trained Model
Aohan Zeng
Xiao Liu
Zhengxiao Du
Zihan Wang
Hanyu Lai
...
Jidong Zhai
Wenguang Chen
Peng-Zhen Zhang
Yuxiao Dong
Jie Tang
BDL
LRM
250
1,073
0
05 Oct 2022
What does a platypus look like? Generating customized prompts for
  zero-shot image classification
What does a platypus look like? Generating customized prompts for zero-shot image classification
Sarah M Pratt
Ian Covert
Rosanne Liu
Ali Farhadi
VLM
131
212
0
07 Sep 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified
  Vision-Language Understanding and Generation
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
392
4,137
0
28 Jan 2022
Learning to Prompt for Vision-Language Models
Learning to Prompt for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM
CLIP
VLM
348
2,271
0
02 Sep 2021
1