ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2305.13267
  4. Cited By
Enhance Reasoning Ability of Visual-Language Models via Large Language
  Models

Enhance Reasoning Ability of Visual-Language Models via Large Language Models

22 May 2023
Yueting Yang
Xintong Zhang
Wenjuan Han
    VLM
    ReLM
    LRM
ArXivPDFHTML

Papers citing "Enhance Reasoning Ability of Visual-Language Models via Large Language Models"

18 / 18 papers shown
Title
MM-REACT: Prompting ChatGPT for Multimodal Reasoning and Action
MM-REACT: Prompting ChatGPT for Multimodal Reasoning and Action
Zhengyuan Yang
Linjie Li
Jianfeng Wang
Kevin Qinghong Lin
E. Azarnasab
Faisal Ahmed
Zicheng Liu
Ce Liu
Michael Zeng
Lijuan Wang
ReLM
KELM
LRM
85
383
0
20 Mar 2023
In-Context Learning with Many Demonstration Examples
In-Context Learning with Many Demonstration Examples
Mukai Li
Shansan Gong
Jiangtao Feng
Yiheng Xu
Jinchao Zhang
Zhiyong Wu
Lingpeng Kong
80
38
0
09 Feb 2023
Teaching Small Language Models to Reason
Teaching Small Language Models to Reason
Lucie Charlotte Magister
Jonathan Mallinson
Jakub Adamek
Eric Malmi
Aliaksei Severyn
LRM
AI4CE
ReLM
158
268
0
16 Dec 2022
Automatic Chain of Thought Prompting in Large Language Models
Automatic Chain of Thought Prompting in Large Language Models
Zhuosheng Zhang
Aston Zhang
Mu Li
Alexander J. Smola
ReLM
LRM
141
618
0
07 Oct 2022
MaPLe: Multi-modal Prompt Learning
MaPLe: Multi-modal Prompt Learning
Muhammad Uzair Khattak
H. Rasheed
Muhammad Maaz
Salman Khan
Fahad Shahbaz Khan
VPVLM
VLM
251
565
0
06 Oct 2022
What Can Transformers Learn In-Context? A Case Study of Simple Function
  Classes
What Can Transformers Learn In-Context? A Case Study of Simple Function Classes
Shivam Garg
Dimitris Tsipras
Percy Liang
Gregory Valiant
118
505
0
01 Aug 2022
Rationale-Augmented Ensembles in Language Models
Rationale-Augmented Ensembles in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Denny Zhou
ReLM
LRM
85
126
0
02 Jul 2022
A-OKVQA: A Benchmark for Visual Question Answering using World Knowledge
A-OKVQA: A Benchmark for Visual Question Answering using World Knowledge
Dustin Schwenk
Apoorv Khandelwal
Christopher Clark
Kenneth Marino
Roozbeh Mottaghi
58
536
0
03 Jun 2022
Flamingo: a Visual Language Model for Few-Shot Learning
Flamingo: a Visual Language Model for Few-Shot Learning
Jean-Baptiste Alayrac
Jeff Donahue
Pauline Luc
Antoine Miech
Iain Barr
...
Mikolaj Binkowski
Ricardo Barreira
Oriol Vinyals
Andrew Zisserman
Karen Simonyan
MLLM
VLM
349
3,532
0
29 Apr 2022
Multi-Stage Prompting for Knowledgeable Dialogue Generation
Multi-Stage Prompting for Knowledgeable Dialogue Generation
Zihan Liu
M. Patwary
R. Prenger
Shrimai Prabhumoye
Ming-Yu Liu
Mohammad Shoeybi
Bryan Catanzaro
41
50
0
16 Mar 2022
Conditional Prompt Learning for Vision-Language Models
Conditional Prompt Learning for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VLM
CLIP
VPVLM
103
1,348
0
10 Mar 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified
  Vision-Language Understanding and Generation
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
Guosheng Lin
MLLM
BDL
VLM
CLIP
501
4,340
0
28 Jan 2022
MetaICL: Learning to Learn In Context
MetaICL: Learning to Learn In Context
Sewon Min
M. Lewis
Luke Zettlemoyer
Hannaneh Hajishirzi
LRM
197
488
0
29 Oct 2021
Learning to Prompt for Vision-Language Models
Learning to Prompt for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM
CLIP
VLM
464
2,394
0
02 Sep 2021
SimVLM: Simple Visual Language Model Pretraining with Weak Supervision
SimVLM: Simple Visual Language Model Pretraining with Weak Supervision
Zirui Wang
Jiahui Yu
Adams Wei Yu
Zihang Dai
Yulia Tsvetkov
Yuan Cao
VLM
MLLM
114
796
0
24 Aug 2021
XGPT: Cross-modal Generative Pre-Training for Image Captioning
XGPT: Cross-modal Generative Pre-Training for Image Captioning
Qiaolin Xia
Haoyang Huang
Nan Duan
Dongdong Zhang
Lei Ji
Zhifang Sui
Edward Cui
Taroon Bharti
Xin Liu
Ming Zhou
MLLM
VLM
70
75
0
03 Mar 2020
Unified Vision-Language Pre-Training for Image Captioning and VQA
Unified Vision-Language Pre-Training for Image Captioning and VQA
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
339
937
0
24 Sep 2019
Making the V in VQA Matter: Elevating the Role of Image Understanding in
  Visual Question Answering
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
Yash Goyal
Tejas Khot
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
322
3,235
0
02 Dec 2016
1