ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2312.06685
  4. Cited By
Causal-CoG: A Causal-Effect Look at Context Generation for Boosting
  Multi-modal Language Models

Causal-CoG: A Causal-Effect Look at Context Generation for Boosting Multi-modal Language Models

9 December 2023
Shitian Zhao
Zhuowan Li
Yadong Lu
Alan Yuille
Yan Wang
    LRM
ArXivPDFHTML

Papers citing "Causal-CoG: A Causal-Effect Look at Context Generation for Boosting Multi-modal Language Models"

11 / 11 papers shown
Title
ReForm-Eval: Evaluating Large Vision Language Models via Unified
  Re-Formulation of Task-Oriented Benchmarks
ReForm-Eval: Evaluating Large Vision Language Models via Unified Re-Formulation of Task-Oriented Benchmarks
Zejun Li
Ye Wang
Mengfei Du
Qingwen Liu
Binhao Wu
...
Zhihao Fan
Jie Fu
Jingjing Chen
Xuanjing Huang
Zhongyu Wei
61
15
0
04 Oct 2023
Graph of Thoughts: Solving Elaborate Problems with Large Language Models
Graph of Thoughts: Solving Elaborate Problems with Large Language Models
Maciej Besta
Nils Blach
Aleš Kubíček
Robert Gerstenberger
Michal Podstawski
...
Joanna Gajda
Tomasz Lehmann
H. Niewiadomski
Piotr Nyczyk
Torsten Hoefler
LRM
AI4CE
LM&Ro
105
665
0
18 Aug 2023
Learn to Explain: Multimodal Reasoning via Thought Chains for Science
  Question Answering
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
Ashwin Kalyan
ELM
ReLM
LRM
250
1,235
0
20 Sep 2022
Flamingo: a Visual Language Model for Few-Shot Learning
Flamingo: a Visual Language Model for Few-Shot Learning
Jean-Baptiste Alayrac
Jeff Donahue
Pauline Luc
Antoine Miech
Iain Barr
...
Mikolaj Binkowski
Ricardo Barreira
Oriol Vinyals
Andrew Zisserman
Karen Simonyan
MLLM
VLM
344
3,532
0
29 Apr 2022
Winoground: Probing Vision and Language Models for Visio-Linguistic
  Compositionality
Winoground: Probing Vision and Language Models for Visio-Linguistic Compositionality
Tristan Thrush
Ryan Jiang
Max Bartolo
Amanpreet Singh
Adina Williams
Douwe Kiela
Candace Ross
CoGe
87
424
0
07 Apr 2022
VLMo: Unified Vision-Language Pre-Training with
  Mixture-of-Modality-Experts
VLMo: Unified Vision-Language Pre-Training with Mixture-of-Modality-Experts
Hangbo Bao
Wenhui Wang
Li Dong
Qiang Liu
Owais Khan Mohammed
Kriti Aggarwal
Subhojit Som
Furu Wei
VLM
MLLM
MoE
69
553
0
03 Nov 2021
Align before Fuse: Vision and Language Representation Learning with
  Momentum Distillation
Align before Fuse: Vision and Language Representation Learning with Momentum Distillation
Junnan Li
Ramprasaath R. Selvaraju
Akhilesh Deepak Gotmare
Shafiq Joty
Caiming Xiong
Guosheng Lin
FaML
167
1,951
0
16 Jul 2021
Multimodal Few-Shot Learning with Frozen Language Models
Multimodal Few-Shot Learning with Frozen Language Models
Maria Tsimpoukelli
Jacob Menick
Serkan Cabi
S. M. Ali Eslami
Oriol Vinyals
Felix Hill
MLLM
149
777
0
25 Jun 2021
VisualGPT: Data-efficient Adaptation of Pretrained Language Models for
  Image Captioning
VisualGPT: Data-efficient Adaptation of Pretrained Language Models for Image Captioning
Jun Chen
Han Guo
Kai Yi
Boyang Albert Li
Mohamed Elhoseiny
VLM
75
226
0
20 Feb 2021
VizWiz Grand Challenge: Answering Visual Questions from Blind People
VizWiz Grand Challenge: Answering Visual Questions from Blind People
Danna Gurari
Qing Li
Abigale Stangl
Anhong Guo
Chi Lin
Kristen Grauman
Jiebo Luo
Jeffrey P. Bigham
CoGe
82
847
0
22 Feb 2018
Making the V in VQA Matter: Elevating the Role of Image Understanding in
  Visual Question Answering
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
Yash Goyal
Tejas Khot
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
322
3,224
0
02 Dec 2016
1