Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2303.10868
Cited By
Retrieving Multimodal Information for Augmented Generation: A Survey
20 March 2023
Ruochen Zhao
Hailin Chen
Weishi Wang
Fangkai Jiao
Do Xuan Long
Chengwei Qin
Bosheng Ding
Xiaobao Guo
Minzhi Li
Xingxuan Li
Shafiq R. Joty
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Retrieving Multimodal Information for Augmented Generation: A Survey"
31 / 81 papers shown
Title
Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?
Wenhao Wu
Haipeng Luo
Bo Fang
Jingdong Wang
Wanli Ouyang
98
80
0
31 Dec 2022
Is GPT-3 a Good Data Annotator?
Bosheng Ding
Chengwei Qin
Linlin Liu
Yew Ken Chia
Shafiq R. Joty
Boyang Albert Li
Lidong Bing
26
233
0
20 Dec 2022
ComFact: A Benchmark for Linking Contextual Commonsense Knowledge
Silin Gao
Jena D. Hwang
Saya Kanno
Hiromi Wakaki
Yuki Mitsufuji
Antoine Bosselut
HILM
38
16
0
23 Oct 2022
Open-domain Question Answering via Chain of Reasoning over Heterogeneous Knowledge
Kaixin Ma
Hao Cheng
Xiaodong Liu
Eric Nyberg
Jianfeng Gao
LRM
141
15
0
22 Oct 2022
Z-LaVI: Zero-Shot Language Solver Fueled by Visual Imagination
Yue Yang
Wenlin Yao
Hongming Zhang
Xiaoyang Wang
Dong Yu
Jianshu Chen
VLM
39
22
0
21 Oct 2022
Retrieval Augmented Visual Question Answering with Outside Knowledge
Weizhe Lin
Bill Byrne
RALM
74
69
0
07 Oct 2022
Visualize Before You Write: Imagination-Guided Open-Ended Text Generation
Wanrong Zhu
An Yan
Yujie Lu
Wenda Xu
Qing Guo
Miguel P. Eckstein
William Yang Wang
82
37
0
07 Oct 2022
Binding Language Models in Symbolic Languages
Zhoujun Cheng
Tianbao Xie
Peng Shi
Chengzu Li
Rahul Nadkarni
...
Dragomir R. Radev
Mari Ostendorf
Luke Zettlemoyer
Noah A. Smith
Tao Yu
LMTD
122
198
0
06 Oct 2022
Re-Imagen: Retrieval-Augmented Text-to-Image Generator
Wenhu Chen
Hexiang Hu
Chitwan Saharia
William W. Cohen
VLM
125
161
0
29 Sep 2022
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
A. Kalyan
ELM
ReLM
LRM
211
1,106
0
20 Sep 2022
OPERA: Harmonizing Task-Oriented Dialogs and Information Seeking Experience
Miaoran Li
Baolin Peng
Jianfeng Gao
Zhu Zhang
69
9
0
24 Jun 2022
Fine-grained Image Captioning with CLIP Reward
Jaemin Cho
Seunghyun Yoon
Ajinkya Kale
Franck Dernoncourt
Trung Bui
Joey Tianyi Zhou
CLIP
131
76
0
26 May 2022
Autoformalization with Large Language Models
Yuhuai Wu
Albert Q. Jiang
Wenda Li
M. Rabe
Charles Staats
M. Jamnik
Christian Szegedy
AI4CE
110
157
0
25 May 2022
Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners
Zhenhailong Wang
Manling Li
Ruochen Xu
Luowei Zhou
Jie Lei
...
Chenguang Zhu
Derek Hoiem
Shih-Fu Chang
Joey Tianyi Zhou
Heng Ji
MLLM
VLM
170
137
0
22 May 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
319
11,953
0
04 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
389
8,495
0
28 Jan 2022
Relational Memory Augmented Language Models
Qi Liu
Dani Yogatama
Phil Blunsom
KELM
RALM
67
32
0
24 Jan 2022
Prix-LM: Pretraining for Multilingual Knowledge Base Construction
Wenxuan Zhou
Fangyu Liu
Ivan Vulić
Nigel Collier
Muhao Chen
KELM
72
18
0
16 Oct 2021
GlobalWoZ: Globalizing MultiWoZ to Develop Multilingual Task-Oriented Dialogue Systems
Bosheng Ding
Junjie Hu
Lidong Bing
Sharifah Aljunied Mahani
Shafiq R. Joty
Luo Si
C. Miao
45
41
0
14 Oct 2021
LFPT5: A Unified Framework for Lifelong Few-shot Language Learning Based on Prompt Tuning of T5
Chengwei Qin
Shafiq R. Joty
CLL
178
98
0
14 Oct 2021
Knowledge-Enhanced Evidence Retrieval for Counterargument Generation
Yohan Jo
Haneul Yoo
Jinyeong Bak
Alice H. Oh
Chris Reed
Eduard H. Hovy
RALM
40
12
0
19 Sep 2021
An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA
Zhengyuan Yang
Zhe Gan
Jianfeng Wang
Xiaowei Hu
Yumao Lu
Zicheng Liu
Lijuan Wang
180
402
0
10 Sep 2021
CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation
Yue Wang
Weishi Wang
Shafiq R. Joty
S. Hoi
238
1,489
0
02 Sep 2021
MusCaps: Generating Captions for Music Audio
Ilaria Manco
Emmanouil Benetos
Elio Quinton
Gyorgy Fazekas
30
36
0
24 Apr 2021
T2VLAD: Global-Local Sequence Alignment for Text-Video Retrieval
Xiaohan Wang
Linchao Zhu
Yi Yang
170
170
0
20 Apr 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
255
4,781
0
24 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
313
3,708
0
11 Feb 2021
BiST: Bi-directional Spatio-Temporal Reasoning for Video-Grounded Dialogues
Hung Le
Doyen Sahoo
Nancy F. Chen
S. Hoi
40
30
0
20 Oct 2020
Unified Vision-Language Pre-Training for Image Captioning and VQA
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
252
927
0
24 Sep 2019
K-BERT: Enabling Language Representation with Knowledge Graph
Weijie Liu
Peng Zhou
Zhe Zhao
Zhiruo Wang
Qi Ju
Haotang Deng
Ping Wang
231
778
0
17 Sep 2019
Retrieval-Based Neural Code Generation
Shirley Anugrah Hayati
R. Olivier
Pravalika Avvaru
Pengcheng Yin
A. Tomasic
Graham Neubig
134
110
0
29 Aug 2018
Previous
1
2