arXiv:2306.02858, v4 (latest)
Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
5 June 2023
Hang Zhang
Xin Li
Lidong Bing
MLLM
Papers citing "Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding"
19 / 669 papers shown
Title
Chat-3D: Data-efficiently Tuning Large Language Model for Universal Dialogue of 3D Scenes
Zehan Wang
Haifeng Huang
Yang Zhao
Ziang Zhang
Zhou Zhao
191
103
0
17 Aug 2023
OctoPack: Instruction Tuning Code Large Language Models
International Conference on Learning Representations (ICLR), 2023
Niklas Muennighoff
Qian Liu
A. Zebaze
Qinkai Zheng
Binyuan Hui
Terry Yue Zhuo
Swayam Singh
Xiangru Tang
Leandro von Werra
Shayne Longpre
VLM
ALM
283
175
0
14 Aug 2023
BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs
Yang Zhao
Zhijie Lin
Daquan Zhou
Zilong Huang
Jiashi Feng
Bingyi Kang
MLLM
154
123
0
17 Jul 2023
SVIT: Scaling up Visual Instruction Tuning
Bo Zhao
Boya Wu
Muyang He
Tiejun Huang
MLLM
241
153
0
09 Jul 2023
Exploring and Characterizing Large Language Models For Embedded System Development and Debugging
Zachary Englhardt
Rong-Hua Li
Dilini Nissanka
Zhihan Zhang
Girish Narayanswamy
Joseph Breda
Xin Liu
Shwetak N. Patel
Vikram Iyer
189
31
0
07 Jul 2023
What Matters in Training a GPT4-Style Language Model with Multimodal Inputs?
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Yan Zeng
Hanbo Zhang
Jiani Zheng
Jiangnan Xia
Guoqiang Wei
Yang Wei
Yuchen Zhang
Tao Kong
MLLM
226
88
0
05 Jul 2023
A Survey on Multimodal Large Language Models
National Science Review (NSR), 2023
Xinglong Mao
Chaoyou Fu
Zhengye Zhang
Ke Li
Xing Sun
Tong Xu
Enhong Chen
MLLM
LRM
345
909
0
23 Jun 2023
LLMVA-GEBC: Large Language Model with Video Adapter for Generic Event Boundary Captioning
Yunlong Tang
Jinrui Zhang
Xiangchen Wang
Teng Wang
Feng Zheng
VLM
156
10
0
17 Jun 2023
Valley: Video Assistant with Large Language model Enhanced abilitY
Ruipu Luo
Ziwang Zhao
Min Yang
Junwei Dong
Da Li
Pengcheng Lu
Tao Wang
Linmei Hu
Ming-Hui Qiu
MLLM
327
246
0
12 Jun 2023
Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Muhammad Maaz
H. Rasheed
Salman Khan
Fahad Shahbaz Khan
MLLM
301
909
0
08 Jun 2023
Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Dataset for Pre-training and Benchmarks
Haiyang Xu
Qinghao Ye
Xuan-Wei Wu
Mingshi Yan
Yuan Miao
...
Qingfang Qian
Maofei Que
Ji Zhang
Xiaoyan Zeng
Feiyan Huang
VLM
MLLM
133
34
0
07 Jun 2023
Has It All Been Solved? Open NLP Research Questions Not Solved by Large Language Models
International Conference on Language Resources and Evaluation (LREC), 2023
Oana Ignat
Zhijing Jin
Artem Abzaliev
Laura Biester
Santiago Castro
...
Verónica Pérez-Rosas
Siqi Shen
Zekun Wang
Winston Wu
Amélie Reymond
LRM
272
8
0
21 May 2023
VideoChat: Chat-Centric Video Understanding
Kunchang Li
Yinan He
Yi Wang
Yizhuo Li
Wen Wang
Ping Luo
Yali Wang
Limin Wang
Yu Qiao
MLLM
303
764
0
10 May 2023
Enhancing Chain-of-Thoughts Prompting with Iterative Bootstrapping in Large Language Models
Jiashuo Sun
Yi Luo
Yeyun Gong
Chen Lin
Yelong Shen
Jian Guo
Nan Duan
LRM
288
26
0
23 Apr 2023
MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models
International Conference on Learning Representations (ICLR), 2023
Deyao Zhu
Jun Chen
Xiaoqian Shen
Xiang Li
Mohamed Elhoseiny
VLM
MLLM
368
2,605
0
20 Apr 2023
LASER: A Neuro-Symbolic Framework for Learning Spatial-Temporal Scene Graphs with Weak Supervision
International Conference on Learning Representations (ICLR), 2023
Jiani Huang
Ziyang Li
Mayur Naik
Ser-Nam Lim
484
9
0
15 Apr 2023
Retrieving Multimodal Information for Augmented Generation: A Survey
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Ruochen Zhao
Hailin Chen
Weishi Wang
Fangkai Jiao
Do Xuan Long
...
Bosheng Ding
Xiaobao Guo
Minzhi Li
Xingxuan Li
Shafiq Joty
302
119
0
20 Mar 2023
Beyond Grounding: Extracting Fine-Grained Event Hierarchies Across Modalities
AAAI Conference on Artificial Intelligence (AAAI), 2022
Hammad A. Ayyubi
Christopher Thomas
Lovish Chum
R. Lokesh
Long Chen
...
Xudong Lin
Xuande Feng
Jaywon Koo
Sounak Ray
Shih-Fu Chang
AI4TS
201
0
0
14 Jun 2022
Explainable Artificial Intelligence for Autonomous Driving: A Comprehensive Overview and Field Guide for Future Research Directions
IEEE Access (IEEE Access), 2021
Shahin Atakishiyev
Mohammad Salameh
Hengshuai Yao
Randy Goebel
433
183
0
21 Dec 2021