Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
5 June 2023
Hang Zhang
Xin Li
Lidong Bing
    MLLM

Papers citing "Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding"

25 / 875 papers shown
Fine-grained Audio-Visual Joint Representations for Multimodal Large Language Models
Guangzhi Sun
Wenyi Yu
Changli Tang
Xianzhao Chen
Tian Tan
Wei Li
Lu Lu
Zejun Ma
Chao Zhang
192
14
0
09 Oct 2023
LanguageMPC: Large Language Models as Decision Makers for Autonomous Driving
Hao Sha
Yao Mu
Yuxuan Jiang
Li Chen
Chenfeng Xu
Ping Luo
Shengbo Eben Li
Masayoshi Tomizuka
Wei Zhan
Mingyu Ding
533
213
0
04 Oct 2023
AnyMAL: An Efficient and Scalable Any-Modality Augmented Language Model
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Avamarie Brueggeman
Andrea Madotto
Mohammad Kachuee
Tushar Nagarajan
Matt Smith
...
Peyman Heidari
Yue Liu
Kavya Srinet
Babak Damavandi
Anuj Kumar
MLLM
250
109
0
27 Sep 2023
Knowledge-Guided Short-Context Action Anticipation in Human-Centric Videos
Sarthak Bhagat
Simon Stepputtis
Joseph Campbell
Katia Sycara
170
4
0
12 Sep 2023
Large Content And Behavior Models To Understand, Simulate, And Optimize Content And Behavior
International Conference on Learning Representations (ICLR), 2023
Ashmit Khandelwal
Aditya Agrawal
Aanisha Bhattacharyya
Yaman Kumar Singla
Somesh Singh
...
Ishita Dasgupta
Stefano Petrangeli
R. Shah
Changyou Chen
Balaji Krishnamurthy
271
10
0
01 Sep 2023
FashionLOGO: Prompting Multimodal Large Language Models for Fashion Logo Embeddings
International Conference on Information and Knowledge Management (CIKM), 2023
Yulin Su
Min Yang
Minghui Qiu
Jing Wang
Tao Wang
VLM
156
2
0
17 Aug 2023
Chat-3D: Data-efficiently Tuning Large Language Model for Universal Dialogue of 3D Scenes
Zehan Wang
Haifeng Huang
Yang Zhao
Ziang Zhang
Zhou Zhao
219
104
0
17 Aug 2023
OctoPack: Instruction Tuning Code Large Language Models
International Conference on Learning Representations (ICLR), 2023
Niklas Muennighoff
Qian Liu
A. Zebaze
Qinkai Zheng
Binyuan Hui
Terry Yue Zhuo
Swayam Singh
Xiangru Tang
Leandro von Werra
Shayne Longpre
VLM, ALM
287
176
0
14 Aug 2023
BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs
Yang Zhao
Zhijie Lin
Daquan Zhou
Zilong Huang
Jiashi Feng
Bingyi Kang
MLLM
166
123
0
17 Jul 2023
SVIT: Scaling up Visual Instruction Tuning
Bo Zhao
Boya Wu
Muyang He
Tiejun Huang
MLLM
245
153
0
09 Jul 2023
Exploring and Characterizing Large Language Models For Embedded System Development and Debugging
Zachary Englhardt
Rong-Hua Li
Dilini Nissanka
Zhihan Zhang
Girish Narayanswamy
Joseph Breda
Xin Liu
Shwetak N. Patel
Vikram Iyer
189
32
0
07 Jul 2023
What Matters in Training a GPT4-Style Language Model with Multimodal Inputs?
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Yan Zeng
Hanbo Zhang
Jiani Zheng
Jiangnan Xia
Guoqiang Wei
Yang Wei
Yuchen Zhang
Tao Kong
MLLM
226
88
0
05 Jul 2023
A Survey on Multimodal Large Language Models
National Science Review (NSR), 2023
Xinglong Mao
Chaoyou Fu
Zhengye Zhang
Ke Li
Xing Sun
Tong Xu
Enhong Chen
MLLM, LRM
349
915
0
23 Jun 2023
LLMVA-GEBC: Large Language Model with Video Adapter for Generic Event Boundary Captioning
Yunlong Tang
Jinrui Zhang
Xiangchen Wang
Teng Wang
Feng Zheng
VLM
168
10
0
17 Jun 2023
Valley: Video Assistant with Large Language model Enhanced abilitY
Ruipu Luo
Ziwang Zhao
Min Yang
Junwei Dong
Da Li
Pengcheng Lu
Tao Wang
Linmei Hu
Ming-Hui Qiu
MLLM
331
246
0
12 Jun 2023
Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Muhammad Maaz
H. Rasheed
Salman Khan
Fahad Shahbaz Khan
MLLM
305
913
0
08 Jun 2023
Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Dataset for Pre-training and Benchmarks
Haiyang Xu
Qinghao Ye
Xuan-Wei Wu
Mingshi Yan
Yuan Miao
...
Qingfang Qian
Maofei Que
Ji Zhang
Xiaoyan Zeng
Feiyan Huang
VLM, MLLM
137
34
0
07 Jun 2023
Has It All Been Solved? Open NLP Research Questions Not Solved by Large Language Models
International Conference on Language Resources and Evaluation (LREC), 2023
Oana Ignat
Zhijing Jin
Artem Abzaliev
Laura Biester
Santiago Castro
...
Verónica Pérez-Rosas
Siqi Shen
Zekun Wang
Winston Wu
Amélie Reymond
LRM
272
8
0
21 May 2023
VideoChat: Chat-Centric Video Understanding
Kunchang Li
Yinan He
Yi Wang
Yizhuo Li
Wen Wang
Ping Luo
Yali Wang
Limin Wang
Yu Qiao
MLLM
303
765
0
10 May 2023
Enhancing Chain-of-Thoughts Prompting with Iterative Bootstrapping in Large Language Models
Jiashuo Sun
Yi Luo
Yeyun Gong
Chen Lin
Yelong Shen
Jian Guo
Nan Duan
LRM
288
27
0
23 Apr 2023
MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models
International Conference on Learning Representations (ICLR), 2023
Deyao Zhu
Jun Chen
Xiaoqian Shen
Xiang Li
Mohamed Elhoseiny
VLM, MLLM
368
2,617
0
20 Apr 2023
LASER: A Neuro-Symbolic Framework for Learning Spatial-Temporal Scene Graphs with Weak Supervision
International Conference on Learning Representations (ICLR), 2023
Jiani Huang
Ziyang Li
Mayur Naik
Ser-Nam Lim
492
9
0
15 Apr 2023
Retrieving Multimodal Information for Augmented Generation: A Survey
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Ruochen Zhao
Hailin Chen
Weishi Wang
Fangkai Jiao
Do Xuan Long
...
Bosheng Ding
Xiaobao Guo
Minzhi Li
Xingxuan Li
Shafiq Joty
346
120
0
20 Mar 2023
Beyond Grounding: Extracting Fine-Grained Event Hierarchies Across Modalities
AAAI Conference on Artificial Intelligence (AAAI), 2022
Hammad A. Ayyubi
Christopher Thomas
Lovish Chum
R. Lokesh
Long Chen
...
Xudong Lin
Xuande Feng
Jaywon Koo
Sounak Ray
Shih-Fu Chang
AI4TS
201
0
0
14 Jun 2022
Explainable Artificial Intelligence for Autonomous Driving: A Comprehensive Overview and Field Guide for Future Research Directions
IEEE Access (IEEE Access), 2021
Shahin Atakishiyev
Mohammad Salameh
Hengshuai Yao
Randy Goebel
457
184
0
21 Dec 2021