2303.16199
LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention
28 March 2023
Renrui Zhang
Jiaming Han
Chris Liu
Peng Gao
Aojun Zhou
Xiangfei Hu
Shilin Yan
Pan Lu
Hongsheng Li
Yu Qiao
MLLM
Papers citing
"LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention"
50 / 588 papers shown
Title
Multimodal Reasoning with Multimodal Knowledge Graph
Junlin Lee
Yequan Wang
Jing Li
Min Zhang
44
15
0
04 Jun 2024
PlanAgent: A Multi-modal Large Language Agent for Closed-loop Vehicle Motion Planning
Yupeng Zheng
Zebin Xing
Qichao Zhang
Bu Jin
Pengfei Li
...
Zhongpu Xia
Kun Zhan
Xianpeng Lang
Yaran Chen
Dongbin Zhao
LM&Ro
LRM
LLMAG
62
14
0
03 Jun 2024
Artemis: Towards Referential Understanding in Complex Videos
Jihao Qiu
Yuan Zhang
Xi Tang
Lingxi Xie
Tianren Ma
Pengyu Yan
David Doermann
Qixiang Ye
Yunjie Tian
VLM
VGen
44
8
0
01 Jun 2024
Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
Chaoyou Fu
Yuhan Dai
Yongdong Luo
Lei Li
Shuhuai Ren
...
Tong Xu
Xiawu Zheng
Enhong Chen
Rongrong Ji
Xing Sun
VLM
MLLM
50
302
0
31 May 2024
InsightSee: Advancing Multi-agent Vision-Language Models for Enhanced Visual Understanding
Huaxiang Zhang
Yaojia Mu
Guo-Niu Zhu
Zhongxue Gan
43
2
0
31 May 2024
Visual Perception by Large Language Model's Weights
Feipeng Ma
Hongwei Xue
Guangting Wang
Yizhou Zhou
Fengyun Rao
Shilin Yan
Yueyi Zhang
Siying Wu
Mike Zheng Shou
Xiaoyan Sun
VLM
25
5
0
30 May 2024
One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models
Yutao Zhu
Zhaoheng Huang
Zhicheng Dou
Ji-Rong Wen
RALM
56
5
0
30 May 2024
Why Larger Language Models Do In-context Learning Differently?
Zhenmei Shi
Junyi Wei
Zhuoyan Xu
Yingyu Liang
37
18
0
30 May 2024
Dr-LLaVA: Visual Instruction Tuning with Symbolic Clinical Grounding
Shenghuan Sun
Gregory M. Goldgof
Alexander Schubert
Zhiqing Sun
Thomas Hartvigsen
A. Butte
Ahmed Alaa
LM&MA
42
4
0
29 May 2024
Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning
Yixiao Zhang
Yukara Ikemiya
Woosung Choi
Naoki Murata
Marco A. Martínez-Ramírez
Liwei Lin
Gus Xia
Wei-Hsiang Liao
Yuki Mitsufuji
Simon Dixon
57
10
0
28 May 2024
The Evolution of Multimodal Model Architectures
S. Wadekar
Abhishek Chaurasia
Aman Chadha
Eugenio Culurciello
43
14
0
28 May 2024
FlashST: A Simple and Universal Prompt-Tuning Framework for Traffic Prediction
Zhonghang Li
Lianghao Xia
Yong-mei Xu
Chao Huang
OOD
AI4TS
50
11
0
28 May 2024
Visual Anchors Are Strong Information Aggregators For Multimodal Large Language Model
Haogeng Liu
Quanzeng You
Xiaotian Han
Yongfei Liu
Huaibo Huang
Ran He
Hongxia Yang
33
2
0
28 May 2024
LoRA-Switch: Boosting the Efficiency of Dynamic LLM Adapters via System-Algorithm Co-design
Rui Kong
Qiyang Li
Xinyu Fang
Qingtian Feng
Qingfeng He
Yazhu Dong
Weijun Wang
Yuanchun Li
Linghe Kong
Yunxin Liu
MoE
38
4
0
28 May 2024
Instruct-ReID++: Towards Universal Purpose Instruction-Guided Person Re-identification
Weizhen He
Yiheng Deng
Yunfeng Yan
Feng Zhu
Yizhou Wang
Lei Bai
Qingsong Xie
Donglian Qi
Wanli Ouyang
Shixiang Tang
95
2
0
28 May 2024
Adapting Pre-Trained Vision Models for Novel Instance Detection and Segmentation
Ya Lu
Jishnu Jaykumar
Yunhui Guo
Nicholas Ruozzi
Yu Xiang
VLM
ISeg
58
4
0
28 May 2024
Matryoshka Multimodal Models
Mu Cai
Jianwei Yang
Jianfeng Gao
Yong Jae Lee
VLM
50
25
0
27 May 2024
Hawk: Learning to Understand Open-World Video Anomalies
Jiaqi Tang
Hao Lu
Ruizheng Wu
Xiaogang Xu
Ke Ma
Cheng Fang
Bin Guo
Jiangbo Lu
Qifeng Chen
Ying-Cong Chen
VLM
40
9
0
27 May 2024
Implicit Multimodal Alignment: On the Generalization of Frozen LLMs to Multimodal Inputs
Mustafa Shukor
Matthieu Cord
68
5
0
26 May 2024
M$^3$CoT: A Novel Benchmark for Multi-Domain Multi-step Multi-modal Chain-of-Thought
Qiguang Chen
Libo Qin
Jin Zhang
Zhi Chen
Xiao Xu
Wanxiang Che
LRM
37
35
0
26 May 2024
OmniBind: Teach to Build Unequal-Scale Modality Interaction for Omni-Bind of All
Yuanhuiyi Lyu
Xueye Zheng
Dahun Kim
Lin Wang
51
13
0
25 May 2024
Prompt-Aware Adapter: Towards Learning Adaptive Visual Tokens for Multimodal Large Language Models
Yue Zhang
Hehe Fan
Yi Yang
53
3
0
24 May 2024
Agentic Skill Discovery
Xufeng Zhao
C. Weber
Stefan Wermter
LM&Ro
LLMAG
LRM
34
2
0
23 May 2024
Unveiling the Tapestry of Consistency in Large Vision-Language Models
Yuan Zhang
Fei Xiao
Tao Huang
Chun-Kai Fan
Hongyuan Dong
Jiawen Li
Jiacong Wang
Kuan Cheng
Shanghang Zhang
Haoyuan Guo
40
7
0
23 May 2024
A Survey on Vision-Language-Action Models for Embodied AI
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
82
42
0
23 May 2024
Dense Connector for MLLMs
Huanjin Yao
Wenhao Wu
Taojiannan Yang
Yuxin Song
Mengxi Zhang
Haocheng Feng
Yifan Sun
Zhiheng Li
Wanli Ouyang
Jingdong Wang
MLLM
VLM
42
16
0
22 May 2024
DEGAP: Dual Event-Guided Adaptive Prefixes for Templated-Based Event Argument Extraction with Slot Querying
Guanghui Wang
Dexi Liu
Jian-Yun Nie
Qizhi Wan
Rong Hu
Xiping Liu
Wanlong Liu
Jiaming Liu
95
0
0
22 May 2024
Exploring Ordinality in Text Classification: A Comparative Study of Explicit and Implicit Techniques
Siva Rajesh Kasa
Aniket Goel
Karan Gupta
Sumegh Roychowdhury
Anish Bhanushali
Nikhil Pattisapu
Prasanna Srinivasa Murthy
41
1
0
20 May 2024
Towards Modular LLMs by Building and Reusing a Library of LoRAs
O. Ostapenko
Zhan Su
E. Ponti
Laurent Charlin
Nicolas Le Roux
Matheus Pereira
Lucas Caccia
Alessandro Sordoni
MoMe
44
31
0
18 May 2024
Libra: Building Decoupled Vision System on Large Language Models
Yifan Xu
Xiaoshan Yang
Y. Song
Changsheng Xu
MLLM
VLM
43
6
0
16 May 2024
Listen Again and Choose the Right Answer: A New Paradigm for Automatic Speech Recognition with Large Language Models
Yuchen Hu
Chen Chen
Chengwei Qin
Qiushi Zhu
E. Chng
Ruizhe Li
AuLLM
KELM
49
5
0
16 May 2024
Enhancing Semantics in Multimodal Chain of Thought via Soft Negative Sampling
Guangmin Zheng
Jin Wang
Xiaobing Zhou
Xuejie Zhang
LRM
38
2
0
16 May 2024
Incorporating Clinical Guidelines through Adapting Multi-modal Large Language Model for Prostate Cancer PI-RADS Scoring
Tiantian Zhang
Manxi Lin
Hongda Guo
Xiaofan Zhang
Ka Fung Peter Chiu
Aasa Feragen
Qi Dou
37
1
0
14 May 2024
FreeVA: Offline MLLM as Training-Free Video Assistant
Wenhao Wu
VLM
OffRL
40
19
0
13 May 2024
Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers
Peng Gao
Le Zhuo
Ziyi Lin
Ruoyi Du
Xu Luo
...
Weicai Ye
He Tong
Jingwen He
Yu Qiao
Hongsheng Li
VGen
37
83
0
09 May 2024
Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning
Shibo Jie
Yehui Tang
Ning Ding
Zhi-Hong Deng
Kai Han
Yunhe Wang
VLM
33
6
0
09 May 2024
Sign2GPT: Leveraging Large Language Models for Gloss-Free Sign Language Translation
Ryan Wong
Necati Cihan Camgöz
Richard Bowden
SLR
51
21
0
07 May 2024
Random Masking Finds Winning Tickets for Parameter Efficient Fine-tuning
Jing Xu
Jingzhao Zhang
39
7
0
04 May 2024
LLM as Dataset Analyst: Subpopulation Structure Discovery with Large Language Model
Yulin Luo
Ruichuan An
Bocheng Zou
Yiming Tang
Jiaming Liu
Shanghang Zhang
41
13
0
03 May 2024
A Survey of Time Series Foundation Models: Generalizing Time Series Representation with Large Language Model
Weiqi Zhang
Jiexia Ye
Ke Yi
Yongzi Yu
Ziyue Li
Jia Li
Fugee Tsung
AI4TS
AI4CE
45
22
0
03 May 2024
SonicDiffusion: Audio-Driven Image Generation and Editing with Pretrained Diffusion Models
Burak Can Biner
Farrin Marouf Sofian
Umur Berkay Karakaş
Duygu Ceylan
Erkut Erdem
Aykut Erdem
23
7
0
01 May 2024
LLMParser: An Exploratory Study on Using Large Language Models for Log Parsing
Zeyang Ma
A. Chen
Dong Jae Kim
Tse-Hsun Chen
Shaowei Wang
27
45
0
27 Apr 2024
Instance-free Text to Point Cloud Localization with Relative Position Awareness
Lichao Wang
Zhihao Yuan
Jinke Ren
Shuguang Cui
Zhen Li
44
0
0
27 Apr 2024
MovieChat+: Question-aware Sparse Memory for Long Video Question Answering
Enxin Song
Wenhao Chai
Tianbo Ye
Jenq-Neng Hwang
Xi Li
Gaoang Wang
VLM
MLLM
37
30
0
26 Apr 2024
PLLaVA: Parameter-free LLaVA Extension from Images to Videos for Video Dense Captioning
Lin Xu
Yilin Zhao
Daquan Zhou
Zhijie Lin
See Kiong Ng
Jiashi Feng
MLLM
VLM
38
159
0
25 Apr 2024
How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites
Zhe Chen
Weiyun Wang
Hao Tian
Shenglong Ye
Zhangwei Gao
...
Tong Lu
Dahua Lin
Yu Qiao
Jifeng Dai
Wenhai Wang
MLLM
VLM
49
533
0
25 Apr 2024
Efficiency in Focus: LayerNorm as a Catalyst for Fine-tuning Medical Visual Language Pre-trained Models
Jiawei Chen
Dingkang Yang
Yue Jiang
Mingcheng Li
Jinjie Wei
Xiaolu Hou
Lihua Zhang
56
6
0
25 Apr 2024
Cantor: Inspiring Multimodal Chain-of-Thought of MLLM
Timin Gao
Peixian Chen
Mengdan Zhang
Chaoyou Fu
Yunhang Shen
...
Shengchuan Zhang
Xiawu Zheng
Xing Sun
Liujuan Cao
Rongrong Ji
MLLM
LRM
49
16
0
24 Apr 2024
DesignProbe: A Graphic Design Benchmark for Multimodal Large Language Models
Jieru Lin
Danqing Huang
Tiejun Zhao
Dechen Zhan
Chin-Yew Lin
VLM
MLLM
35
3
0
23 Apr 2024
Pegasus-v1 Technical Report
Raehyuk Jung
Hyojun Go
Jaehyuk Yi
Jiho Jang
Daniel Kim
...
Maninder Saini
Meredith Sanders
Soyoung Lee
Sue Kim
Travis Couture
MLLM
VLM
29
5
0
23 Apr 2024