arXiv:2303.16199
LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention
28 March 2023
Renrui Zhang
Jiaming Han
Chris Liu
Peng Gao
Aojun Zhou
Xiangfei Hu
Shilin Yan
Pan Lu
Hongsheng Li
Yu Qiao
MLLM
Papers citing "LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention"
50 / 586 papers shown
SilVar: Speech Driven Multimodal Model for Reasoning Visual Question Answering and Object Localization
Tan-Hanh Pham
Hoang-Nam Le
Phu-Vinh Nguyen
Chris Ngo
Truong Son-Hy
AuLLM
LRM
81
1
0
21 Dec 2024
A High-Quality Text-Rich Image Instruction Tuning Dataset via Hybrid Instruction Generation
Shijie Zhou
R. Zhang
Yufan Zhou
Changyou Chen
VLM
77
1
0
20 Dec 2024
MedCoT: Medical Chain of Thought via Hierarchical Expert
Jiaxiang Liu
Yuan Wang
Jiawei Du
Qiufeng Wang
Zuozhu Liu
LRM
84
9
0
18 Dec 2024
Efficient Fine-Tuning of Single-Cell Foundation Models Enables Zero-Shot Molecular Perturbation Prediction
Sepideh Maleki
Jan-Christian Huetter
Kangway V Chuang
Gabriele Scalia
Tommaso Biancalani
AI4CE
90
2
0
18 Dec 2024
Dynamic-VLM: Simple Dynamic Visual Token Compression for VideoLLM
Haozhao Wang
Yuxiang Nie
Yongjie Ye
Deng GuanYu
Yanjie Wang
Shuai Li
Haiyang Yu
Jinghui Lu
Can Huang
VLM
MLLM
82
1
0
12 Dec 2024
Pinco: Position-induced Consistent Adapter for Diffusion Transformer in Foreground-conditioned Inpainting
Guangben Lu
Yuzhen Du
Zhimin Sun
Ran Yi
Yifan Qi
Yizhe Tang
Tianyi Wang
Lizhuang Ma
Fangyuan Zou
DiffM
80
1
0
05 Dec 2024
EgoPlan-Bench2: A Benchmark for Multimodal Large Language Model Planning in Real-World Scenarios
Lu Qiu
Yuying Ge
Yi Chen
Yixiao Ge
Ying Shan
Xihui Liu
LLMAG
LRM
98
5
0
05 Dec 2024
TAPTRv3: Spatial and Temporal Context Foster Robust Tracking of Any Point in Long Video
Jinyuan Qu
Hongyang Li
Shilong Liu
Tianhe Ren
Zhaoyang Zeng
Lei Zhang
3DPC
72
1
0
27 Nov 2024
Parameter Efficient Mamba Tuning via Projector-targeted Diagonal-centric Linear Transformation
Seokil Ham
H. Kim
Sangmin Woo
Changick Kim
Mamba
186
0
0
21 Nov 2024
MMGenBench: Fully Automatically Evaluating LMMs from the Text-to-Image Generation Perspective
Hailang Huang
Yong Wang
Zixuan Huang
Huaqiu Li
Tongwen Huang
Xiangxiang Chu
Richong Zhang
MLLM
LM&MA
EGVM
85
0
0
21 Nov 2024
IterIS: Iterative Inference-Solving Alignment for LoRA Merging
Hongxu Chen
Runshi Li
Bowei Zhu
Zhen Wang
Long Chen
MoMe
98
0
0
21 Nov 2024
MpoxVLM: A Vision-Language Model for Diagnosing Skin Lesions from Mpox Virus Infection
Xu Cao
Wenqian Ye
K. Moise
Megan Coffee
36
2
0
16 Nov 2024
Thinking Before Looking: Improving Multimodal LLM Reasoning via Mitigating Visual Hallucination
Haojie Zheng
Tianyang Xu
Hanchi Sun
Shu Pu
Ruoxi Chen
Lichao Sun
MLLM
LRM
84
8
0
15 Nov 2024
MLAN: Language-Based Instruction Tuning Improves Zero-Shot Generalization of Multimodal Large Language Models
Jianhong Tu
Zhuohao Ni
Nicholas Crispino
Zihao Yu
Michael Bendersky
...
Ruoxi Jia
Xin Liu
Lingjuan Lyu
Dawn Song
Chenguang Wang
VLM
MLLM
51
0
0
15 Nov 2024
NavAgent: Multi-scale Urban Street View Fusion For UAV Embodied Vision-and-Language Navigation
Youzhi Liu
Fanglong Yao
Yuanchang Yue
Guangluan Xu
Xian Sun
Kun Fu
LM&Ro
37
3
0
13 Nov 2024
Membership Inference Attacks against Large Vision-Language Models
Zhan Li
Yongtao Wu
Yihang Chen
F. Tonin
Elias Abad Rocamora
V. Cevher
44
4
0
05 Nov 2024
Foundations and Recent Trends in Multimodal Mobile Agents: A Survey
Biao Wu
Yanda Li
Meng Fang
Zirui Song
Zhiwei Zhang
Yunchao Wei
L. Chen
LM&Ro
LLMAG
OffRL
AI4TS
44
4
0
04 Nov 2024
Pin-Tuning: Parameter-Efficient In-Context Tuning for Few-Shot Molecular Property Prediction
Liang Wang
Qiang Liu
Shaozhen Liu
Xin Sun
Shu Wu
Liang Wang
39
2
0
02 Nov 2024
SV-RAG: LoRA-Contextualizing Adaptation of MLLMs for Long Document Understanding
Jian Chen
R. Zhang
Yufan Zhou
Tong Yu
Franck Dernoncourt
J. Gu
Ryan Rossi
Changyou Chen
Tong Sun
39
0
0
02 Nov 2024
LLaMo: Large Language Model-based Molecular Graph Assistant
Jinyoung Park
Minseong Bae
Dohwan Ko
Hyunwoo J. Kim
39
1
0
31 Oct 2024
Improving Generalization in Visual Reasoning via Self-Ensemble
Tien-Huy Nguyen
Quang-Khai Tran
Anh-Tuan Quang-Hoang
VLM
LRM
55
5
0
28 Oct 2024
FLAASH: Flow-Attention Adaptive Semantic Hierarchical Fusion for Multi-Modal Tobacco Content Analysis
N. V. R. Chappa
P. Dobbs
Bhiksha Raj
Khoa Luu
34
3
0
25 Oct 2024
Parameter-Efficient Fine-Tuning in Large Models: A Survey of Methodologies
L. Wang
Sheng Chen
Linnan Jiang
Shu Pan
Runze Cai
Sen Yang
Fei Yang
49
3
0
24 Oct 2024
ADEM-VL: Adaptive and Embedded Fusion for Efficient Vision-Language Tuning
Zhiwei Hao
Jianyuan Guo
Li Shen
Yong Luo
Han Hu
Yonggang Wen
VLM
26
0
0
23 Oct 2024
CogSteer: Cognition-Inspired Selective Layer Intervention for Efficient Semantic Steering in Large Language Models
Xintong Wang
Jingheng Pan
Longqin Jiang
Liang Ding
Xingshan Li
Chris Biemann
LLMSV
29
0
0
23 Oct 2024
PETAH: Parameter Efficient Task Adaptation for Hybrid Transformers in a resource-limited Context
Maximilian Augustin
Syed Shakib Sarwar
Mostafa Elhoushi
Sai Qian Zhang
Yuecheng Li
B. D. Salvo
25
0
0
23 Oct 2024
AVHBench: A Cross-Modal Hallucination Benchmark for Audio-Visual Large Language Models
Kim Sung-Bin
Oh Hyun-Bin
JungMok Lee
Arda Senocak
Joon Son Chung
Tae-Hyun Oh
MLLM
VLM
46
3
0
23 Oct 2024
Order Matters: Exploring Order Sensitivity in Multimodal Large Language Models
Zhijie Tan
Xu Chu
Weiping Li
Tong Mo
31
1
0
22 Oct 2024
Opportunities and Challenges of Generative-AI in Finance
Akshar Prabhu Desai
Ganesh Satish Mallya
Mohammad Luqman
Tejasvi Ravi
Nithya Kota
Pranjul Yadav
AIFin
39
2
0
21 Oct 2024
LLaVA-Ultra: Large Chinese Language and Vision Assistant for Ultrasound
Xuechen Guo
Wenhao Chai
Shi-Yan Li
Gaoang Wang
33
6
0
19 Oct 2024
Adversarial Score identity Distillation: Rapidly Surpassing the Teacher in One Step
Mingyuan Zhou
Huangjie Zheng
Yi Gu
Zhendong Wang
Hai Huang
DiffM
52
4
0
19 Oct 2024
PUMA: Empowering Unified MLLM with Multi-granular Visual Generation
Rongyao Fang
Chengqi Duan
Kun Wang
Hao Li
H. Tian
Xingyu Zeng
Rui Zhao
Jifeng Dai
Hongsheng Li
Xihui Liu
MLLM
36
11
0
17 Oct 2024
LoLDU: Low-Rank Adaptation via Lower-Diag-Upper Decomposition for Parameter-Efficient Fine-Tuning
Yiming Shi
Jiwei Wei
Yujia Wu
Ran Ran
Chengwei Sun
Shiyuan He
Yang Yang
ALM
43
1
0
17 Oct 2024
RAP: Retrieval-Augmented Personalization for Multimodal Large Language Models
Haoran Hao
Jiaming Han
Changsheng Li
Yu-Feng Li
Xiangyu Yue
RALM
56
1
0
17 Oct 2024
VidEgoThink: Assessing Egocentric Video Understanding Capabilities for Embodied AI
Sijie Cheng
Kechen Fang
Yangyang Yu
Sicheng Zhou
Yangqiu Song
Ye Tian
Tingguang Li
Lei Han
Yang Liu
51
8
0
15 Oct 2024
PAVLM: Advancing Point Cloud based Affordance Understanding Via Vision-Language Model
Shang-Ching Liu
Van-Nhiem Tran
Wenkai Chen
Wei-Lun Cheng
Yen-Lin Huang
I-Bin Liao
Yung-Hui Li
Jianwei Zhang
20
0
0
15 Oct 2024
VidCompress: Memory-Enhanced Temporal Compression for Video Understanding in Large Language Models
Xiaohan Lan
Yitian Yuan
Zequn Jie
Lin Ma
VLM
26
2
0
15 Oct 2024
Bypassing the Exponential Dependency: Looped Transformers Efficiently Learn In-context by Multi-step Gradient Descent
Bo Chen
Xiaoyu Li
Yingyu Liang
Zhenmei Shi
Zhao-quan Song
96
19
0
15 Oct 2024
Free Video-LLM: Prompt-guided Visual Perception for Efficient Training-free Video LLMs
Kai Han
Jianyuan Guo
Yehui Tang
W. He
Enhua Wu
Yunhe Wang
MLLM
VLM
21
3
0
14 Oct 2024
Skipping Computations in Multimodal LLMs
Mustafa Shukor
Matthieu Cord
26
2
0
12 Oct 2024
Treat Visual Tokens as Text? But Your MLLM Only Needs Fewer Efforts to See
Phu Pham
Kun Wan
Yu-Jhe Li
Zeliang Zhang
Daniel Miranda
Ajinkya Kale
Chenliang Xu
29
5
0
08 Oct 2024
Training-Free Open-Ended Object Detection and Segmentation via Attention as Prompts
Zhiwei Lin
Yongtao Wang
Zhi Tang
ObjD
VLM
30
2
0
08 Oct 2024
Superficial Safety Alignment Hypothesis
Jianwei Li
Jung-Eun Kim
24
1
0
07 Oct 2024
Polymath: A Challenging Multi-modal Mathematical Reasoning Benchmark
Himanshu Gupta
Shreyas Verma
Ujjwala Anantheswaran
Kevin Scaria
Mihir Parmar
Swaroop Mishra
Chitta Baral
ReLM
LRM
32
5
0
06 Oct 2024
Frame-Voyager: Learning to Query Frames for Video Large Language Models
Sicheng Yu
Chengkai Jin
Huanyu Wang
Zhenghao Chen
Sheng Jin
...
Zhenbang Sun
Bingni Zhang
Jiawei Wu
Hao Zhang
Qianru Sun
69
5
0
04 Oct 2024
AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark
Wenhao Chai
Enxin Song
Y. Du
Chenlin Meng
Vashisht Madhavan
Omer Bar-Tal
Jeng-Neng Hwang
Saining Xie
Christopher D. Manning
3DV
84
25
0
04 Oct 2024
LoGra-Med: Long Context Multi-Graph Alignment for Medical Vision-Language Model
Duy M. H. Nguyen
N. T. Diep
Trung Q. Nguyen
Hoang-Bao Le
Tai Nguyen
...
Pengtao Xie
Roger Wattenhofer
James Zhou
Daniel Sonntag
Mathias Niepert
VLM
55
3
0
03 Oct 2024
Semantic Communication and Control Co-Design for Multi-Objective Correlated Dynamics
Abanoub M. Girgis
Hyowoon Seo
Mehdi Bennis
27
0
0
03 Oct 2024
EMMA: Efficient Visual Alignment in Multi-Modal LLMs
Sara Ghazanfari
Alexandre Araujo
Prashanth Krishnamurthy
Siddharth Garg
Farshad Khorrami
VLM
54
1
0
02 Oct 2024
TPP-LLM: Modeling Temporal Point Processes by Efficiently Fine-Tuning Large Language Models
Zefang Liu
Yinzhu Quan
27
0
0
02 Oct 2024