ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2306.06687
  4. Cited By
LAMM: Language-Assisted Multi-Modal Instruction-Tuning Dataset,
  Framework, and Benchmark
v1v2v3 (latest)

LAMM: Language-Assisted Multi-Modal Instruction-Tuning Dataset, Framework, and Benchmark

11 June 2023
Zhen-fei Yin
Jiong Wang
Jianjian Cao
Zhelun Shi
Dingning Liu
Mukai Li
Lu Sheng
Lei Bai
Xiaoshui Huang
Zhiyong Wang
Jing Shao
Wanli Ouyang
    MLLM
ArXiv (abs)PDFHTML

Papers citing "LAMM: Language-Assisted Multi-Modal Instruction-Tuning Dataset, Framework, and Benchmark"

29 / 29 papers shown
Title
Pisces: An Auto-regressive Foundation Model for Image Understanding and Generation
Pisces: An Auto-regressive Foundation Model for Image Understanding and Generation
Zhiyang Xu
Jiuhai Chen
Zhaojiang Lin
Xichen Pan
Lifu Huang
...
Di Jin
Michihiro Yasunaga
Lili Yu
Xi Lin
Shaoliang Nie
121
1
0
12 Jun 2025
Revolutionizing Clinical Trials: A Manifesto for AI-Driven Transformation
Revolutionizing Clinical Trials: A Manifesto for AI-Driven Transformation
M. Schaar
Richard W. Peck
E. McKinney
Jim Weatherall
Stuart Bailey
...
Rafik Salama
Christina Gunther
Francesca Frau
Antoine Pugeat
Ramon Hernandez
MedIm
69
6
0
10 Jun 2025
CoRe-MMRAG: Cross-Source Knowledge Reconciliation for Multimodal RAG
CoRe-MMRAG: Cross-Source Knowledge Reconciliation for Multimodal RAG
Yang Tian
Fan Liu
Jingyuan Zhang
Victoria A. Webster-Wood
Yupeng Hu
Liqiang Nie
VLM
64
0
0
03 Jun 2025
Contra4: Evaluating Contrastive Cross-Modal Reasoning in Audio, Video, Image, and 3D
Contra4: Evaluating Contrastive Cross-Modal Reasoning in Audio, Video, Image, and 3D
Artemis Panagopoulou
Le Xue
Honglu Zhou
Silvio Savarese
Ran Xu
Caiming Xiong
Chris Callison-Burch
Mark Yatskar
Juan Carlos Niebles
52
0
0
02 Jun 2025
HueManity: Probing Fine-Grained Visual Perception in MLLMs
HueManity: Probing Fine-Grained Visual Perception in MLLMs
Rynaa Grover
Jayant Sravan Tamarapalli
Sahiti Yerramilli
Nilay Pande
VLM
23
0
0
31 May 2025
PixelThink: Towards Efficient Chain-of-Pixel Reasoning
PixelThink: Towards Efficient Chain-of-Pixel Reasoning
Song Wang
Gongfan Fang
Lingdong Kong
Xiangtai Li
Jianyun Xu
Sheng Yang
Qiang Li
Jianke Zhu
Xinchao Wang
LRM
121
0
0
29 May 2025
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
Wei Wei
Jintao Guo
Shanshan Zhao
Minghao Fu
Lunhao Duan
...
Guo-Hua Wang
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
DiffM
315
1
0
05 May 2025
Filter Images First, Generate Instructions Later: Pre-Instruction Data Selection for Visual Instruction Tuning
Filter Images First, Generate Instructions Later: Pre-Instruction Data Selection for Visual Instruction Tuning
Bardia Safaei
Faizan Siddiqui
Jiacong Xu
Vishal M. Patel
Shao-Yuan Lo
VLM
478
1
0
10 Mar 2025
Inst3D-LMM: Instance-Aware 3D Scene Understanding with Multi-modal Instruction Tuning
Inst3D-LMM: Instance-Aware 3D Scene Understanding with Multi-modal Instruction Tuning
Hanxun Yu
Wentong Li
Song Wang
Jintai Chen
Jianke Zhu
3DVLRM
159
6
0
01 Mar 2025
A Survey on Class-Agnostic Counting: Advancements from Reference-Based to Open-World Text-Guided Approaches
A Survey on Class-Agnostic Counting: Advancements from Reference-Based to Open-World Text-Guided Approaches
Luca Ciampi
Ali Azmoudeh
Elif Ecem Akbaba
Erdi Sarıtaş
Ziya Ata Yazıcı
H. K. Ekenel
Giuseppe Amato
Fabrizio Falchi
185
0
0
31 Jan 2025
Social-LLaVA: Enhancing Robot Navigation through Human-Language Reasoning in Social Spaces
Social-LLaVA: Enhancing Robot Navigation through Human-Language Reasoning in Social Spaces
Amirreza Payandeh
Daeun Song
Mohammad Nazeri
Jing Liang
Praneel Mukherjee
Amir Hossain Raj
Yangzhe Kong
Dinesh Manocha
Xuesu Xiao
LM&RoLRM
220
5
0
17 Jan 2025
EgoPlan-Bench2: A Benchmark for Multimodal Large Language Model Planning in Real-World Scenarios
EgoPlan-Bench2: A Benchmark for Multimodal Large Language Model Planning in Real-World Scenarios
Lu Qiu
Yuying Ge
Yi Chen
Yixiao Ge
Ying Shan
Xihui Liu
LLMAGLRM
211
8
0
05 Dec 2024
AutoBench-V: Can Large Vision-Language Models Benchmark Themselves?
AutoBench-V: Can Large Vision-Language Models Benchmark Themselves?
Han Bao
Yue Huang
Yanbo Wang
Jiayi Ye
Xiangqi Wang
Preslav Nakov
Mohamed Elhoseiny
Wei Wei
Mohamed Elhoseiny
Xiangliang Zhang
109
11
0
28 Oct 2024
An Intelligent Agentic System for Complex Image Restoration Problems
An Intelligent Agentic System for Complex Image Restoration Problems
Kaiwen Zhu
Jinjin Gu
Zhiyuan You
Yu Qiao
Chao Dong
135
10
0
23 Oct 2024
MultiChartQA: Benchmarking Vision-Language Models on Multi-Chart Problems
MultiChartQA: Benchmarking Vision-Language Models on Multi-Chart Problems
Zifeng Zhu
Mengzhao Jia
Zizhuo Zhang
Lang Li
Meng Jiang
LRM
137
5
0
18 Oct 2024
Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal Large Language Models
Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal Large Language Models
Xin Zou
Yizhou Wang
Yibo Yan
Yuanhuiyi Lyu
Kening Zheng
...
Junkai Chen
Peijie Jiang
Qingbin Liu
Chang Tang
Xuming Hu
165
8
0
04 Oct 2024
Reflective Instruction Tuning: Mitigating Hallucinations in Large
  Vision-Language Models
Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models
Jinrui Zhang
Teng Wang
Haigang Zhang
Ping Lu
Feng Zheng
MLLMLRMVLM
90
4
0
16 Jul 2024
Modality-Specialized Synergizers for Interleaved Vision-Language Generalists
Modality-Specialized Synergizers for Interleaved Vision-Language Generalists
Zhiyang Xu
Minqian Liu
Ying Shen
Joy Rimchala
Jiaxin Zhang
Qifan Wang
Yu Cheng
Lifu Huang
VLM
90
6
0
04 Jul 2024
GenderBias-\emph{VL}: Benchmarking Gender Bias in Vision Language Models
  via Counterfactual Probing
GenderBias-\emph{VL}: Benchmarking Gender Bias in Vision Language Models via Counterfactual Probing
Yisong Xiao
Aishan Liu
QianJia Cheng
Zhenfei Yin
Siyuan Liang
Jiapeng Li
Jing Shao
Xianglong Liu
Dacheng Tao
124
8
0
30 Jun 2024
MFC-Bench: Benchmarking Multimodal Fact-Checking with Large Vision-Language Models
MFC-Bench: Benchmarking Multimodal Fact-Checking with Large Vision-Language Models
Shengkang Wang
Hongzhan Lin
Ziyang Luo
Zhen Ye
Guang Chen
Jing Ma
167
4
0
17 Jun 2024
SPA-VL: A Comprehensive Safety Preference Alignment Dataset for Vision Language Model
SPA-VL: A Comprehensive Safety Preference Alignment Dataset for Vision Language Model
Yongting Zhang
Lu Chen
Guodong Zheng
Yifeng Gao
Rui Zheng
...
Yu Qiao
Xuanjing Huang
Feng Zhao
Tao Gui
Jing Shao
VLM
228
33
0
17 Jun 2024
RH20T-P: A Primitive-Level Robotic Dataset Towards Composable Generalization Agents
RH20T-P: A Primitive-Level Robotic Dataset Towards Composable Generalization Agents
Zeren Chen
Zhelun Shi
Xiaoya Lu
Lehan He
Sucheng Qian
...
Zhen-fei Yin
Jing Shao
Jing Shao
Cewu Lu
Cewu Lu
75
6
0
28 Mar 2024
The Revolution of Multimodal Large Language Models: A Survey
The Revolution of Multimodal Large Language Models: A Survey
Davide Caffagni
Federico Cocchi
Luca Barsellotti
Nicholas Moratelli
Sara Sarto
Lorenzo Baraldi
Lorenzo Baraldi
Marcella Cornia
Rita Cucchiara
LRMVLM
135
64
0
19 Feb 2024
Mementos: A Comprehensive Benchmark for Multimodal Large Language Model
  Reasoning over Image Sequences
Mementos: A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences
Xiyao Wang
Yuhang Zhou
Xiaoyu Liu
Hongjin Lu
Yuancheng Xu
...
Taixi Lu
Gedas Bertasius
Mohit Bansal
Huaxiu Yao
Furong Huang
LRMVLM
166
78
0
19 Jan 2024
GOAT-Bench: Safety Insights to Large Multimodal Models through Meme-Based Social Abuse
GOAT-Bench: Safety Insights to Large Multimodal Models through Meme-Based Social Abuse
Hongzhan Lin
Ziyang Luo
Bo Wang
Ruichao Yang
Jing Ma
116
31
0
03 Jan 2024
ReForm-Eval: Evaluating Large Vision Language Models via Unified
  Re-Formulation of Task-Oriented Benchmarks
ReForm-Eval: Evaluating Large Vision Language Models via Unified Re-Formulation of Task-Oriented Benchmarks
Zejun Li
Ye Wang
Mengfei Du
Qingwen Liu
Binhao Wu
...
Zhihao Fan
Jie Fu
Jingjing Chen
Xuanjing Huang
Zhongyu Wei
118
15
0
04 Oct 2023
Beyond Task Performance: Evaluating and Reducing the Flaws of Large
  Multimodal Models with In-Context Learning
Beyond Task Performance: Evaluating and Reducing the Flaws of Large Multimodal Models with In-Context Learning
Mustafa Shukor
Alexandre Ramé
Corentin Dancette
Matthieu Cord
LRMMLLM
113
22
0
01 Oct 2023
PUMGPT: A Large Vision-Language Model for Product Understanding
PUMGPT: A Large Vision-Language Model for Product Understanding
Wei Xue
Zongyi Guo
Baoliang Cui
Zengming Tang
Weiwei Zhang
Haihong Tang
Shuhui Wu
Weiming Lu
VLM
72
2
0
18 Aug 2023
EPCL: Frozen CLIP Transformer is An Efficient Point Cloud Encoder
EPCL: Frozen CLIP Transformer is An Efficient Point Cloud Encoder
Xiaoshui Huang
Zhou Huang
Shengjia Li
Wentao Qu
Tong He
Yuenan Hou
Yifan Zuo
Wanli Ouyang
104
13
0
08 Dec 2022
1