2312.14991
FoodLMM: A Versatile Food Assistant using Large Multi-modal Model
22 December 2023
Yuehao Yin
Huiyan Qi
B. Zhu
Jingjing Chen
Yu-Gang Jiang
Chong-Wah Ngo
Papers citing "FoodLMM: A Versatile Food Assistant using Large Multi-modal Model"
19 / 19 papers shown
Advancing Food Nutrition Estimation via Visual-Ingredient Feature Fusion
Huiyan Qi
B. Zhu
Chong-Wah Ngo
Jingjing Chen
Ee-Peng Lim
26
0
0
13 May 2025
Are Vision-Language Models Ready for Dietary Assessment? Exploring the Next Frontier in AI-Powered Food Image Recognition
Sergio Romero-Tapiador
Ruben Tolosana
Blanca Lacruz-Pleguezuelos
L. Marcos-Zambrano
Guadalupe X. Bazán
Isabel Espinosa-Salinas
Julian Fierrez
Javier Ortega-Garcia
Enrique Carrillo-de Santa Pau
Aythami Morales
CoGe
26
0
0
09 Apr 2025
RecipeGen: A Benchmark for Real-World Recipe Image Generation
Ruoxuan Zhang
Hongxia Xie
Yi Yao
Jian-Yu Jiang-Lin
Bin Wen
Ling Lo
Hong-Han Shuai
Yung-Hui Li
Wen-Huang Cheng
69
0
0
07 Mar 2025
On Domain-Specific Post-Training for Multimodal Large Language Models
Daixuan Cheng
Shaohan Huang
Ziyu Zhu
Xintong Zhang
Wayne Xin Zhao
Zhongzhi Luan
Bo Dai
Zhenliang Zhang
VLM
99
2
0
29 Nov 2024
Visual Cue Enhancement and Dual Low-Rank Adaptation for Efficient Visual Instruction Fine-Tuning
Pengkun Jiao
Bin Zhu
Jingjing Chen
Chong-Wah Ngo
Yu-Gang Jiang
VLM
OffRL
69
0
0
19 Nov 2024
WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines
Genta Indra Winata
Frederikus Hudi
Patrick Amadeus Irawan
David Anugraha
Rifki Afina Putri
...
Alham Fikri Aji
Taro Watanabe
Derry Wijaya
Alice H. Oh
Chong-Wah Ngo
CoGe
105
9
0
16 Oct 2024
FoodMLLM-JP: Leveraging Multimodal Large Language Models for Japanese Recipe Generation
Yuki Imajuku
Yoko Yamakata
Kiyoharu Aizawa
31
1
0
27 Sep 2024
EAGLE: Towards Efficient Arbitrary Referring Visual Prompts Comprehension for Multimodal Large Language Models
Jiacheng Zhang
Yang Jiao
Shaoxiang Chen
Jingjing Chen
Yu-Gang Jiang
28
1
0
25 Sep 2024
EventHallusion: Diagnosing Event Hallucinations in Video LLMs
Jiacheng Zhang
Yang Jiao
Shaoxiang Chen
Jingjing Chen
Zhiyu Tan
Hao Li
Jingjing Chen
MLLM
61
18
0
25 Sep 2024
RoDE: Linear Rectified Mixture of Diverse Experts for Food Large Multi-Modal Models
Pengkun Jiao
Xinlan Wu
Bin Zhu
Jingjing Chen
Chong-Wah Ngo
Yu-Gang Jiang
36
9
0
17 Jul 2024
FoodSky: A Food-oriented Large Language Model that Passes the Chef and Dietetic Examination
Pengfei Zhou
Weiqing Min
Chaoran Fu
Ying Jin
Mingyu Huang
Xiangyang Li
Shuhuan Mei
Shuqiang Jiang
38
8
0
11 Jun 2024
The Revolution of Multimodal Large Language Models: A Survey
Davide Caffagni
Federico Cocchi
Luca Barsellotti
Nicholas Moratelli
Sara Sarto
Lorenzo Baraldi
Lorenzo Baraldi
Marcella Cornia
Rita Cucchiara
LRM
VLM
56
41
0
19 Feb 2024
MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning
Jun Chen
Deyao Zhu
Xiaoqian Shen
Xiang Li
Zechun Liu
Pengchuan Zhang
Raghuraman Krishnamoorthi
Vikas Chandra
Yunyang Xiong
Mohamed Elhoseiny
MLLM
160
441
0
14 Oct 2023
ChatDoctor: A Medical Chat Model Fine-Tuned on a Large Language Model Meta-AI (LLaMA) Using Medical Domain Knowledge
Yunxiang Li
Zihan Li
Kai Zhang
Ruilong Dan
Steven Jiang
You Zhang
LM&MA
AI4MH
125
377
0
24 Mar 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
270
4,244
0
30 Jan 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
313
11,953
0
04 Mar 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
392
4,137
0
28 Jan 2022
LAVT: Language-Aware Vision Transformer for Referring Image Segmentation
Zhao Yang
Jiaqi Wang
Yansong Tang
Kai-xiang Chen
Hengshuang Zhao
Philip H. S. Torr
148
306
0
04 Dec 2021
Fine-Grained Visual Classification via Progressive Multi-Granularity Training of Jigsaw Patches
Ruoyi Du
Dongliang Chang
A. Bhunia
Jiyang Xie
Zhanyu Ma
Yi-Zhe Song
Jun Guo
76
291
0
08 Mar 2020