Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2404.16635
Cited By
TinyChart: Efficient Chart Understanding with Visual Token Merging and Program-of-Thoughts Learning
25 April 2024
Liang Zhang
Anwen Hu
Haiyang Xu
Mingshi Yan
Yichen Xu
Qin Jin
Ji Zhang
Fei Huang
Re-assign community
ArXiv (abs)
PDF
HTML
Github (2177★)
Papers citing
"TinyChart: Efficient Chart Understanding with Visual Token Merging and Program-of-Thoughts Learning"
19 / 19 papers shown
Title
Protecting multimodal large language models against misleading visualizations
Jonathan Tonglet
Tinne Tuytelaars
Marie-Francine Moens
Iryna Gurevych
114
2
0
27 Feb 2025
EvoChart: A Benchmark and a Self-Training Approach Towards Real-World Chart Understanding
Muye Huang
Han Lai
Xinyu Zhang
Wenjun Wu
Jie Ma
Lingling Zhang
Jun Liu
74
8
0
03 Sep 2024
InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD
Xiao-wen Dong
Pan Zhang
Yuhang Zang
Yuhang Cao
Bin Wang
...
Xingcheng Zhang
Jifeng Dai
Yuxin Qiao
Dahua Lin
Jiaqi Wang
VLM
MLLM
95
127
0
09 Apr 2024
ChartX & ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning
Renqiu Xia
Bo Zhang
Hancheng Ye
Xiangchao Yan
Qi Liu
...
Min Dou
Botian Shi
Junchi Yan
Junchi Yan
Yu Qiao
LRM
127
68
0
19 Feb 2024
InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model
Xiao-wen Dong
Pan Zhang
Yuhang Zang
Yuhang Cao
Bin Wang
...
Conghui He
Xingcheng Zhang
Yu Qiao
Dahua Lin
Jiaqi Wang
VLM
MLLM
148
267
0
29 Jan 2024
ChartAssisstant: A Universal Chart Multimodal Language Model via Chart-to-Table Pre-training and Multitask Instruction Tuning
Fanqing Meng
Wenqi Shao
Quanfeng Lu
Peng Gao
Kaipeng Zhang
Yu Qiao
Ping Luo
65
55
0
04 Jan 2024
OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation
Qidong Huang
Xiao-wen Dong
Pan Zhang
Bin Wang
Conghui He
Jiaqi Wang
Dahua Lin
Weiming Zhang
Neng H. Yu
MLLM
126
206
0
29 Nov 2023
InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition
Pan Zhang
Xiaoyi Wang
Bin Wang
Yuhang Cao
Chao Xu
...
Conghui He
Xingcheng Zhang
Yu Qiao
Da Lin
Jiaqi Wang
MLLM
146
241
0
26 Sep 2023
UniChart: A Universal Vision-language Pretrained Model for Chart Comprehension and Reasoning
Ahmed Masry
P. Kavehzadeh
Do Xuan Long
Enamul Hoque
Shafiq Joty
LRM
61
111
0
24 May 2023
mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality
Qinghao Ye
Haiyang Xu
Guohai Xu
Jiabo Ye
Ming Yan
...
Junfeng Tian
Qiang Qi
Ji Zhang
Feiyan Huang
Jingren Zhou
VLM
MLLM
288
956
0
27 Apr 2023
MPMQA: Multimodal Question Answering on Product Manuals
Liangfu Zhang
Anwen Hu
Jing Zhang
Shuo Hu
Qin Jin
74
10
0
19 Apr 2023
Sigmoid Loss for Language Image Pre-Training
Xiaohua Zhai
Basil Mustafa
Alexander Kolesnikov
Lucas Beyer
CLIP
VLM
251
1,200
0
27 Mar 2023
Token Merging: Your ViT But Faster
Daniel Bolya
Cheng-Yang Fu
Xiaoliang Dai
Peizhao Zhang
Christoph Feichtenhofer
Judy Hoffman
MoMe
121
470
0
17 Oct 2022
ChartQA: A Benchmark for Question Answering about Charts with Visual and Logical Reasoning
Ahmed Masry
Do Xuan Long
J. Tan
Shafiq Joty
Enamul Hoque
AIMat
134
685
0
19 Mar 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
886
13,207
0
04 Mar 2022
Question-controlled Text-aware Image Captioning
Anwen Hu
Shizhe Chen
Qin Jin
56
15
0
04 Aug 2021
Towards VQA Models That Can Read
Amanpreet Singh
Vivek Natarajan
Meet Shah
Yu Jiang
Xinlei Chen
Dhruv Batra
Devi Parikh
Marcus Rohrbach
EgoV
111
1,255
0
18 Apr 2019
Object Hallucination in Image Captioning
Anna Rohrbach
Lisa Anne Hendricks
Kaylee Burns
Trevor Darrell
Kate Saenko
194
443
0
06 Sep 2018
DVQA: Understanding Data Visualizations via Question Answering
Kushal Kafle
Brian L. Price
Scott D. Cohen
Christopher Kanan
AIMat
85
397
0
24 Jan 2018
1