ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1603.07396
  4. Cited By
A Diagram Is Worth A Dozen Images

A Diagram Is Worth A Dozen Images

24 March 2016
Aniruddha Kembhavi
M. Salvato
Eric Kolve
Minjoon Seo
Hannaneh Hajishirzi
Ali Farhadi
    3DV
ArXivPDFHTML

Papers citing "A Diagram Is Worth A Dozen Images"

22 / 72 papers shown
Title
M4U: Evaluating Multilingual Understanding and Reasoning for Large Multimodal Models
M4U: Evaluating Multilingual Understanding and Reasoning for Large Multimodal Models
Hongyu Wang
Jiayu Xu
Senwei Xie
Ruiping Wang
Jialin Li
Zhaojie Xie
Bin Zhang
Chuyan Xiong
Xilin Chen
ELM
VLM
LRM
123
6
0
24 May 2024
An empirical study of LLaMA3 quantization: from LLMs to MLLMs
An empirical study of LLaMA3 quantization: from LLMs to MLLMs
Wei Huang
Xingyu Zheng
Xudong Ma
Haotong Qin
Chengtao Lv
Hong Chen
Jie Luo
Xiaojuan Qi
Xianglong Liu
Michele Magno
MQ
89
42
0
22 Apr 2024
TextSquare: Scaling up Text-Centric Visual Instruction Tuning
TextSquare: Scaling up Text-Centric Visual Instruction Tuning
Jingqun Tang
Chunhui Lin
Zhen Zhao
Shubo Wei
Binghong Wu
...
Yuliang Liu
Xiang Bai
Can Huang
Xiang Bai
Can Huang
LRM
VLM
MLLM
146
30
0
19 Apr 2024
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
Chris Liu
Renrui Zhang
Longtian Qiu
Siyuan Huang
Weifeng Lin
...
Hao Shao
Pan Lu
Hongsheng Li
Yu Qiao
Peng Gao
MLLM
179
113
0
08 Feb 2024
Quilt-1M: One Million Image-Text Pairs for Histopathology
Quilt-1M: One Million Image-Text Pairs for Histopathology
Wisdom O. Ikezogwo
M. S. Seyfioglu
Fatemeh Ghezloo
Dylan Stefan Chan Geva
Fatwir Sheikh Mohammed
Pavan Kumar Anand
Ranjay Krishna
Linda G. Shapiro
CLIP
VLM
283
125
0
20 Jun 2023
Question Answering via Integer Programming over Semi-Structured
  Knowledge
Question Answering via Integer Programming over Semi-Structured Knowledge
Daniel Khashabi
Tushar Khot
Ashish Sabharwal
Peter Clark
Oren Etzioni
Dan Roth
63
102
0
20 Apr 2016
Learning to Compose Neural Networks for Question Answering
Learning to Compose Neural Networks for Question Answering
Jacob Andreas
Marcus Rohrbach
Trevor Darrell
Dan Klein
NAI
KELM
BDL
CoGe
99
568
0
07 Jan 2016
Where To Look: Focus Regions for Visual Question Answering
Where To Look: Focus Regions for Visual Question Answering
Kevin J. Shih
Saurabh Singh
Derek Hoiem
73
460
0
23 Nov 2015
Image Question Answering using Convolutional Neural Network with Dynamic
  Parameter Prediction
Image Question Answering using Convolutional Neural Network with Dynamic Parameter Prediction
Hyeonwoo Noh
Paul Hongsuck Seo
Bohyung Han
OOD
78
327
0
18 Nov 2015
Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for
  Visual Question Answering
Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering
Huijuan Xu
Kate Saenko
74
763
0
17 Nov 2015
Yin and Yang: Balancing and Answering Binary Visual Questions
Yin and Yang: Balancing and Answering Binary Visual Questions
Peng Zhang
Yash Goyal
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
87
352
0
16 Nov 2015
Visual7W: Grounded Question Answering in Images
Visual7W: Grounded Question Answering in Images
Yuke Zhu
Oliver Groth
Michael S. Bernstein
Li Fei-Fei
91
884
0
11 Nov 2015
Stacked Attention Networks for Image Question Answering
Stacked Attention Networks for Image Question Answering
Zichao Yang
Xiaodong He
Jianfeng Gao
Li Deng
Alex Smola
BDL
107
1,882
0
07 Nov 2015
Ask Me Anything: Dynamic Memory Networks for Natural Language Processing
Ask Me Anything: Dynamic Memory Networks for Natural Language Processing
A. Kumar
Ozan Irsoy
Peter Ondruska
Mohit Iyyer
James Bradbury
Ishaan Gulrajani
Victor Zhong
Romain Paulus
R. Socher
115
1,180
0
24 Jun 2015
Teaching Machines to Read and Comprehend
Teaching Machines to Read and Comprehend
Karl Moritz Hermann
Tomás Kociský
Edward Grefenstette
L. Espeholt
W. Kay
Mustafa Suleyman
Phil Blunsom
347
3,547
0
10 Jun 2015
You Only Look Once: Unified, Real-Time Object Detection
You Only Look Once: Unified, Real-Time Object Detection
Joseph Redmon
S. Divvala
Ross B. Girshick
Ali Farhadi
ObjD
702
36,958
0
08 Jun 2015
Are You Talking to a Machine? Dataset and Methods for Multilingual Image
  Question Answering
Are You Talking to a Machine? Dataset and Methods for Multilingual Image Question Answering
Haoyuan Gao
Junhua Mao
Jie Zhou
Zhiheng Huang
Lei Wang
Wenyuan Xu
78
501
0
21 May 2015
Exploring Models and Data for Image Question Answering
Exploring Models and Data for Image Question Answering
Mengye Ren
Ryan Kiros
R. Zemel
80
715
0
08 May 2015
VQA: Visual Question Answering
VQA: Visual Question Answering
Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. L. Zitnick
Dhruv Batra
Devi Parikh
CoGe
202
5,478
0
03 May 2015
Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks
Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks
Jason Weston
Antoine Bordes
S. Chopra
Alexander M. Rush
Bart van Merriënboer
Armand Joulin
Tomas Mikolov
LRM
ELM
148
1,181
0
19 Feb 2015
A Multi-World Approach to Question Answering about Real-World Scenes
  based on Uncertain Input
A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input
Mateusz Malinowski
Mario Fritz
199
697
0
01 Oct 2014
Very Deep Convolutional Networks for Large-Scale Image Recognition
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan
Andrew Zisserman
FAtt
MDE
1.6K
100,386
0
04 Sep 2014
Previous
12