2002.10215
Cited By
On the General Value of Evidence, and Bilingual Scene-Text Visual Question Answering
24 February 2020
Xinyu Wang
Yuliang Liu
Chunhua Shen
Chun Chet Ng
Canjie Luo
Lianwen Jin
C. Chan
Anton Van Den Hengel
Liangwei Wang
Papers citing
"On the General Value of Evidence, and Bilingual Scene-Text Visual Question Answering"
25 / 25 papers shown
SATORI-R1: Incentivizing Multimodal Reasoning with Spatial Grounding and Verifiable Rewards
Chuming Shen
Wei Wei
Xiaoye Qu
Yu Cheng
LRM
97
0
0
25 May 2025
One RL to See Them All: Visual Triple Unified Reinforcement Learning
Yan Ma
Linge Du
Xuyang Shen
Shaoxiang Chen
Pengfei Li
Qibing Ren
Lizhuang Ma
Yuchao Dai
Pengfei Liu
Junjie Yan
OffRL
LRM
45
0
0
23 May 2025
HoVLE: Unleashing the Power of Monolithic Vision-Language Models with Holistic Vision-Language Embedding
Chenxin Tao
Shiqian Su
X. Zhu
Chenyu Zhang
Zhe Chen
...
Wenhai Wang
Lewei Lu
Gao Huang
Yu Qiao
Jifeng Dai
MLLM
VLM
156
2
0
20 Dec 2024
Chinese Street View Text: Large-scale Chinese Text Reading with Partially Supervised Learning
Yipeng Sun
Jiaming Liu
Wei Liu
Junyu Han
Errui Ding
Jingtuo Liu
61
51
0
17 Sep 2019
ICDAR2019 Robust Reading Challenge on Multi-lingual Scene Text Detection and Recognition -- RRC-MLT-2019
Nibal Nayef
Yash J. Patel
M. Busta
Pinaki Nath Chowdhury
Dimosthenis Karatzas
...
Jirí Matas
Umapada Pal
J. Burie
Cheng-Lin Liu
J. Ogier
3DV
47
244
0
01 Jul 2019
Omnidirectional Scene Text Detection with Sequential-free Box Discretization
Yuliang Liu
Sheng Zhang
Lianwen Jin
Lele Xie
Y. Wu
Zhepeng Wang
36
90
0
06 Jun 2019
Scene Text Visual Question Answering
Ali Furkan Biten
Rubèn Pérez Tito
Andrés Mafla
Lluís Gómez
Marçal Rusiñol
Ernest Valveny
C. V. Jawahar
Dimosthenis Karatzas
64
348
0
31 May 2019
Towards VQA Models That Can Read
Amanpreet Singh
Vivek Natarajan
Meet Shah
Yu Jiang
Xinlei Chen
Dhruv Batra
Devi Parikh
Marcus Rohrbach
EgoV
50
1,174
0
18 Apr 2019
Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding
Kexin Yi
Jiajun Wu
Chuang Gan
Antonio Torralba
Pushmeet Kohli
J. Tenenbaum
NAI
65
606
0
04 Oct 2018
Analogical Reasoning on Chinese Morphological and Semantic Relations
Shen Li
Zhe Zhao
Renfen Hu
Wensi Li
Tao Liu
Xiaoyong Du
34
409
0
12 May 2018
VizWiz Grand Challenge: Answering Visual Questions from Blind People
Danna Gurari
Qing Li
Abigale Stangl
Anhong Guo
Chi Lin
Kristen Grauman
Jiebo Luo
Jeffrey P. Bigham
CoGe
66
831
0
22 Feb 2018
Don't Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering
Aishwarya Agrawal
Dhruv Batra
Devi Parikh
Aniruddha Kembhavi
OOD
125
585
0
01 Dec 2017
Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments
Peter Anderson
Qi Wu
Damien Teney
Jake Bruce
Mark Johnson
Niko Sünderhauf
Ian Reid
Stephen Gould
Anton Van Den Hengel
LM&Ro
74
1,299
0
20 Nov 2017
CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning
Justin Johnson
B. Hariharan
Laurens van der Maaten
Li Fei-Fei
C. L. Zitnick
Ross B. Girshick
CoGe
266
2,346
0
20 Dec 2016
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
Yash Goyal
Tejas Khot
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
285
3,187
0
02 Dec 2016
Visual Question Answering: Datasets, Algorithms, and Future Challenges
Kushal Kafle
Christopher Kanan
OOD
42
238
0
05 Oct 2016
Visual Question Answering: A Survey of Methods and Datasets
Qi Wu
Damien Teney
Peng Wang
Chunhua Shen
A. Dick
Anton Van Den Hengel
55
416
0
20 Jul 2016
Bag of Tricks for Efficient Text Classification
Armand Joulin
Edouard Grave
Piotr Bojanowski
Tomas Mikolov
VLM
75
4,596
0
06 Jul 2016
FVQA: Fact-based Visual Question Answering
Peng Wang
Qi Wu
Chunhua Shen
Anton van den Hengel
A. Dick
CoGe
59
455
0
17 Jun 2016
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
Ranjay Krishna
Yuke Zhu
Oliver Groth
Justin Johnson
Kenji Hata
...
Yannis Kalantidis
Li Li
David A. Shamma
Michael S. Bernstein
Fei-Fei Li
161
5,706
0
23 Feb 2016
COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images
Andreas Veit
Tomas Matera
Lukás Neumann
Jirí Matas
Serge J. Belongie
212
517
0
26 Jan 2016
Ask Me Anything: Free-form Visual Question Answering Based on Knowledge from External Sources
Qi Wu
Peng Wang
Chunhua Shen
A. Dick
Anton Van Den Hengel
48
370
0
22 Nov 2015
An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition
Baoguang Shi
X. Bai
Cong Yao
VLM
145
2,473
0
21 Jul 2015
VQA: Visual Question Answering
Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. L. Zitnick
Dhruv Batra
Devi Parikh
CoGe
143
5,421
0
03 May 2015
A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input
Mateusz Malinowski
Mario Fritz
153
695
0
01 Oct 2014