VizWiz Grand Challenge: Answering Visual Questions from Blind People


22 February 2018
Danna Gurari
Qing Li
Abigale Stangl
Anhong Guo
Chi Lin
Kristen Grauman
Jiebo Luo
Jeffrey P. Bigham
    CoGe

Papers citing "VizWiz Grand Challenge: Answering Visual Questions from Blind People"

50 / 172 papers shown
SCOB: Universal Text Understanding via Character-wise Supervised Contrastive Learning with Online Text Rendering for Bridging Domain Gap
Daehee Kim
Yoon Kim
Donghyun Kim
Yumin Lim
Geewook Kim
Taeho Kil
36
3
0
21 Sep 2023
An Outlook into the Future of Egocentric Vision
Chiara Plizzari
Gabriele Goletto
Antonino Furnari
Siddhant Bansal
Francesco Ragusa
G. Farinella
Dima Damen
Tatiana Tommasi
EgoV
40
38
0
14 Aug 2023
Foundational Models Defining a New Era in Vision: A Survey and Outlook
Muhammad Awais
Muzammal Naseer
Salman Khan
Rao Muhammad Anwer
Hisham Cholakkal
M. Shah
Ming Yang
Fahad Shahbaz Khan
VLM
40
119
0
25 Jul 2023
PaLI-X: On Scaling up a Multilingual Vision and Language Model
Xi Chen
Josip Djolonga
Piotr Padlewski
Basil Mustafa
Soravit Changpinyo
...
Mojtaba Seyedhosseini
A. Angelova
Xiaohua Zhai
N. Houlsby
Radu Soricut
VLM
73
190
0
29 May 2023
Helping Visually Impaired People Take Better Quality Pictures
Maniratnam Mandal
Deepti Ghadiyaram
Danna Gurari
A. Bovik
13
3
0
14 May 2023
I2I: Initializing Adapters with Improvised Knowledge
Tejas Srinivasan
Furong Jia
Mohammad Rostami
Jesse Thomason
CLL
32
6
0
04 Apr 2023
Locate Then Generate: Bridging Vision and Language with Bounding Box for Scene-Text VQA
Yongxin Zhu
Zichen Liu
Yukang Liang
Xin Li
Hao Liu
Changcun Bao
Linli Xu
26
6
0
04 Apr 2023
Toward Unsupervised Realistic Visual Question Answering
Yuwei Zhang
Chih-Hui Ho
Nuno Vasconcelos
CoGe
22
2
0
09 Mar 2023
VTQA: Visual Text Question Answering via Entity Alignment and Cross-Media Reasoning
Kan Chen
Xiangqian Wu
CoGe
27
8
0
05 Mar 2023
Can Pre-trained Vision and Language Models Answer Visual Information-Seeking Questions?
Yang Chen
Hexiang Hu
Yi Luan
Haitian Sun
Soravit Changpinyo
Alan Ritter
Ming-Wei Chang
48
80
0
23 Feb 2023
BinaryVQA: A Versatile Test Set to Evaluate the Out-of-Distribution Generalization of VQA Models
Ali Borji
CoGe
15
1
0
28 Jan 2023
Salient Object Detection for Images Taken by People With Vision Impairments
Jarek Reynolds
Chandra Kanth Nagesh
Danna Gurari
33
10
0
12 Jan 2023
SceneGATE: Scene-Graph based co-Attention networks for TExt visual question answering
Feiqi Cao
Siwen Luo
F. Núñez
Zean Wen
Josiah Poon
Caren Han
GNN
26
4
0
16 Dec 2022
Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual Reasoning
Zhuowan Li
Xingrui Wang
Elias Stengel-Eskin
Adam Kortylewski
Wufei Ma
Benjamin Van Durme
Max Planck Institute for Informatics
OOD
LRM
33
58
0
01 Dec 2022
Why Did the Chicken Cross the Road? Rephrasing and Analyzing Ambiguous Questions in VQA
Elias Stengel-Eskin
Jimena Guallar-Blasco
Yi Zhou
Benjamin Van Durme
UQLM
35
11
0
14 Nov 2022
What's Different between Visual Question Answering for Machine "Understanding" Versus for Accessibility?
Yang Trista Cao
Kyle Seelman
Kyungjun Lee
Hal Daumé
22
5
0
26 Oct 2022
Multilingual Multimodal Learning with Machine Translated Text
Chen Qiu
Dan Oneaţă
Emanuele Bugliarello
Stella Frank
Desmond Elliott
55
13
0
24 Oct 2022
ScreenQA: Large-Scale Question-Answer Pairs over Mobile App Screenshots
Yu-Chung Hsiao
Fedir Zubach
Victor Carbune
Jason Lin
Maria Wang
Yun Zhu
Jindong Chen
RALM
160
26
0
16 Sep 2022
MaXM: Towards Multilingual Visual Question Answering
Soravit Changpinyo
Linting Xue
Michal Yarom
Ashish V. Thapliyal
Idan Szpektor
J. Amelot
Xi Chen
Radu Soricut
33
8
0
12 Sep 2022
Exploring and Improving the Accessibility of Data Privacy-related Information for People Who Are Blind or Low-vision
Yuanyuan Feng
Abhilasha Ravichander
Yaxing Yao
Shikun Zhang
Norman M. Sadeh
24
1
0
21 Aug 2022
Curriculum Learning for Data-Efficient Vision-Language Alignment
Tejas Srinivasan
Xiang Ren
Jesse Thomason
VLM
31
7
0
29 Jul 2022
VizWiz-FewShot: Locating Objects in Images Taken by People With Visual Impairments
Yu-Yun Tseng
Alexander Bell
Danna Gurari
31
8
0
24 Jul 2022
Unified-IO: A Unified Model for Vision, Language, and Multi-Modal Tasks
Jiasen Lu
Christopher Clark
Rowan Zellers
Roozbeh Mottaghi
Aniruddha Kembhavi
ObjD
VLM
MLLM
77
393
0
17 Jun 2022
GIT: A Generative Image-to-text Transformer for Vision and Language
Jianfeng Wang
Zhengyuan Yang
Xiaowei Hu
Linjie Li
Kevin Qinghong Lin
Zhe Gan
Zicheng Liu
Ce Liu
Lijuan Wang
VLM
61
529
0
27 May 2022
Prompt-based Learning for Unpaired Image Captioning
Peipei Zhu
Tianlin Li
Lin Zhu
Zhenglong Sun
Weishi Zheng
Yaowei Wang
Chen Chen
VLM
27
31
0
26 May 2022
Answer-Me: Multi-Task Open-Vocabulary Visual Question Answering
A. Piergiovanni
Wei Li
Weicheng Kuo
M. Saffar
Fred Bertsch
A. Angelova
17
16
0
02 May 2022
Flamingo: a Visual Language Model for Few-Shot Learning
Jean-Baptiste Alayrac
Jeff Donahue
Pauline Luc
Antoine Miech
Iain Barr
...
Mikolaj Binkowski
Ricardo Barreira
Oriol Vinyals
Andrew Zisserman
Karen Simonyan
MLLM
VLM
51
3,360
0
29 Apr 2022
Brainish: Formalizing A Multimodal Language for Intelligence and Consciousness
Paul Pu Liang
30
4
0
14 Apr 2022
"It Feels Like Taking a Gamble": Exploring Perceptions, Practices, and Challenges of Using Makeup and Cosmetics for People with Visual Impairments
Franklin Mingzhe Li
F. Spektor
Menglin Xia
Mina Huh
Peter Cederberg
Yuqi Gong
Kristen Shinohara
Patrick Carrington
38
25
0
16 Mar 2022
Unpaired Image Captioning by Image-level Weakly-Supervised Visual Concept Recognition
Peipei Zhu
Tianlin Li
Yong Luo
Zhenglong Sun
Wei-Shi Zheng
Yaowei Wang
Chen Chen
30
12
0
07 Mar 2022
Grounding Answers for Visual Questions Asked by Visually Impaired People
Chongyan Chen
Samreen Anjum
Danna Gurari
27
50
0
04 Feb 2022
Feasibility of Interactive 3D Map for Remote Sighted Assistance
Jingyi Xie
Rui Yu
Sooyeon Lee
Yao Lyu
Syed Masum Billah
John M. Carroll
15
0
0
03 Feb 2022
Question Generation for Evaluating Cross-Dataset Shifts in Multi-modal Grounding
Arjun Reddy Akula
OOD
23
3
0
24 Jan 2022
LaTr: Layout-Aware Transformer for Scene-Text VQA
Ali Furkan Biten
Ron Litman
Yusheng Xie
Srikar Appalaraju
R. Manmatha
ViT
36
100
0
23 Dec 2021
Understanding and Measuring Robustness of Multimodal Learning
Nishant Vishwamitra
Hongxin Hu
Ziming Zhao
Long Cheng
Feng Luo
AAML
27
5
0
22 Dec 2021
3D Question Answering
Shuquan Ye
Dongdong Chen
Songfang Han
Jing Liao
ViT
31
47
0
15 Dec 2021
AssistSR: Task-oriented Video Segment Retrieval for Personal AI Assistant
Stan Weixian Lei
Difei Gao
Yuxuan Wang
Dongxing Mao
Zihan Liang
L. Ran
Mike Zheng Shou
27
8
0
30 Nov 2021
Single-Modal Entropy based Active Learning for Visual Question Answering
Dong-Jin Kim
Jae-Won Cho
Jinsoo Choi
Yunjae Jung
In So Kweon
25
12
0
21 Oct 2021
Asking questions on handwritten document collections
Minesh Mathew
Lluís Gómez
Dimosthenis Karatzas
C. V. Jawahar
RALM
33
11
0
02 Oct 2021
Image Captioning for Effective Use of Language Models in Knowledge-Based Visual Question Answering
Ander Salaberria
Gorka Azkune
Oier López de Lacalle
Aitor Soroa Etxabe
Eneko Agirre
33
59
0
15 Sep 2021
Localize, Group, and Select: Boosting Text-VQA by Scene Text Modeling
Xiaopeng Lu
Zhenhua Fan
Yansen Wang
Jean Oh
Carolyn Rose
27
27
0
20 Aug 2021
Language Grounding with 3D Objects
Jesse Thomason
Mohit Shridhar
Yonatan Bisk
Chris Paxton
Luke Zettlemoyer
LM&Ro
28
53
0
26 Jul 2021
Recent Advances and Trends in Multimodal Deep Learning: A Review
Jabeen Summaira
Xi Li
Amin Muhammad Shoib
Songyuan Li
Abdul Jabbar
HAI
20
55
0
24 May 2021
InfographicVQA
Minesh Mathew
Viraj Bagal
Rubèn Pérez Tito
Dimosthenis Karatzas
Ernest Valveny
C. V. Jawahar
42
209
0
26 Apr 2021
Mobile App Tasks with Iterative Feedback (MoTIF): Addressing Task Feasibility in Interactive Visual Environments
Andrea Burns
Deniz Arsan
Sanjna Agrawal
Ranjitha Kumar
Kate Saenko
Bryan A. Plummer
LRM
21
21
0
17 Apr 2021
Towards a Collective Agenda on AI for Earth Science Data Analysis
D. Tuia
R. Roscher
Jan Dirk Wegner
Nathan Jacobs
Xiaoxiang Zhu
Gustau Camps-Valls
AI4CE
44
68
0
11 Apr 2021
Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling
Jie Lei
Linjie Li
Luowei Zhou
Zhe Gan
Tamara L. Berg
Joey Tianyi Zhou
Jingjing Liu
CLIP
46
648
0
11 Feb 2021
TAP: Text-Aware Pre-training for Text-VQA and Text-Caption
Zhengyuan Yang
Yijuan Lu
Jianfeng Wang
Xi Yin
D. Florêncio
Lijuan Wang
Cha Zhang
Lei Zhang
Jiebo Luo
VLM
28
141
0
08 Dec 2020
CapWAP: Captioning with a Purpose
Adam Fisch
Kenton Lee
Ming-Wei Chang
J. Clark
Regina Barzilay
8
11
0
09 Nov 2020
Visual Question Answering on Image Sets
Ankan Bansal
Yuting Zhang
Rama Chellappa
CoGe
16
40
0
27 Aug 2020