ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1802.08218
  4. Cited By
VizWiz Grand Challenge: Answering Visual Questions from Blind People
v1v2v3v4 (latest)

VizWiz Grand Challenge: Answering Visual Questions from Blind People

22 February 2018
Danna Gurari
Qing Li
Abigale Stangl
Anhong Guo
Chi Lin
Kristen Grauman
Jiebo Luo
Jeffrey P. Bigham
    CoGe
ArXiv (abs)PDFHTML

Papers citing "VizWiz Grand Challenge: Answering Visual Questions from Blind People"

50 / 573 papers shown
Title
A Dataset of Alt Texts from HCI Publications: Analyses and Uses Towards
  Producing More Descriptive Alt Texts of Data Visualizations in Scientific
  Papers
A Dataset of Alt Texts from HCI Publications: Analyses and Uses Towards Producing More Descriptive Alt Texts of Data Visualizations in Scientific Papers
S. Chintalapati
Jonathan Bragg
Lucy Lu Wang
77
23
0
27 Sep 2022
ScreenQA: Large-Scale Question-Answer Pairs over Mobile App Screenshots
ScreenQA: Large-Scale Question-Answer Pairs over Mobile App Screenshots
Yu-Chung Hsiao
Fedir Zubach
Maria Wang
Jindong Chen
Victor Carbune
Jason Lin
Maria Wang
Yun Zhu
Jindong Chen
RALM
215
30
0
16 Sep 2022
PaLI: A Jointly-Scaled Multilingual Language-Image Model
PaLI: A Jointly-Scaled Multilingual Language-Image Model
Xi Chen
Tianlin Li
Soravit Changpinyo
A. Piergiovanni
Piotr Padlewski
...
Andreas Steiner
A. Angelova
Xiaohua Zhai
N. Houlsby
Radu Soricut
MLLMVLM
211
742
0
14 Sep 2022
PreSTU: Pre-Training for Scene-Text Understanding
PreSTU: Pre-Training for Scene-Text Understanding
Jihyung Kil
Soravit Changpinyo
Xi Chen
Hexiang Hu
Sebastian Goodman
Wei-Lun Chao
Radu Soricut
VLM
191
29
0
12 Sep 2022
MaXM: Towards Multilingual Visual Question Answering
MaXM: Towards Multilingual Visual Question Answering
Soravit Changpinyo
Linting Xue
Michal Yarom
Ashish V. Thapliyal
Idan Szpektor
J. Amelot
Xi Chen
Radu Soricut
117
8
0
12 Sep 2022
Foundations and Trends in Multimodal Machine Learning: Principles,
  Challenges, and Open Questions
Foundations and Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions
Paul Pu Liang
Amir Zadeh
Louis-Philippe Morency
114
89
0
07 Sep 2022
Exploring and Improving the Accessibility of Data Privacy-related
  Information for People Who Are Blind or Low-vision
Exploring and Improving the Accessibility of Data Privacy-related Information for People Who Are Blind or Low-vision
Yuanyuan Feng
Abhilasha Ravichander
Yaxing Yao
Shikun Zhang
Norman M. Sadeh
47
1
0
21 Aug 2022
TAG: Boosting Text-VQA via Text-aware Visual Question-answer Generation
TAG: Boosting Text-VQA via Text-aware Visual Question-answer Generation
Jun Wang
M. Gao
Yuqian Hu
Ramprasaath R. Selvaraju
Chetan Ramaiah
Ran Xu
Joseph Jaja
Larry S. Davis
ViT
72
18
0
03 Aug 2022
Generative Bias for Robust Visual Question Answering
Generative Bias for Robust Visual Question Answering
Jae-Won Cho
Dong-Jin Kim
H. Ryu
In So Kweon
OODCML
100
20
0
01 Aug 2022
Curriculum Learning for Data-Efficient Vision-Language Alignment
Curriculum Learning for Data-Efficient Vision-Language Alignment
Tejas Srinivasan
Xiang Ren
Jesse Thomason
VLM
73
9
0
29 Jul 2022
VizWiz-FewShot: Locating Objects in Images Taken by People With Visual
  Impairments
VizWiz-FewShot: Locating Objects in Images Taken by People With Visual Impairments
Yu-Yun Tseng
Alexander Bell
Danna Gurari
86
8
0
24 Jul 2022
Data Representativeness in Accessibility Datasets: A Meta-Analysis
Data Representativeness in Accessibility Datasets: A Meta-Analysis
Rie Kamikubo
Lining Wang
Crystal Marte
Amnah Mahmood
Hernisa Kacorri
68
18
0
16 Jul 2022
Modern Question Answering Datasets and Benchmarks: A Survey
Modern Question Answering Datasets and Benchmarks: A Survey
Zhen Wang
85
23
0
30 Jun 2022
CLiMB: A Continual Learning Benchmark for Vision-and-Language Tasks
CLiMB: A Continual Learning Benchmark for Vision-and-Language Tasks
Tejas Srinivasan
Ting-Yun Chang
Leticia Pinto-Alva
Georgios Chochlakis
Mohammad Rostami
Jesse Thomason
VLMCLL
103
76
0
18 Jun 2022
Unified-IO: A Unified Model for Vision, Language, and Multi-Modal Tasks
Unified-IO: A Unified Model for Vision, Language, and Multi-Modal Tasks
Jiasen Lu
Christopher Clark
Rowan Zellers
Roozbeh Mottaghi
Aniruddha Kembhavi
ObjDVLMMLLM
171
412
0
17 Jun 2022
Less Is More: Linear Layers on CLIP Features as Powerful VizWiz Model
Less Is More: Linear Layers on CLIP Features as Powerful VizWiz Model
Fabian Deuser
Konrad Habel
Philipp J. Rösch
Norbert Oswald
VLM
18
2
0
10 Jun 2022
GIT: A Generative Image-to-text Transformer for Vision and Language
GIT: A Generative Image-to-text Transformer for Vision and Language
Jianfeng Wang
Zhengyuan Yang
Xiaowei Hu
Linjie Li
Kevin Qinghong Lin
Zhe Gan
Zicheng Liu
Ce Liu
Lijuan Wang
VLM
174
562
0
27 May 2022
Prompt-based Learning for Unpaired Image Captioning
Prompt-based Learning for Unpaired Image Captioning
Peipei Zhu
Tianlin Li
Lin Zhu
Zhenglong Sun
Weishi Zheng
Yaowei Wang
Chen Chen
VLM
97
33
0
26 May 2022
Reassessing Evaluation Practices in Visual Question Answering: A Case
  Study on Out-of-Distribution Generalization
Reassessing Evaluation Practices in Visual Question Answering: A Case Study on Out-of-Distribution Generalization
Aishwarya Agrawal
Ivana Kajić
Emanuele Bugliarello
Elnaz Davoodi
Anita Gergely
Phil Blunsom
Aida Nematzadeh
OOD
92
18
0
24 May 2022
Gender and Racial Bias in Visual Question Answering Datasets
Gender and Racial Bias in Visual Question Answering Datasets
Yusuke Hirota
Yuta Nakashima
Noa Garcia
FaML
187
55
0
17 May 2022
Answer-Me: Multi-Task Open-Vocabulary Visual Question Answering
Answer-Me: Multi-Task Open-Vocabulary Visual Question Answering
A. Piergiovanni
Wei Li
Weicheng Kuo
M. Saffar
Fred Bertsch
A. Angelova
77
16
0
02 May 2022
Flamingo: a Visual Language Model for Few-Shot Learning
Flamingo: a Visual Language Model for Few-Shot Learning
Jean-Baptiste Alayrac
Jeff Donahue
Pauline Luc
Antoine Miech
Iain Barr
...
Mikolaj Binkowski
Ricardo Barreira
Oriol Vinyals
Andrew Zisserman
Karen Simonyan
MLLMVLM
431
3,620
0
29 Apr 2022
Reliable Visual Question Answering: Abstain Rather Than Answer
  Incorrectly
Reliable Visual Question Answering: Abstain Rather Than Answer Incorrectly
Spencer Whitehead
Suzanne Petryk
Vedaad Shakib
Joseph E. Gonzalez
Trevor Darrell
Anna Rohrbach
Marcus Rohrbach
105
56
0
28 Apr 2022
Brainish: Formalizing A Multimodal Language for Intelligence and
  Consciousness
Brainish: Formalizing A Multimodal Language for Intelligence and Consciousness
Paul Pu Liang
78
6
0
14 Apr 2022
Multi-modal Misinformation Detection: Approaches, Challenges and
  Opportunities
Multi-modal Misinformation Detection: Approaches, Challenges and Opportunities
S. Abdali
Sina shaham
Bhaskar Krishnamachari
124
24
0
25 Mar 2022
"It Feels Like Taking a Gamble": Exploring Perceptions, Practices, and
  Challenges of Using Makeup and Cosmetics for People with Visual Impairments
"It Feels Like Taking a Gamble": Exploring Perceptions, Practices, and Challenges of Using Makeup and Cosmetics for People with Visual Impairments
Franklin Mingzhe Li
F. Spektor
Menglin Xia
Mina Huh
Peter Cederberg
Yuqi Gong
Kristen Shinohara
Patrick Carrington
57
25
0
16 Mar 2022
Unpaired Image Captioning by Image-level Weakly-Supervised Visual
  Concept Recognition
Unpaired Image Captioning by Image-level Weakly-Supervised Visual Concept Recognition
Peipei Zhu
Tianlin Li
Yong Luo
Zhenglong Sun
Wei-Shi Zheng
Yaowei Wang
Chen Chen
102
12
0
07 Mar 2022
A Review on Methods and Applications in Multimodal Deep Learning
A Review on Methods and Applications in Multimodal Deep Learning
Summaira Jabeen
Xi Li
Muhammad Shoib Amin
Abdul Jabbar
VLMHAI
73
102
0
18 Feb 2022
A Dataset for Interactive Vision-Language Navigation with Unknown
  Command Feasibility
A Dataset for Interactive Vision-Language Navigation with Unknown Command Feasibility
Andrea Burns
Deniz Arsan
Sanjna Agrawal
Ranjitha Kumar
Kate Saenko
Bryan A. Plummer
131
65
0
04 Feb 2022
Grounding Answers for Visual Questions Asked by Visually Impaired People
Grounding Answers for Visual Questions Asked by Visually Impaired People
Chongyan Chen
Samreen Anjum
Danna Gurari
109
49
0
04 Feb 2022
Feasibility of Interactive 3D Map for Remote Sighted Assistance
Feasibility of Interactive 3D Map for Remote Sighted Assistance
Jingyi Xie
Rui Yu
Sooyeon Lee
Yao Lyu
Syed Masum Billah
John M. Carroll
32
0
0
03 Feb 2022
Question Generation for Evaluating Cross-Dataset Shifts in Multi-modal
  Grounding
Question Generation for Evaluating Cross-Dataset Shifts in Multi-modal Grounding
Arjun Reddy Akula
OOD
116
3
0
24 Jan 2022
LaTr: Layout-Aware Transformer for Scene-Text VQA
LaTr: Layout-Aware Transformer for Scene-Text VQA
Ali Furkan Biten
Ron Litman
Yusheng Xie
Srikar Appalaraju
R. Manmatha
ViT
125
102
0
23 Dec 2021
Understanding and Measuring Robustness of Multimodal Learning
Understanding and Measuring Robustness of Multimodal Learning
Nishant Vishwamitra
Hongxin Hu
Ziming Zhao
Long Cheng
Feng Luo
AAML
86
5
0
22 Dec 2021
3D Question Answering
3D Question Answering
Shuquan Ye
Dongdong Chen
Songfang Han
Jing Liao
ViT
94
49
0
15 Dec 2021
Dual-Key Multimodal Backdoors for Visual Question Answering
Dual-Key Multimodal Backdoors for Visual Question Answering
Matthew Walmer
Karan Sikka
Indranil Sur
Abhinav Shrivastava
Susmit Jha
AAML
78
38
0
14 Dec 2021
MAGMA -- Multimodal Augmentation of Generative Models through
  Adapter-based Finetuning
MAGMA -- Multimodal Augmentation of Generative Models through Adapter-based Finetuning
C. Eichenberg
Sid Black
Samuel Weinbach
Letitia Parcalabescu
Anette Frank
MLLMVLM
72
101
0
09 Dec 2021
From Coarse to Fine-grained Concept based Discrimination for Phrase
  Detection
From Coarse to Fine-grained Concept based Discrimination for Phrase Detection
Maan Qraitem
Bryan A. Plummer
ObjD
64
0
0
06 Dec 2021
AssistSR: Task-oriented Video Segment Retrieval for Personal AI
  Assistant
AssistSR: Task-oriented Video Segment Retrieval for Personal AI Assistant
Stan Weixian Lei
Difei Gao
Yuxuan Wang
Dongxing Mao
Zihan Liang
L. Ran
Mike Zheng Shou
69
8
0
30 Nov 2021
NarrationBot and InfoBot: A Hybrid System for Automated Video Description
Shasta Ihorn
Y. Siu
Aditya Bodi
Lothar D Narins
Jose M. Castanon
Yash Kant
Abhishek Das
Ilmi Yoon
Pooyan Fazli
46
3
0
07 Nov 2021
Single-Modal Entropy based Active Learning for Visual Question Answering
Single-Modal Entropy based Active Learning for Visual Question Answering
Dong-Jin Kim
Jae-Won Cho
Jinsoo Choi
Yunjae Jung
In So Kweon
63
12
0
21 Oct 2021
Evaluating and Improving Interactions with Hazy Oracles
Evaluating and Improving Interactions with Hazy Oracles
Stephan J. Lemmer
Jason J. Corso
41
2
0
19 Oct 2021
Asking questions on handwritten document collections
Asking questions on handwritten document collections
Minesh Mathew
Lluís Gómez
Dimosthenis Karatzas
C. V. Jawahar
RALM
130
11
0
02 Oct 2021
Image Captioning for Effective Use of Language Models in Knowledge-Based
  Visual Question Answering
Image Captioning for Effective Use of Language Models in Knowledge-Based Visual Question Answering
Ander Salaberria
Gorka Azkune
Oier López de Lacalle
Aitor Soroa Etxabe
Eneko Agirre
92
61
0
15 Sep 2021
Sharing Practices for Datasets Related to Accessibility and Aging
Sharing Practices for Datasets Related to Accessibility and Aging
Rie Kamikubo
Utkarsh Dwivedi
Hernisa Kacorri
50
12
0
24 Aug 2021
Localize, Group, and Select: Boosting Text-VQA by Scene Text Modeling
Localize, Group, and Select: Boosting Text-VQA by Scene Text Modeling
Xiaopeng Lu
Zhenhua Fan
Yansen Wang
Jean Oh
Carolyn Rose
84
27
0
20 Aug 2021
Language Grounding with 3D Objects
Language Grounding with 3D Objects
Jesse Thomason
Mohit Shridhar
Yonatan Bisk
Chris Paxton
Luke Zettlemoyer
LM&Ro
88
53
0
26 Jul 2021
Human-Adversarial Visual Question Answering
Human-Adversarial Visual Question Answering
Sasha Sheng
Amanpreet Singh
Vedanuj Goswami
Jose Alberto Lopez Magana
Wojciech Galuba
Devi Parikh
Douwe Kiela
OODEgoVAAML
58
63
0
04 Jun 2021
Recent Advances and Trends in Multimodal Deep Learning: A Review
Recent Advances and Trends in Multimodal Deep Learning: A Review
Jabeen Summaira
Xi Li
Amin Muhammad Shoib
Songyuan Li
Abdul Jabbar
HAI
237
59
0
24 May 2021
Found a Reason for me? Weakly-supervised Grounded Visual Question
  Answering using Capsules
Found a Reason for me? Weakly-supervised Grounded Visual Question Answering using Capsules
Aisha Urooj Khan
Hilde Kuehne
Kevin Duarte
Chuang Gan
N. Lobo
M. Shah
71
36
0
11 May 2021
Previous
123...1011129
Next