ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1811.10830
  4. Cited By
From Recognition to Cognition: Visual Commonsense Reasoning

From Recognition to Cognition: Visual Commonsense Reasoning

27 November 2018
Rowan Zellers
Yonatan Bisk
Ali Farhadi
Yejin Choi
    LRM
    BDL
    OCL
    ReLM
ArXivPDFHTML

Papers citing "From Recognition to Cognition: Visual Commonsense Reasoning"

50 / 587 papers shown
Title
To Root Artificial Intelligence Deeply in Basic Science for a New
  Generation of AI
To Root Artificial Intelligence Deeply in Basic Science for a New Generation of AI
Jingan Yang
Ya-li Peng
28
0
0
11 Sep 2020
A Dataset and Baselines for Visual Question Answering on Art
A Dataset and Baselines for Visual Question Answering on Art
Noa Garcia
Chentao Ye
Zihua Liu
Qingtao Hu
Mayu Otani
Chenhui Chu
Yuta Nakashima
Teruko Mitamura
CoGe
8
52
0
28 Aug 2020
AiR: Attention with Reasoning Capability
AiR: Attention with Reasoning Capability
Shi Chen
Ming Jiang
Jinhui Yang
Qi Zhao
LRM
13
36
0
28 Jul 2020
Who Watches the Watchmen? A Review of Subjective Approaches for
  Sybil-resistance in Proof of Personhood Protocols
Who Watches the Watchmen? A Review of Subjective Approaches for Sybil-resistance in Proof of Personhood Protocols
Divya Siddarth
S. Ivliev
S. Siri
Paula Berman
11
30
0
26 Jul 2020
Towards Debiasing Sentence Representations
Towards Debiasing Sentence Representations
Paul Pu Liang
Irene Z Li
Emily Zheng
Y. Lim
Ruslan Salakhutdinov
Louis-Philippe Morency
18
231
0
16 Jul 2020
Reducing Language Biases in Visual Question Answering with
  Visually-Grounded Question Encoder
Reducing Language Biases in Visual Question Answering with Visually-Grounded Question Encoder
K. Gouthaman
Anurag Mittal
50
78
0
13 Jul 2020
The Impact of Explanations on AI Competency Prediction in VQA
The Impact of Explanations on AI Competency Prediction in VQA
Kamran Alipour
Arijit Ray
Xiaoyu Lin
J. Schulze
Yi Yao
Giedrius Burachas
24
9
0
02 Jul 2020
ERNIE-ViL: Knowledge Enhanced Vision-Language Representations Through
  Scene Graph
ERNIE-ViL: Knowledge Enhanced Vision-Language Representations Through Scene Graph
Fei Yu
Jiji Tang
Weichong Yin
Yu Sun
Hao Tian
Hua Wu
Haifeng Wang
31
376
0
30 Jun 2020
Neuro-Symbolic Visual Reasoning: Disentangling "Visual" from "Reasoning"
Neuro-Symbolic Visual Reasoning: Disentangling "Visual" from "Reasoning"
Saeed Amizadeh
Hamid Palangi
Oleksandr Polozov
Yichen Huang
K. Koishida
NAI
LRM
39
58
0
20 Jun 2020
Learning Visual Commonsense for Robust Scene Graph Generation
Learning Visual Commonsense for Robust Scene Graph Generation
Alireza Zareian
Zhecan Wang
Haoxuan You
Shih-Fu Chang
27
312
0
17 Jun 2020
Learning from the Scene and Borrowing from the Rich: Tackling the Long
  Tail in Scene Graph Generation
Learning from the Scene and Borrowing from the Rich: Tackling the Long Tail in Scene Graph Generation
Tao He
Lianli Gao
Jingkuan Song
Jianfei Cai
Yuan-Fang Li
24
30
0
13 Jun 2020
VirTex: Learning Visual Representations from Textual Annotations
VirTex: Learning Visual Representations from Textual Annotations
Karan Desai
Justin Johnson
SSL
VLM
30
432
0
11 Jun 2020
Large-Scale Adversarial Training for Vision-and-Language Representation
  Learning
Large-Scale Adversarial Training for Vision-and-Language Representation Learning
Zhe Gan
Yen-Chun Chen
Linjie Li
Chen Zhu
Yu Cheng
Jingjing Liu
ObjD
VLM
35
488
0
11 Jun 2020
Counterfactual VQA: A Cause-Effect Look at Language Bias
Counterfactual VQA: A Cause-Effect Look at Language Bias
Yulei Niu
Kaihua Tang
Hanwang Zhang
Zhiwu Lu
Xiansheng Hua
Ji-Rong Wen
CML
53
394
0
08 Jun 2020
Give Me Something to Eat: Referring Expression Comprehension with
  Commonsense Knowledge
Give Me Something to Eat: Referring Expression Comprehension with Commonsense Knowledge
Peng Wang
Dongyang Liu
Hui Li
Qi Wu
ObjD
24
19
0
02 Jun 2020
Behind the Scene: Revealing the Secrets of Pre-trained
  Vision-and-Language Models
Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models
Jize Cao
Zhe Gan
Yu Cheng
Licheng Yu
Yen-Chun Chen
Jingjing Liu
VLM
19
127
0
15 May 2020
Scones: Towards Conversational Authoring of Sketches
Scones: Towards Conversational Authoring of Sketches
Forrest Huang
E. Schoop
David R Ha
John F. Canny
21
24
0
12 May 2020
What-if I ask you to explain: Explaining the effects of perturbations in
  procedural text
What-if I ask you to explain: Explaining the effects of perturbations in procedural text
Dheeraj Rajagopal
Niket Tandon
Bhavana Dalvi
Peter Clarke
Eduard H. Hovy
23
14
0
04 May 2020
Probing Contextual Language Models for Common Ground with Visual
  Representations
Probing Contextual Language Models for Common Ground with Visual Representations
Gabriel Ilharco
Rowan Zellers
Ali Farhadi
Hannaneh Hajishirzi
30
14
0
01 May 2020
XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning
XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning
E. Ponti
Goran Glavavs
Olga Majewska
Qianchu Liu
Ivan Vulić
Anna Korhonen
LRM
15
305
0
01 May 2020
Visuo-Linguistic Question Answering (VLQA) Challenge
Visuo-Linguistic Question Answering (VLQA) Challenge
Shailaja Keyur Sampat
Yezhou Yang
Chitta Baral
CoGe
13
1
0
01 May 2020
HERO: Hierarchical Encoder for Video+Language Omni-representation
  Pre-training
HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training
Linjie Li
Yen-Chun Chen
Yu Cheng
Zhe Gan
Licheng Yu
Jingjing Liu
MLLM
VLM
OffRL
AI4TS
43
493
0
01 May 2020
Crisscrossed Captions: Extended Intramodal and Intermodal Semantic
  Similarity Judgments for MS-COCO
Crisscrossed Captions: Extended Intramodal and Intermodal Semantic Similarity Judgments for MS-COCO
Zarana Parekh
Jason Baldridge
Daniel Cer
Austin Waters
Yinfei Yang
11
61
0
30 Apr 2020
Explainable Deep Learning: A Field Guide for the Uninitiated
Explainable Deep Learning: A Field Guide for the Uninitiated
Gabrielle Ras
Ning Xie
Marcel van Gerven
Derek Doran
AAML
XAI
41
371
0
30 Apr 2020
VD-BERT: A Unified Vision and Dialog Transformer with BERT
VD-BERT: A Unified Vision and Dialog Transformer with BERT
Yue Wang
Chenyu You
Michael R. Lyu
Irwin King
Caiming Xiong
Guosheng Lin
24
102
0
28 Apr 2020
PuzzLing Machines: A Challenge on Learning From Small Data
PuzzLing Machines: A Challenge on Learning From Small Data
Gözde Gül Sahin
Yova Kementchedjhieva
Phillip Rust
Iryna Gurevych
AAML
LRM
15
14
0
27 Apr 2020
VisualCOMET: Reasoning about the Dynamic Context of a Still Image
VisualCOMET: Reasoning about the Dynamic Context of a Still Image
J. S. Park
Chandra Bhagavatula
Roozbeh Mottaghi
Ali Farhadi
Yejin Choi
ReLM
LRM
24
6
0
22 Apr 2020
Experience Grounds Language
Experience Grounds Language
Yonatan Bisk
Ari Holtzman
Jesse Thomason
Jacob Andreas
Yoshua Bengio
...
Angeliki Lazaridou
Jonathan May
Aleksandr Nisnevich
Nicolas Pinto
Joseph P. Turian
21
351
0
21 Apr 2020
Learning to Scale Multilingual Representations for Vision-Language Tasks
Learning to Scale Multilingual Representations for Vision-Language Tasks
Andrea Burns
Donghyun Kim
Derry Wijaya
Kate Saenko
Bryan A. Plummer
13
35
0
09 Apr 2020
Understanding Knowledge Gaps in Visual Question Answering: Implications
  for Gap Identification and Testing
Understanding Knowledge Gaps in Visual Question Answering: Implications for Gap Identification and Testing
Goonmeet Bajaj
Bortik Bandyopadhyay
Daniela Schmidt
Pranav Maneriker
Christopher Myers
Srinivasan Parthasarathy
23
1
0
08 Apr 2020
Generating Rationales in Visual Question Answering
Generating Rationales in Visual Question Answering
Hammad A. Ayyubi
Md. Mehrab Tanjim
Julian McAuley
G. Cottrell
LRM
14
5
0
04 Apr 2020
Benchmarking Machine Reading Comprehension: A Psychological Perspective
Benchmarking Machine Reading Comprehension: A Psychological Perspective
Saku Sugawara
Pontus Stenetorp
Akiko Aizawa
16
2
0
04 Apr 2020
SPARE3D: A Dataset for SPAtial REasoning on Three-View Line Drawings
SPARE3D: A Dataset for SPAtial REasoning on Three-View Line Drawings
Wenyu Han
Siyuan Xiang
Chenhui Liu
Ruoyu Wang
Chen Feng
14
16
0
31 Mar 2020
Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene
  Text
Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text
Difei Gao
Ke Li
Ruiping Wang
Shiguang Shan
Xilin Chen
16
111
0
31 Mar 2020
Learning Interactions and Relationships between Movie Characters
Learning Interactions and Relationships between Movie Characters
Anna Kukleva
Makarand Tapaswi
Ivan Laptev
41
51
0
29 Mar 2020
Modulating Bottom-Up and Top-Down Visual Processing via
  Language-Conditional Filters
Modulating Bottom-Up and Top-Down Visual Processing via Language-Conditional Filters
.Ilker Kesen
Ozan Arkan Can
Erkut Erdem
Aykut Erdem
Deniz Yuret
VLM
8
1
0
28 Mar 2020
VIOLIN: A Large-Scale Dataset for Video-and-Language Inference
VIOLIN: A Large-Scale Dataset for Video-and-Language Inference
J. Liu
Wenhu Chen
Yu Cheng
Zhe Gan
Licheng Yu
Yiming Yang
Jingjing Liu
MLLM
VGen
43
68
0
25 Mar 2020
Video Object Grounding using Semantic Roles in Language Description
Video Object Grounding using Semantic Roles in Language Description
Arka Sadhu
Kan Chen
Ram Nevatia
18
48
0
24 Mar 2020
Video2Commonsense: Generating Commonsense Descriptions to Enrich Video
  Captioning
Video2Commonsense: Generating Commonsense Descriptions to Enrich Video Captioning
Zhiyuan Fang
Tejas Gokhale
Pratyay Banerjee
Chitta Baral
Yezhou Yang
20
60
0
11 Mar 2020
XGPT: Cross-modal Generative Pre-Training for Image Captioning
XGPT: Cross-modal Generative Pre-Training for Image Captioning
Qiaolin Xia
Haoyang Huang
Nan Duan
Dongdong Zhang
Lei Ji
Zhifang Sui
Edward Cui
Taroon Bharti
Xin Liu
Ming Zhou
MLLM
VLM
25
74
0
03 Mar 2020
Visual Commonsense R-CNN
Visual Commonsense R-CNN
Tan Wang
Jianqiang Huang
Hanwang Zhang
Qianru Sun
SSL
ObjD
CML
16
245
0
27 Feb 2020
VQA-LOL: Visual Question Answering under the Lens of Logic
VQA-LOL: Visual Question Answering under the Lens of Logic
Tejas Gokhale
Pratyay Banerjee
Chitta Baral
Yezhou Yang
CoGe
22
73
0
19 Feb 2020
Solving Raven's Progressive Matrices with Neural Networks
Solving Raven's Progressive Matrices with Neural Networks
Tao Zhuo
Mohan S. Kankanhalli
27
26
0
05 Feb 2020
TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval
TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval
Jie Lei
Licheng Yu
Tamara L. Berg
Joey Tianyi Zhou
119
275
0
24 Jan 2020
ImageBERT: Cross-modal Pre-training with Large-scale Weak-supervised
  Image-Text Data
ImageBERT: Cross-modal Pre-training with Large-scale Weak-supervised Image-Text Data
Di Qi
Lin Su
Jianwei Song
Edward Cui
Taroon Bharti
Arun Sacheti
VLM
40
259
0
22 Jan 2020
Accuracy vs. Complexity: A Trade-off in Visual Question Answering Models
Accuracy vs. Complexity: A Trade-off in Visual Question Answering Models
M. Farazi
Salman H. Khan
Nick Barnes
23
17
0
20 Jan 2020
Weakly Supervised Visual Semantic Parsing
Weakly Supervised Visual Semantic Parsing
Alireza Zareian
Svebor Karaman
Shih-Fu Chang
GNN
25
57
0
08 Jan 2020
Connecting Vision and Language with Localized Narratives
Connecting Vision and Language with Localized Narratives
Jordi Pont-Tuset
J. Uijlings
Soravit Changpinyo
Radu Soricut
V. Ferrari
ObjD
28
241
0
06 Dec 2019
Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art
  Baseline
Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline
Vishvak Murahari
Dhruv Batra
Devi Parikh
Abhishek Das
VLM
23
115
0
05 Dec 2019
12-in-1: Multi-Task Vision and Language Representation Learning
12-in-1: Multi-Task Vision and Language Representation Learning
Jiasen Lu
Vedanuj Goswami
Marcus Rohrbach
Devi Parikh
Stefan Lee
VLM
ObjD
38
476
0
05 Dec 2019
Previous
123...101112
Next