ShapeWorld - A new test methodology for multimodal language understanding

14 April 2017

Papers citing "ShapeWorld - A new test methodology for multimodal language understanding"

Title
What Do VLMs NOTICE? A Mechanistic Interpretability Pipeline for Gaussian-Noise-free Text-Image Corruption and Evaluation | Michal Golovanevsky, William Rudman, Vedant Palit, Ritambhara Singh, Carsten Eickhoff | - | 33 1 0 | 24 Jun 2024
Testing the Depth of ChatGPT's Comprehension via Cross-Modal Tasks Based on ASCII-Art: GPT3.5's Abilities in Regard to Recognizing and Generating ASCII-Art Are Not Totally Lacking | David Bayani | MLLM | 36 5 0 | 28 Jul 2023
Scalable Neural-Probabilistic Answer Set Programming | Arseny Skryagin, Daniel Ochs, Devendra Singh Dhami, Kristian Kersting | - | 42 5 0 | 14 Jun 2023
Color Overmodification Emerges from Data-Driven Learning and Pragmatic Reasoning | Fei Fang, Kunal Sinha, Noah D. Goodman, Christopher Potts, Elisa Kreiss | - | 28 1 0 | 18 May 2022
RelViT: Concept-guided Vision Transformer for Visual Relational Reasoning | Xiaojian Ma, Weili Nie, Zhiding Yu, Huaizu Jiang, Chaowei Xiao, Yuke Zhu, Song-Chun Zhu, Anima Anandkumar | ViT LRM | 30 19 0 | 24 Apr 2022
Explanatory Learning: Beyond Empiricism in Neural Networks | Antonio Norelli, Giorgio Mariani, Luca Moschella, Andrea Santilli, Giambattista Parascandolo, Simone Melzi, Emanuele Rodolà | - | 14 2 0 | 25 Jan 2022
Open-domain clarification question generation without question examples | Julia White, Gabriel Poesia, Robert D. Hawkins, Dorsa Sadigh, Noah D. Goodman | - | 30 23 0 | 19 Oct 2021
SLASH: Embracing Probabilistic Circuits into Neural Answer Set Programming | Arseny Skryagin, Wolfgang Stammer, Daniel Ochs, Devendra Singh Dhami, Kristian Kersting | NAI | 33 6 0 | 07 Oct 2021
Robotic Occlusion Reasoning for Efficient Object Existence Prediction | Mengdi Li, C. Weber, Matthias Kerzel, Jae Hee Lee, Zheni Zeng, Zhiyuan Liu, S. Wermter | - | 24 7 0 | 26 Jul 2021
Rich Semantics Improve Few-shot Learning | Mohamed Afham, Salman Khan, Muhammad Haris Khan, Muzammal Naseer, Fahad Shahbaz Khan | VLM | 34 24 0 | 26 Apr 2021
KANDINSKYPatterns -- An experimental exploration environment for Pattern Analysis and Machine Intelligence | Andreas Holzinger, Anna Saranti, Heimo Mueller | - | 46 10 0 | 28 Feb 2021
Learning Transferable Visual Models From Natural Language Supervision | Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya A. Ramesh, Gabriel Goh, ..., Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, Ilya Sutskever | CLIP VLM | 183 27,846 0 | 26 Feb 2021
Ground Truth Evaluation of Neural Network Explanations with CLEVR-XAI | L. Arras, Ahmed Osman, Wojciech Samek | XAI AAML | 21 150 0 | 16 Mar 2020
A Benchmark for Systematic Generalization in Grounded Language Understanding | Laura Ruis, Jacob Andreas, Marco Baroni, Diane Bouchacourt, Brenden M. Lake | - | 19 138 0 | 11 Mar 2020
Going Beneath the Surface: Evaluating Image Captioning for Grammaticality, Truthfulness and Diversity | Huiyuan Xie, Tom Sherborne, A. Kuhnle, Ann A. Copestake | DiffM | 25 9 0 | 19 Dec 2019
What is needed for simple spatial language capabilities in VQA? | A. Kuhnle, Ann A. Copestake | CoGe | 20 1 0 | 17 Aug 2019
Emergent Linguistic Phenomena in Multi-Agent Communication Games | L. Graesser, Kyunghyun Cho, Douwe Kiela | LLMAG AI4CE | 25 70 0 | 25 Jan 2019
How clever is the FiLM model, and how clever can it be? | A. Kuhnle, Huiyuan Xie, Ann A. Copestake | - | 30 6 0 | 09 Sep 2018
A new dataset and model for learning to understand navigational instructions | Ozan Arkan Can, Deniz Yuret | - | 26 1 0 | 21 May 2018
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding | Akira Fukui, Dong Huk Park, Daylen Yang, Anna Rohrbach, Trevor Darrell, Marcus Rohrbach | - | 167 1,465 0 | 06 Jun 2016