Reading Is Believing: Revisiting Language Bottleneck Models for Image Classification

22 June 2024

Papers citing "Reading Is Believing: Revisiting Language Bottleneck Models for Image Classification"

16 / 16 papers shown

Title
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models Junnan Li Dongxu Li Silvio Savarese Steven C. H. Hoi VLM MLLM 424 4,539 0 30 Jan 2023
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation Jiahui Yu Yuanzhong Xu Jing Yu Koh Thang Luong Gunjan Baid ... Zarana Parekh Xin Li Han Zhang Jason Baldridge Yonghui Wu EGVM 178 1,117 0 22 Jun 2022
Hierarchical Text-Conditional Image Generation with CLIP Latents Aditya A. Ramesh Prafulla Dhariwal Alex Nichol Casey Chu Mark Chen VLM DiffM 377 6,859 0 13 Apr 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation Junnan Li Dongxu Li Caiming Xiong Guosheng Lin MLLM BDL VLM CLIP 524 4,343 0 28 Jan 2022
Rich Semantics Improve Few-shot Learning Mohamed Afham Salman Khan Muhammad Haris Khan Muzammal Naseer Fahad Shahbaz Khan VLM 81 24 0 26 Apr 2021
Deep Learning Benchmarks and Datasets for Social Media Image Classification for Disaster Response Firoj Alam Ferda Ofli Muhammad Imran Tanvirul Alam U. Qazi VLM 42 53 0 17 Nov 2020
Concept Bottleneck Models Pang Wei Koh Thao Nguyen Y. S. Tang Stephen Mussmann Emma Pierson Been Kim Percy Liang 94 821 0 09 Jul 2020
ExpBERT: Representation Engineering with Natural Language Explanations Shikhar Murty Pang Wei Koh Percy Liang 61 43 0 05 May 2020
A Unified Approach to Interpreting Model Predictions Scott M. Lundberg Su-In Lee FAtt 1.1K 21,864 0 22 May 2017
Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization Ramprasaath R. Selvaraju Michael Cogswell Abhishek Das Ramakrishna Vedantam Devi Parikh Dhruv Batra FAtt 294 20,003 0 07 Oct 2016
"Why Should I Trust You?": Explaining the Predictions of Any Classifier Marco Tulio Ribeiro Sameer Singh Carlos Guestrin FAtt FaML 1.2K 16,954 0 16 Feb 2016
Rethinking the Inception Architecture for Computer Vision Christian Szegedy Vincent Vanhoucke Sergey Ioffe Jonathon Shlens Z. Wojna 3DV BDL 875 27,358 0 02 Dec 2015
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention Ke Xu Jimmy Ba Ryan Kiros Kyunghyun Cho Aaron Courville Ruslan Salakhutdinov R. Zemel Yoshua Bengio DiffM 334 10,069 0 10 Feb 2015
Show and Tell: A Neural Image Caption Generator Oriol Vinyals Alexander Toshev Samy Bengio D. Erhan 3DV 237 6,028 0 17 Nov 2014
Sequence to Sequence Learning with Neural Networks Ilya Sutskever Oriol Vinyals Quoc V. Le AIMat 434 20,541 0 10 Sep 2014
Microsoft COCO: Common Objects in Context Nayeon Lee Michael Maire Serge J. Belongie Lubomir Bourdev Ross B. Girshick James Hays Pietro Perona Deva Ramanan C. L. Zitnick Piotr Dollár ObjD 413 43,638 0 01 May 2014