Improving Visual Commonsense in Language Models via Multiple Image Generation

19 June 2024

Papers citing "Improving Visual Commonsense in Language Models via Multiple Image Generation"

6 / 6 papers shown

Title
The Falcon Series of Open Language Models Ebtesam Almazrouei Hamza Alobeidli Abdulaziz Alshamsi Alessandro Cappelli Ruxandra-Aimée Cojocaru ... Quentin Malartic Daniele Mazzotta Badreddine Noune B. Pannier Guilherme Penedo AI4TS ALM 121 404 0 28 Nov 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models Junnan Li Dongxu Li Silvio Savarese Steven C. H. Hoi VLM MLLM 308 4,261 0 30 Jan 2023
Training language models to follow instructions with human feedback Long Ouyang Jeff Wu Xu Jiang Diogo Almeida Carroll L. Wainwright ... Amanda Askell Peter Welinder Paul Christiano Jan Leike Ryan J. Lowe OSLM ALM 363 12,003 0 04 Mar 2022
Transferring Knowledge from Vision to Language: How to Achieve it and how to Measure it? Tobias Norlund Lovisa Hagström Richard Johansson 32 25 0 23 Sep 2021
Does Vision-and-Language Pretraining Improve Lexical Grounding? Tian Yun Chen Sun Ellie Pavlick VLM CoGe 40 30 0 21 Sep 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling Leo Gao Stella Biderman Sid Black Laurence Golding Travis Hoppe ... Horace He Anish Thite Noa Nabeshima Shawn Presser Connor Leahy AIMat 282 1,996 0 31 Dec 2020