Violet: A Vision-Language Model for Arabic Image Captioning with Gemini Decoder

15 November 2023

Papers citing "Violet: A Vision-Language Model for Arabic Image Captioning with Gemini Decoder"

Title
JEEM: Vision-Language Understanding in Four Arabic Dialects | Karima Kadaoui, Hanin Atwany, Hamdan Al-Ali, Abdelrahman Mohamed, Ali Mekky, Sergei Tilga, Natalia Fedorova, Ekaterina Artemova, Hanan Aldarmaki, Yova Kementchedjhieva | VLM | 51 1 0 | 27 Mar 2025
Surveying the Landscape of Image Captioning Evaluation: A Comprehensive Taxonomy, Trends and Metrics Analysis | Uri Berger, Gabriel Stanovsky, Omri Abend, Lea Frermann | | 35 0 0 | 09 Aug 2024
Dallah: A Dialect-Aware Multimodal Large Language Model for Arabic | Fakhraddin Alwajih, Gagan Bhatia, Muhammad Abdul-Mageed | | 37 5 0 | 25 Jul 2024
Image captioning in different languages | Emiel van Miltenburg | VLM | 39 0 0 | 31 May 2024
Peacock: A Family of Arabic Multimodal Large Language Models and Benchmarks | Fakhraddin Alwajih, El Moatez Billah Nagoudi, Gagan Bhatia, Abdelrahman Mohamed, Muhammad Abdul-Mageed | VLM, LRM | 35 11 0 | 01 Mar 2024
Training language models to follow instructions with human feedback | Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe | OSLM, ALM | 319 11,953 0 | 04 Mar 2022
From Show to Tell: A Survey on Deep Learning-based Image Captioning | Matteo Stefanini, Marcella Cornia, Lorenzo Baraldi, S. Cascianelli, G. Fiameni, Rita Cucchiara | 3DV, VLM, MLLM | 67 254 0 | 14 Jul 2021