Kosmos-G: Generating Images in Context with Multimodal Large Language Models

    VLM

Papers citing "Kosmos-G: Generating Images in Context with Multimodal Large Language Models"

Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions
Lingwei Meng
Shujie Hu
Jiawen Kang
Zhaoqing Li
Yuejiao Wang
Wenxuan Wu
Xixin Wu
Xunying Liu
Helen Meng
13 Sep 2024