Mirasol3B: A Multimodal Autoregressive model for time-aligned and contextual modalities

Papers citing "Mirasol3B: A Multimodal Autoregressive model for time-aligned and contextual modalities"

15 / 15 papers shown
video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models
Guangzhi Sun, Wenyi Yu, Changli Tang, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Yuxuan Wang, Chao Zhang
22 Jun 2024