arXiv:2310.11699
MISAR: A Multimodal Instructional System with Augmented Reality
18 October 2023
Jing Bi, Nguyen Nguyen, A. Vosoughi, Chenliang Xu
Papers citing "MISAR: A Multimodal Instructional System with Augmented Reality"
6 / 6 papers shown
"I Can See Forever!": Evaluating Real-time VideoLLMs for Assisting Individuals with Visual Impairments
Zhe Zhang, Zhen Sun, Zhenru Zhang, Zifan Peng, Yuemeng Zhao, Zhilin Wang, Zeren Luo, Ruiting Zuo, Xinlei He
42
0
0
07 May 2025
Caption Anything in Video: Fine-grained Object-centric Captioning via Spatiotemporal Multimodal Prompting
Yunlong Tang, Jing Bi, Chao Huang, Susan Liang, Daiki Shimada, ..., Jinxi He, Liu He, Zeliang Zhang, Jiebo Luo, Chenliang Xu
37
0
0
07 Apr 2025
Llamarine: Open-source Maritime Industry-specific Large Language Model
William Nguyen, An Phan, Konobu Kimura, Hitoshi Maeno, Mika Tanaka, Quynh Le, William Poucher, Christopher Nguyen
LRM
33
0
0
28 Feb 2025
A Closer Look at Weakly-Supervised Audio-Visual Source Localization
Shentong Mo, Pedro Morgado
83
64
0
30 Aug 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li, Dongxu Li, Caiming Xiong, S. Hoi
MLLM, BDL, VLM, CLIP
392
4,137
0
28 Jan 2022
Ego4D: Around the World in 3,000 Hours of Egocentric Video
Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, ..., Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, Jitendra Malik
EgoV
244
1,024
0
13 Oct 2021