2309.13430
Cited By
Resolving References in Visually-Grounded Dialogue via Text Generation
23 September 2023
Bram Willemsen
Livia Qian
Gabriel Skantze
Papers citing "Resolving References in Visually-Grounded Dialogue via Text Generation"
5 / 5 papers shown
Title
Multimodal Coreference Resolution for Chinese Social Media Dialogues: Dataset and Benchmark Approach
Xingyu Li
Chen Gong
Guohong Fu
VGen
29
0
0
19 Apr 2025
Referring Expression Generation in Visually Grounded Dialogue with Discourse-aware Comprehension Guiding
Bram Willemsen
Gabriel Skantze
30
0
0
09 Sep 2024
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
314
4,261
0
30 Jan 2023
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
Steven C. H. Hoi
MLLM
BDL
VLM
CLIP
392
4,154
0
28 Jan 2022
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
337
3,708
0
11 Feb 2021