2004.15020
Crisscrossed Captions: Extended Intramodal and Intermodal Semantic Similarity Judgments for MS-COCO
30 April 2020
Zarana Parekh
Jason Baldridge
Daniel Cer
Austin Waters
Yinfei Yang
Papers citing "Crisscrossed Captions: Extended Intramodal and Intermodal Semantic Similarity Judgments for MS-COCO"
14 / 14 papers shown
Visualizing Dialogues: Enhancing Image Selection through Dialogue Understanding with Large Language Models
Chang-Sheng Kao
Yun-Nung Chen
23
0
0
04 Jul 2024
EDIS: Entity-Driven Image Search over Multimodal Web Content
Siqi Liu
Weixi Feng
Tsu-jui Fu
Wenhu Chen
Luu Anh Tuan
VLM
48
9
0
23 May 2023
Improving Cross-Modal Retrieval with Set of Diverse Embeddings
Dongwon Kim
Nam-Won Kim
Suha Kwak
24
37
0
30 Nov 2022
MMDialog: A Large-scale Multi-turn Dialogue Dataset Towards Multi-modal Open-domain Conversation
Jiazhan Feng
Qingfeng Sun
Can Xu
Pu Zhao
Yaming Yang
Chongyang Tao
Dongyan Zhao
Qingwei Lin
32
52
0
10 Nov 2022
Image-Text Retrieval with Binary and Continuous Label Supervision
Zheng Li
Caili Guo
Zerun Feng
Lei Li
Ying Jin
Yufeng Zhang
VLM
32
4
0
20 Oct 2022
ERNIE-ViL 2.0: Multi-view Contrastive Learning for Image-Text Pre-training
Bin Shan
Weichong Yin
Yu Sun
Hao Tian
Hua Wu
Haifeng Wang
VLM
27
19
0
30 Sep 2022
ECCV Caption: Correcting False Negatives by Collecting Machine-and-Human-verified Image-Caption Associations for MS-COCO
Sanghyuk Chun
Wonjae Kim
Song Park
Minsuk Chang
Seong Joon Oh
VLM
373
43
0
07 Apr 2022
Wukong: A 100 Million Large-scale Chinese Cross-modal Pre-training Benchmark
Jiaxi Gu
Xiaojun Meng
Guansong Lu
Lu Hou
Minzhe Niu
...
Runhu Huang
Wei Zhang
Xingda Jiang
Chunjing Xu
Hang Xu
VLM
43
88
0
14 Feb 2022
Creating User Interface Mock-ups from High-Level Text Descriptions with Deep-Learning Models
Forrest Huang
Gang Li
Xin Zhou
John F. Canny
Yang Li
DiffM
31
19
0
14 Oct 2021
PhotoChat: A Human-Human Dialogue Dataset with Photo Sharing Behavior for Joint Image-Text Modeling
Xiaoxue Zang
Lijuan Liu
Maria Wang
Yang Song
Hao Zhang
Jindong Chen
VLM
35
55
0
06 Jul 2021
Adversarial Text-to-Image Synthesis: A Review
Stanislav Frolov
Tobias Hinz
Federico Raue
Jörn Hees
Andreas Dengel
EGVM
22
175
0
25 Jan 2021
Cross-Modal Contrastive Learning for Text-to-Image Generation
Han Zhang
Jing Yu Koh
Jason Baldridge
Honglak Lee
Yinfei Yang
GAN
22
355
0
12 Jan 2021
Text-to-Image Generation Grounded by Fine-Grained User Attention
Jing Yu Koh
Jason Baldridge
Honglak Lee
Yinfei Yang
DiffM
27
58
0
07 Nov 2020
Room-Across-Room: Multilingual Vision-and-Language Navigation with Dense Spatiotemporal Grounding
Alexander Ku
Peter Anderson
Roma Patel
Eugene Ie
Jason Baldridge
43
301
0
15 Oct 2020