2201.05078
CLIP-Event: Connecting Text and Images with Event Structures
13 January 2022
Manling Li
Ruochen Xu
Shuohang Wang
Luowei Zhou
Xudong Lin
Chenguang Zhu
Michael Zeng
Heng Ji
Shih-Fu Chang
VLM
CLIP
Papers citing
"CLIP-Event: Connecting Text and Images with Event Structures"
15 / 65 papers shown
Title
Event Extraction: A Survey
Viet Dac Lai
23
9
0
07 Oct 2022
Ambiguous Images With Human Judgments for Robust Visual Event Classification
Kate Sanders
Reno Kriz
Anqi Liu
Benjamin Van Durme
65
12
0
06 Oct 2022
GAMA: Generative Adversarial Multi-Object Scene Attacks
Abhishek Aich
Calvin-Khang Ta
Akash Gupta
Chengyu Song
S. Krishnamurthy
M. Salman Asif
A. Roy-Chowdhury
AAML
51
17
0
20 Sep 2022
Foundations and Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions
Paul Pu Liang
Amir Zadeh
Louis-Philippe Morency
18
60
0
07 Sep 2022
Learning Visual Representation from Modality-Shared Contrastive Language-Image Pre-training
Haoxuan You
Luowei Zhou
Bin Xiao
Noel Codella
Yu Cheng
Ruochen Xu
Shih-Fu Chang
Lu Yuan
CLIP
VLM
24
48
0
26 Jul 2022
Probing Visual-Audio Representation for Video Highlight Detection via Hard-Pairs Guided Contrastive Learning
Shuaicheng Li
Feng Zhang
Kunlin Yang
Lin-Na Liu
Shinan Liu
Jun Hou
Shuai Yi
45
8
0
21 Jun 2022
Beyond Grounding: Extracting Fine-Grained Event Hierarchies Across Modalities
Hammad A. Ayyubi
Christopher Thomas
Lovish Chum
R. Lokesh
Long Chen
...
Xudong Lin
Xuande Feng
Jaywon Koo
Sounak Ray
Shih-Fu Chang
AI4TS
31
0
0
14 Jun 2022
Multimodal Learning with Transformers: A Survey
P. Xu
Xiatian Zhu
David A. Clifton
ViT
72
527
0
13 Jun 2022
Towards Fast Adaptation of Pretrained Contrastive Models for Multi-channel Video-Language Retrieval
Xudong Lin
Simran Tiwari
Shiyuan Huang
Manling Li
Mike Zheng Shou
Heng Ji
Shih-Fu Chang
30
20
0
05 Jun 2022
Translation between Molecules and Natural Language
Carl Edwards
T. Lai
Kevin Ros
Garrett Honke
Kyunghyun Cho
Heng Ji
33
157
0
25 Apr 2022
Multi-Modal Few-Shot Object Detection with Meta-Learning-Based Cross-Modal Prompting
G. Han
Long Chen
Jiawei Ma
Shiyuan Huang
Ramalingam Chellappa
Shih-Fu Chang
VLM
32
20
0
16 Apr 2022
Deep Multi-Modal Structural Equations For Causal Effect Estimation With Unstructured Proxies
Shachi Deshpande
Kaiwen Wang
Dhruv Sreenivas
Zheng Li
Volodymyr Kuleshov
CML
SyDa
16
11
0
18 Mar 2022
Leveraging Visual Knowledge in Language Tasks: An Empirical Study on Intermediate Pre-training for Cross-modal Knowledge Transfer
Woojeong Jin
Dong-Ho Lee
Chenguang Zhu
Jay Pujara
Xiang Ren
CLIP
VLM
11
9
0
14 Mar 2022
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
322
3,708
0
11 Feb 2021
Unified Vision-Language Pre-Training for Image Captioning and VQA
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
252
927
0
24 Sep 2019