ResearchTrend.AI
Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Compositional Understanding
arXiv:2306.08832

15 June 2023
Le Zhang
Rabiul Awal
Aishwarya Agrawal
    CoGe
    VLM

Papers citing "Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Compositional Understanding"

9 / 9 papers shown
Decoupled Global-Local Alignment for Improving Compositional Understanding
Xiaoxing Hu
Kaicheng Yang
Jianmin Wang
Haoran Xu
Ziyong Feng
Yansen Wang
VLM
165
0
0
23 Apr 2025
Object-centric Binding in Contrastive Language-Image Pretraining
Rim Assouel
Pietro Astolfi
Florian Bordes
M. Drozdzal
Adriana Romero Soriano
OCL
VLM
CoGe
103
0
0
19 Feb 2025
Why is Winoground Hard? Investigating Failures in Visuolinguistic Compositionality
Anuj Diwan
Layne Berry
Eunsol Choi
David Harwath
Kyle Mahowald
CoGe
111
41
0
01 Nov 2022
Fine-grained Image Captioning with CLIP Reward
Jaemin Cho
Seunghyun Yoon
Ajinkya Kale
Franck Dernoncourt
Trung Bui
Joey Tianyi Zhou
CLIP
131
76
0
26 May 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
Guosheng Lin
MLLM
BDL
VLM
CLIP
392
4,171
0
28 Jan 2022
Understanding and Improving Robustness of Vision Transformers through Patch-based Negative Augmentation
Yao Qin
Chiyuan Zhang
Ting Chen
Balaji Lakshminarayanan
Alex Beutel
Xuezhi Wang
ViT
50
43
0
15 Oct 2021
Learning to Prompt for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM
CLIP
VLM
348
2,279
0
02 Sep 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
337
3,720
0
11 Feb 2021
A weakly supervised adaptive triplet loss for deep metric learning
Xiaonan Zhao
Huan Qi
R. Luo
Larry S. Davis
DML
35
24
0
27 Sep 2019