
Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning

24 November 2022
Yatai Ji
Rong-Cheng Tu
Jie Jiang
Weijie Kong
Chengfei Cai
Wenzhe Zhao
Hongfa Wang
Yujiu Yang
Wei Liu
    VLM

Papers citing "Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning"

11 / 11 papers shown
MLLM-Guided VLM Fine-Tuning with Joint Inference for Zero-Shot Composed Image Retrieval
Rong-Cheng Tu
Zhao Jin
Jingyi Liao
Xiao Luo
Yingjie Wang
Li Shen
Dacheng Tao
117
0
0
26 May 2025
AsymRnR: Video Diffusion Transformers Acceleration with Asymmetric Reduction and Restoration
Wenhao Sun
Rong-Cheng Tu
Jingyi Liao
Zhao Jin
Dacheng Tao
VGen
259
1
0
16 Dec 2024
IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model
Yatai Ji
Shilong Zhang
Jie Wu
Peize Sun
Weifeng Chen
Xuefeng Xiao
Sidi Yang
Yanting Yang
Ping Luo
VLM
80
4
0
10 Jul 2024
One-Stage Open-Vocabulary Temporal Action Detection Leveraging Temporal Multi-scale and Action Label Features
Trung Thanh Nguyen
Yasutomo Kawanishi
Takahiro Komamizu
Ichiro Ide
VLM
72
3
0
30 Apr 2024
VISLA Benchmark: Evaluating Embedding Sensitivity to Semantic and Lexical Alterations
Sri Harsha Dumpala
Aman Jaiswal
Chandramouli Shama Sastry
E. Milios
Sageev Oore
Hassan Sajjad
VLM CoGe
113
0
0
25 Apr 2024
Text Is MASS: Modeling as Stochastic Embedding for Text-Video Retrieval
Jiamian Wang
Guohao Sun
Pichao Wang
Dongfang Liu
S. Dianat
Majid Rabbani
Raghuveer M. Rao
Zhiqiang Tao
VGen
124
26
0
26 Mar 2024
Semantics-enhanced Cross-modal Masked Image Modeling for Vision-Language Pre-training
Haowei Liu
Yaya Shi
Haiyang Xu
Chunfen Yuan
Qinghao Ye
...
Mingshi Yan
Ji Zhang
Fei Huang
Bing Li
Weiming Hu
VLM
96
0
0
01 Mar 2024
Masked Contrastive Reconstruction for Cross-modal Medical Image-Report Retrieval
Zeqiang Wei
Kai Jin
Xiuzhuang Zhou
MedIm
72
5
0
26 Dec 2023
TiMix: Text-aware Image Mixing for Effective Vision-Language Pre-training
Chaoya Jiang
Wei Ye
Haiyang Xu
Qinghao Ye
Mingshi Yan
Ji Zhang
Shikun Zhang
CLIP VLM
59
4
0
14 Dec 2023
SA-Attack: Improving Adversarial Transferability of Vision-Language Pre-training Models via Self-Augmentation
Bangyan He
Xiaojun Jia
Siyuan Liang
Tianrui Lou
Yang Liu
Xiaochun Cao
AAML VLM
113
29
0
08 Dec 2023
SINC: Self-Supervised In-Context Learning for Vision-Language Tasks
Yi-Syuan Chen
Yun-Zhu Song
Cheng Yu Yeo
Bei Liu
Jianlong Fu
Hong-Han Shuai
VLM LRM
94
4
0
15 Jul 2023