ResearchTrend.AI
Expressing Visual Relationships via Language

18 June 2019
Hao Tan
Franck Dernoncourt
Zhe-nan Lin
Trung Bui
Joey Tianyi Zhou

Papers citing "Expressing Visual Relationships via Language"

13 / 13 papers shown
Generative Multimodal Pretraining with Discrete Diffusion Timestep Tokens
Kaihang Pan
Wang Lin
Zhongqi Yue
Tenglong Ao
Liyu Jia
Wei Zhao
Juncheng Billy Li
Siliang Tang
Hanwang Zhang
58
2
0
20 Apr 2025
Breaking Language Barriers in Visual Language Models via Multilingual Textual Regularization
Iñigo Pikabea
Iñaki Lacunza
Oriol Pareras
Carlos Escolano
Aitor Gonzalez-Agirre
Javier Hernando
Marta Villegas
VLM
58
0
0
28 Mar 2025
Img-Diff: Contrastive Data Synthesis for Multimodal Large Language Models
Qirui Jiao
Daoyuan Chen
Yilun Huang
Yaliang Li
Ying Shen
VLM
45
5
0
08 Aug 2024
Distractors-Immune Representation Learning with Cross-modal Contrastive Regularization for Change Captioning
Yunbin Tu
Liang-Sheng Li
Li Su
Chenggang Yan
Qin Huang
50
5
0
16 Jul 2024
Towards Semantic Equivalence of Tokenization in Multimodal LLM
Shengqiong Wu
Hao Fei
Xiangtai Li
Jiayi Ji
Hanwang Zhang
Tat-Seng Chua
Shuicheng Yan
MLLM
65
32
0
07 Jun 2024
Self-supervised Cross-view Representation Reconstruction for Change Captioning
Yunbin Tu
Liang Li
Filippos Christianos
Zheng-Jun Zha
Zhibin Li
Qingming Huang
SSL
29
24
0
28 Sep 2023
Visual Instruction Tuning with Polite Flamingo
Delong Chen
Jianfeng Liu
Wenliang Dai
Baoyuan Wang
MLLM
36
42
0
03 Jul 2023
Neighborhood Contrastive Transformer for Change Captioning
Yunbin Tu
Liang Li
Li Su
Kelvin Lu
Qin Huang
ViT
24
14
0
06 Mar 2023
CLIP4IDC: CLIP for Image Difference Captioning
Zixin Guo
T. Wang
Jorma T. Laaksonen
VLM
29
27
0
01 Jun 2022
Spot the Difference: A Cooperative Object-Referring Game in Non-Perfectly Co-Observable Scene
Duo Zheng
Fandong Meng
Q. Si
Hairun Fan
Zipeng Xu
Jie Zhou
Fangxiang Feng
Xiaojie Wang
27
0
0
16 Mar 2022
CAISE: Conversational Agent for Image Search and Editing
Hyounghun Kim
Doo Soon Kim
Seunghyun Yoon
Franck Dernoncourt
Trung Bui
Joey Tianyi Zhou
27
6
0
24 Feb 2022
R$^3$Net: Relation-embedded Representation Reconstruction Network for Change Captioning
Yunbin Tu
Liang Li
C. Yan
Shengxiang Gao
Zhengtao Yu
35
22
0
20 Oct 2021
Neural Naturalist: Generating Fine-Grained Image Comparisons
Maxwell Forbes
Christine Kaeser-Chen
Piyush Sharma
Serge J. Belongie
VLM
64
56
0
09 Sep 2019