1812.02378
Cited By
Auto-Encoding Scene Graphs for Image Captioning
6 December 2018
Xu Yang
Kaihua Tang
Hanwang Zhang
Jianfei Cai
Papers citing "Auto-Encoding Scene Graphs for Image Captioning"
50 / 146 papers shown
Title
Tri-FusionNet: Enhancing Image Description Generation with Transformer-based Fusion Network and Dual Attention Mechanism
Lakshita Agarwal
Bindu Verma
ViT
29
0
0
23 Apr 2025
A Causal Adjustment Module for Debiasing Scene Graph Generation
Li Liu
Shuzhou Sun
Shuaifeng Zhi
Fan Shi
Zhen Liu
J. Heikkilä
Yongxiang Liu
CML
59
2
0
22 Mar 2025
Disentangling Fine-Tuning from Pre-Training in Visual Captioning with Hybrid Markov Logic
Monika Shah
Somdeb Sarkhel
Deepak Venugopal
MLLM
BDL
VLM
85
0
0
18 Mar 2025
Unleashing Text-to-Image Diffusion Prior for Zero-Shot Image Captioning
Jianjie Luo
Jingwen Chen
Yehao Li
Yingwei Pan
Jianlin Feng
Hongyang Chao
Ting Yao
DiffM
VLM
53
0
0
03 Jan 2025
Situational Scene Graph for Structured Human-centric Situation Understanding
Chinthani Sugandhika
Chen Li
Deepu Rajan
Basura Fernando
206
1
0
30 Oct 2024
BCTR: Bidirectional Conditioning Transformer for Scene Graph Generation
Peng Hao
Xiaobing Wang
Yingying Jiang
Hanchao Jia
Xiaoshuai Hao
Shaowei Cui
Junhang Wei
57
3
0
26 Jul 2024
Towards Semantic Equivalence of Tokenization in Multimodal LLM
Shengqiong Wu
Hao Fei
Xiangtai Li
Jiayi Ji
Hanwang Zhang
Tat-Seng Chua
Shuicheng Yan
MLLM
65
32
0
07 Jun 2024
Semi-Supervised Image Captioning Considering Wasserstein Graph Matching
Yang Yang
41
0
0
26 Mar 2024
A Comprehensive Survey of 3D Dense Captioning: Localizing and Describing Objects in 3D Scenes
Ting Yu
Xiaojun Lin
Shuhui Wang
Weiguo Sheng
Qingming Huang
Jun-chen Yu
3DV
54
10
0
12 Mar 2024
MeaCap: Memory-Augmented Zero-shot Image Captioning
Zequn Zeng
Yan Xie
Hao Zhang
Chiyu Chen
Zhengjue Wang
Boli Chen
VLM
39
14
0
06 Mar 2024
Action Scene Graphs for Long-Form Understanding of Egocentric Videos
Ivan Rodin
Antonino Furnari
Kyle Min
Subarna Tripathi
G. Farinella
EgoV
27
12
0
06 Dec 2023
Expanding Scene Graph Boundaries: Fully Open-vocabulary Scene Graph Generation via Visual-Concept Alignment and Retention
Zuyao Chen
Jinlin Wu
Zhen Lei
Zhaoxiang Zhang
Changwen Chen
33
11
0
18 Nov 2023
Predicate Classification Using Optimal Transport Loss in Scene Graph Generation
Sorachi Kurita
Satoshi Oyama
Itsuki Noda
OT
32
0
0
19 Sep 2023
With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning
Manuele Barraco
Sara Sarto
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
VLM
55
19
0
23 Aug 2023
The Expressive Power of Graph Neural Networks: A Survey
Bingxue Zhang
Changjun Fan
Shixuan Liu
Kuihua Huang
Xiang Zhao
Jin-Yu Huang
Zhong Liu
40
19
0
16 Aug 2023
Improving Scene Graph Generation with Superpixel-Based Interaction Learning
Jingyi Wang
Can Zhang
Jinfa Huang
Bo Ren
Zhidong Deng
25
7
0
04 Aug 2023
Multimodal Prompt Learning for Product Title Generation with Extremely Limited Labels
Bang-ju Yang
Fenglin Liu
Zheng Li
Qingyu Yin
Chenyu You
Bing Yin
Yuexian Zou
VLM
36
5
0
05 Jul 2023
Weakly-Supervised Learning of Visual Relations in Multimodal Pretraining
Emanuele Bugliarello
Aida Nematzadeh
Lisa Anne Hendricks
SSL
30
5
0
23 May 2023
A request for clarity over the End of Sequence token in the Self-Critical Sequence Training
J. Hu
Roberto Cavicchioli
Alessandro Capotondi
32
6
0
20 May 2023
Information Screening whilst Exploiting! Multimodal Relation Extraction with Feature Denoising and Multimodal Topic Modeling
Shengqiong Wu
Hao Fei
Yixin Cao
Lidong Bing
Tat-Seng Chua
39
31
0
19 May 2023
Textual Explanations for Automated Commentary Driving
Marc Alexander Kühn
Daniel Omeiza
Lars Kunze
30
6
0
12 Apr 2023
SPAN: Learning Similarity between Scene Graphs and Images with Transformers
Yuren Cong
Wentong Liao
Bodo Rosenhahn
M. Yang
35
6
0
02 Apr 2023
Location-Free Scene Graph Generation
Ege Özsoy
Felix Holm
Tobias Czempiel
Benjamin Busam
Nassir Navab
50
4
0
20 Mar 2023
GNNFormer: A Graph-based Framework for Cytopathology Report Generation
Yangqiaoyu Zhou
Kai-Lang Yao
Wusuo Li
MedIm
19
1
0
17 Mar 2023
Knowledge-augmented Few-shot Visual Relation Detection
Tianyu Yu
Yong Li
Jiaoyan Chen
Hai-Tao Zheng
...
Qingbin Liu
Wenqiang Liu
Dongxiao Huang
Bei Wu
Yexin Wang
55
6
0
09 Mar 2023
Graph Neural Networks in Vision-Language Image Understanding: A Survey
Henry Senior
Greg Slabaugh
Shanxin Yuan
Luca Rossi
GNN
33
14
0
07 Mar 2023
ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing
Zequn Zeng
Hao Zhang
Zhengjue Wang
Ruiying Lu
Dongsheng Wang
Bo Chen
BDL
DiffM
19
33
0
04 Mar 2023
Towards Local Visual Modeling for Image Captioning
Yiwei Ma
Jiayi Ji
Xiaoshuai Sun
Yiyi Zhou
Rongrong Ji
ViT
21
71
0
13 Feb 2023
Stacked Cross-modal Feature Consolidation Attention Networks for Image Captioning
Mozhgan Pourkeshavarz
Shahabedin Nabavi
Mohsen Moghaddam
M. Shamsfard
31
4
0
08 Feb 2023
SrTR: Self-reasoning Transformer with Visual-linguistic Knowledge for Scene Graph Generation
Yuxiang Zhang
Zhenbo Liu
Shuai Wang
ReLM
LRM
34
1
0
19 Dec 2022
SceneGATE: Scene-Graph based co-Attention networks for TExt visual question answering
Feiqi Cao
Siwen Luo
F. Núñez
Zean Wen
Josiah Poon
Caren Han
GNN
26
4
0
16 Dec 2022
Semantic-Conditional Diffusion Networks for Image Captioning
Jianjie Luo
Yehao Li
Yingwei Pan
Ting Yao
Jianlin Feng
Hongyang Chao
Tao Mei
DiffM
30
62
0
06 Dec 2022
Multi-Task Edge Prediction in Temporally-Dynamic Video Graphs
Osman Ulger
Julian Wiederer
Mohsen Ghafoorian
Vasileios Belagiannis
Pascal Mettes
43
0
0
06 Dec 2022
How to Describe Images in a More Funny Way? Towards a Modular Approach to Cross-Modal Sarcasm Generation
Jie Ruan
Yue Wu
Xiaojun Wan
Yuesheng Zhu
29
1
0
20 Nov 2022
Probabilistic Debiasing of Scene Graphs
Bashirul Azam Biswas
Qian Ji
22
11
0
11 Nov 2022
Visual Semantic Parsing: From Images to Abstract Meaning Representation
M. A. Abdelsalam
Zhan Shi
Federico Fancellu
Kalliopi Basioti
Dhaivat Bhatt
Vladimir Pavlovic
Afsaneh Fazly
GNN
37
4
0
26 Oct 2022
Prophet Attention: Predicting Attention with Future Attention for Image Captioning
Fenglin Liu
Xuancheng Ren
Xian Wu
Wei Fan
Yuexian Zou
Xu Sun
24
46
0
19 Oct 2022
Swinv2-Imagen: Hierarchical Vision Transformer Diffusion Models for Text-to-Image Generation
Rui Li
Weihua Li
Yi Yang
Hanyu Wei
Jianhua Jiang
Quan-wei Bai
DiffM
27
11
0
18 Oct 2022
Explore Contextual Information for 3D Scene Graph Generation
Yu-An Liu
Chengjiang Long
Zhaoxuan Zhang
Bo Liu
Qiang Zhang
Baocai Yin
Xin Yang
35
10
0
12 Oct 2022
Learning to Collocate Visual-Linguistic Neural Modules for Image Captioning
Xu Yang
Hanwang Zhang
Chongyang Gao
Jianfei Cai
MLLM
40
10
0
04 Oct 2022
Unbiased Scene Graph Generation using Predicate Similarities
Misaki Ohashi
Yusuke Matsui
32
1
0
03 Oct 2022
Towards Open-vocabulary Scene Graph Generation with Prompt-based Finetuning
Tao He
Lianli Gao
Jingkuan Song
Yuan-Fang Li
VLM
34
50
0
17 Aug 2022
Context-aware Mixture-of-Experts for Unbiased Scene Graph Generation
Liguang Zhou
Yuhongze Zhou
Tin Lun Lam
Yangsheng Xu
EDL
MoE
28
2
0
15 Aug 2022
Rethinking the Evaluation of Unbiased Scene Graph Generation
Xingchen Li
Long Chen
Jian Shao
Shaoning Xiao
Songyang Zhang
Jun Xiao
42
12
0
03 Aug 2022
Integrating Object-aware and Interaction-aware Knowledge for Weakly Supervised Scene Graph Generation
Xingchen Li
Long Chen
Wenbo Ma
Yi Yang
Jun Xiao
21
26
0
03 Aug 2022
GRIT: Faster and Better Image captioning Transformer Using Dual Visual Features
Van-Quang Nguyen
Masanori Suganuma
Takayuki Okatani
ViT
36
106
0
20 Jul 2022
Exploring the sequence length bottleneck in the Transformer for Image Captioning
Jiapeng Hu
Roberto Cavicchioli
Alessandro Capotondi
ViT
38
3
0
07 Jul 2022
Comprehending and Ordering Semantics for Image Captioning
Yehao Li
Yingwei Pan
Ting Yao
Tao Mei
26
88
0
14 Jun 2022
Visual Transformer for Object Detection
M. Yang
ViT
25
6
0
01 Jun 2022
Importance Weighted Structure Learning for Scene Graph Generation
Daqing Liu
M. Bober
J. Kittler
27
5
0
14 May 2022