ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2003.00387
  4. Cited By
Say As You Wish: Fine-grained Control of Image Caption Generation with
  Abstract Scene Graphs

Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs

1 March 2020
Shizhe Chen
Qin Jin
Peng Wang
Qi Wu
    DiffM
ArXiv (abs)PDFHTML

Papers citing "Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs"

50 / 86 papers shown
Title
From Data to Modeling: Fully Open-vocabulary Scene Graph Generation
From Data to Modeling: Fully Open-vocabulary Scene Graph Generation
Zuyao Chen
Jinlin Wu
Zhen Lei
Chang Wen Chen
53
0
0
26 May 2025
Detailed Object Description with Controllable Dimensions
Detailed Object Description with Controllable Dimensions
Xinran Wang
Hao Zhang
Baoteng Li
Kongming Liang
Hao Sun
Zhongjiang He
Zejun Ma
Jun Guo
161
1
0
28 Nov 2024
ChartKG: A Knowledge-Graph-Based Representation for Chart Images
ChartKG: A Knowledge-Graph-Based Representation for Chart Images
Zhiguang Zhou
Haoxuan Wang
Zhengqing Zhao
Fengling Zheng
Yongheng Wang
Wei Chen
Yong Wang
137
1
0
13 Oct 2024
TROPE: TRaining-Free Object-Part Enhancement for Seamlessly Improving
  Fine-Grained Zero-Shot Image Captioning
TROPE: TRaining-Free Object-Part Enhancement for Seamlessly Improving Fine-Grained Zero-Shot Image Captioning
Joshua Forster Feinglass
Yezhou Yang
67
0
0
30 Sep 2024
Pixels to Prose: Understanding the art of Image Captioning
Pixels to Prose: Understanding the art of Image Captioning
Hrishikesh Singh
Aarti Sharma
Millie Pant
3DVVLM
90
1
0
28 Aug 2024
Fine-grained length controllable video captioning with ordinal
  embeddings
Fine-grained length controllable video captioning with ordinal embeddings
Tomoya Nitta
Takumi Fukuzawa
Toru Tamaki
101
0
0
27 Aug 2024
Ensemble Predicate Decoding for Unbiased Scene Graph Generation
Ensemble Predicate Decoding for Unbiased Scene Graph Generation
Jiasong Feng
Lichun Wang
Hongbo Xu
Kai Xu
Baocai Yin
87
0
0
26 Aug 2024
Surveying the Landscape of Image Captioning Evaluation: A Comprehensive Taxonomy, Trends and Metrics Analysis
Surveying the Landscape of Image Captioning Evaluation: A Comprehensive Taxonomy, Trends and Metrics Analysis
Uri Berger
Gabriel Stanovsky
Omri Abend
Lea Frermann
75
0
0
09 Aug 2024
OpenPSG: Open-set Panoptic Scene Graph Generation via Large Multimodal
  Models
OpenPSG: Open-set Panoptic Scene Graph Generation via Large Multimodal Models
Zijian Zhou
Zheng Zhu
Holger Caesar
Miaojing Shi
VLM
100
3
0
15 Jul 2024
ACTRESS: Active Retraining for Semi-supervised Visual Grounding
ACTRESS: Active Retraining for Semi-supervised Visual Grounding
Weitai Kang
Mengxue Qu
Yunchao Wei
Yan Yan
113
6
0
03 Jul 2024
Visual Grounding with Attention-Driven Constraint Balancing
Visual Grounding with Attention-Driven Constraint Balancing
Weitai Kang
Luowei Zhou
Junyi Wu
Changchang Sun
Yan Yan
80
4
0
03 Jul 2024
SegVG: Transferring Object Bounding Box to Segmentation for Visual
  Grounding
SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding
Weitai Kang
Gaowen Liu
Mubarak Shah
Yan Yan
ObjD
121
9
0
03 Jul 2024
A look under the hood of the Interactive Deep Learning Enterprise
  (No-IDLE)
A look under the hood of the Interactive Deep Learning Enterprise (No-IDLE)
Daniel Sonntag
Michael Barz
Thiago S. Gouvêa
VLM
112
4
0
27 Jun 2024
Exploring the Distinctiveness and Fidelity of the Descriptions Generated
  by Large Vision-Language Models
Exploring the Distinctiveness and Fidelity of the Descriptions Generated by Large Vision-Language Models
Yuhang Huang
Zihan Wu
Chongyang Gao
Jiawei Peng
Xu Yang
75
2
0
26 Apr 2024
Cycle-Consistency Learning for Captioning and Grounding
Cycle-Consistency Learning for Captioning and Grounding
Ning Wang
Jiajun Deng
Mingbo Jia
ObjD
96
8
0
23 Dec 2023
Leveraging VLM-Based Pipelines to Annotate 3D Objects
Leveraging VLM-Based Pipelines to Annotate 3D Objects
Rishabh Kabra
Loic Matthey
Alexander Lerchner
Niloy J. Mitra
113
6
0
29 Nov 2023
VLPrompt: Vision-Language Prompting for Panoptic Scene Graph Generation
VLPrompt: Vision-Language Prompting for Panoptic Scene Graph Generation
Zijian Zhou
Miaojing Shi
Holger Caesar
VLM
125
12
0
27 Nov 2023
Generating Human-Centric Visual Cues for Human-Object Interaction
  Detection via Large Vision-Language Models
Generating Human-Centric Visual Cues for Human-Object Interaction Detection via Large Vision-Language Models
Yu-Wei Zhan
Fan Liu
Xin Luo
Liqiang Nie
Xin-Shun Xu
Mohan S. Kankanhalli
VLM
94
2
0
26 Nov 2023
Expanding Scene Graph Boundaries: Fully Open-vocabulary Scene Graph
  Generation via Visual-Concept Alignment and Retention
Expanding Scene Graph Boundaries: Fully Open-vocabulary Scene Graph Generation via Visual-Concept Alignment and Retention
Zuyao Chen
Jinlin Wu
Zhen Lei
Zhaoxiang Zhang
Changwen Chen
116
18
0
18 Nov 2023
Improving Image Captioning via Predicting Structured Concepts
Improving Image Captioning via Predicting Structured Concepts
Ting Wang
Weidong Chen
Yuanhe Tian
Yan Song
Zhendong Mao
88
8
0
14 Nov 2023
Act As You Wish: Fine-Grained Control of Motion Diffusion Model with
  Hierarchical Semantic Graphs
Act As You Wish: Fine-Grained Control of Motion Diffusion Model with Hierarchical Semantic Graphs
Peng Jin
Yang Wu
Yanbo Fan
Zhongqian Sun
Yang Wei
Li-ming Yuan
DiffM
98
31
0
02 Nov 2023
Towards Complex-query Referring Image Segmentation: A Novel Benchmark
Towards Complex-query Referring Image Segmentation: A Novel Benchmark
Wei Ji
Li Li
Marco Pleines
Xiangyan Liu
Xu Yang
Juncheng Billy Li
Roger Zimmermann
70
8
0
29 Sep 2023
Zero-Shot Scene Graph Generation via Triplet Calibration and Reduction
Zero-Shot Scene Graph Generation via Triplet Calibration and Reduction
Jiankai Li
Yunhong Wang
Weixin Li
107
2
0
07 Sep 2023
Head-Tail Cooperative Learning Network for Unbiased Scene Graph
  Generation
Head-Tail Cooperative Learning Network for Unbiased Scene Graph Generation
Lei Wang
Zejian Yuan
Yao Lu
Badong Chen
82
0
0
23 Aug 2023
Explore and Tell: Embodied Visual Captioning in 3D Environments
Explore and Tell: Embodied Visual Captioning in 3D Environments
Anwen Hu
Shizhe Chen
Liang Zhang
Qin Jin
LM&Ro
85
2
0
21 Aug 2023
Constructing Holistic Spatio-Temporal Scene Graph for Video Semantic
  Role Labeling
Constructing Holistic Spatio-Temporal Scene Graph for Video Semantic Role Labeling
Yu Zhao
Hao Fei
Yixin Cao
Bobo Li
Meishan Zhang
Jianguo Wei
Hao Fei
Tat-Seng Chua
90
17
0
09 Aug 2023
Environment-Invariant Curriculum Relation Learning for Fine-Grained
  Scene Graph Generation
Environment-Invariant Curriculum Relation Learning for Fine-Grained Scene Graph Generation
Yu Min
Aming Wu
Cheng Deng
96
7
0
07 Aug 2023
Panoptic Scene Graph Generation with Semantics-Prototype Learning
Panoptic Scene Graph Generation with Semantics-Prototype Learning
Li Li
Wei Ji
Yiming Wu
Meng Li
Youxuan Qin
Lina Wei
Roger Zimmermann
100
38
0
28 Jul 2023
Pair then Relation: Pair-Net for Panoptic Scene Graph Generation
Pair then Relation: Pair-Net for Panoptic Scene Graph Generation
Jinghao Wang
Zhengyu Wen
Xiangtai Li
Zujin Guo
Jingkang Yang
Ziwei Liu
95
18
0
17 Jul 2023
Complementary Frequency-Varying Awareness Network for Open-Set Fine-Grained Image Recognition
Complementary Frequency-Varying Awareness Network for Open-Set Fine-Grained Image Recognition
Qiulei Dong
Hong Wang
Qiulei Dong
106
0
0
14 Jul 2023
Improving Reference-based Distinctive Image Captioning with Contrastive
  Rewards
Improving Reference-based Distinctive Image Captioning with Contrastive Rewards
Yangjun Mao
Jun Xiao
Dong Zhang
Meng Cao
Jian Shao
Yueting Zhuang
Long Chen
EGVM
76
9
0
25 Jun 2023
Interactive and Explainable Region-guided Radiology Report Generation
Interactive and Explainable Region-guided Radiology Report Generation
Tim Tanida
Philip Muller
Georgios Kaissis
Daniel Rueckert
MedIm
148
117
0
17 Apr 2023
HiLo: Exploiting High Low Frequency Relations for Unbiased Panoptic
  Scene Graph Generation
HiLo: Exploiting High Low Frequency Relations for Unbiased Panoptic Scene Graph Generation
Zijian Zhou
Miaojing Shi
Holger Caesar
96
20
0
28 Mar 2023
Multi-modal reward for visual relationships-based image captioning
Multi-modal reward for visual relationships-based image captioning
Ali Abedi
Hossein Karshenas
Peyman Adibi
133
2
0
19 Mar 2023
Learning Combinatorial Prompts for Universal Controllable Image
  Captioning
Learning Combinatorial Prompts for Universal Controllable Image Captioning
Zhen Wang
Jun Xiao
Yueting Zhuang
Fei Gao
Jian Shao
Long Chen
112
5
0
11 Mar 2023
ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based
  Polishing
ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing
Zequn Zeng
Hao Zhang
Zhengjue Wang
Ruiying Lu
Dongsheng Wang
Bo Chen
BDLDiffM
61
33
0
04 Mar 2023
HGAN: Hierarchical Graph Alignment Network for Image-Text Retrieval
HGAN: Hierarchical Graph Alignment Network for Image-Text Retrieval
Jie Guo
Meiting Wang
Yan Zhou
Bin Song
Yuhao Chi
Wei-liang Fan
Jianglong Chang
78
16
0
16 Dec 2022
Controllable Image Captioning via Prompting
Controllable Image Captioning via Prompting
Ning Wang
Jiahao Xie
Jihao Wu
Mingbo Jia
Linlin Li
66
24
0
04 Dec 2022
SGDraw: Scene Graph Drawing Interface Using Object-Oriented
  Representation
SGDraw: Scene Graph Drawing Interface Using Object-Oriented Representation
Tianyu Zhang
Xu Du
Chia-Ming Chang
Xi Yang
H. Xie
65
0
0
30 Nov 2022
CLID: Controlled-Length Image Descriptions with Limited Data
CLID: Controlled-Length Image Descriptions with Limited Data
Elad Hirsch
A. Tal
VLM3DV
60
4
0
27 Nov 2022
Visual Semantic Parsing: From Images to Abstract Meaning Representation
Visual Semantic Parsing: From Images to Abstract Meaning Representation
M. A. Abdelsalam
Zhan Shi
Federico Fancellu
Kalliopi Basioti
Dhaivat Bhatt
Vladimir Pavlovic
Afsaneh Fazly
GNN
94
4
0
26 Oct 2022
Visual Spatial Description: Controlled Spatial-Oriented Image-to-Text
  Generation
Visual Spatial Description: Controlled Spatial-Oriented Image-to-Text Generation
Yu Zhao
Jianguo Wei
Zhichao Lin
Yueheng Sun
Meishan Zhang
Hao Fei
79
16
0
20 Oct 2022
What Should the System Do Next?: Operative Action Captioning for
  Estimating System Actions
What Should the System Do Next?: Operative Action Captioning for Estimating System Actions
Taiki Nakamura
Seiya Kawano
Akishige Yuguchi
Yasutomo Kawanishi
Koichiro Yoshino
114
0
0
06 Oct 2022
A Survey on Graph Neural Networks and Graph Transformers in Computer
  Vision: A Task-Oriented Perspective
A Survey on Graph Neural Networks and Graph Transformers in Computer Vision: A Task-Oriented Perspective
Chaoqi Chen
Yushuang Wu
Qiyuan Dai
Hong-Yu Zhou
Mutian Xu
Sibei Yang
Xiaoguang Han
Yizhou Yu
ViTMedImAI4CE
137
82
0
27 Sep 2022
Learning Distinct and Representative Styles for Image Captioning
Learning Distinct and Representative Styles for Image Captioning
Qi Chen
Chaorui Deng
Qi Wu
VLM
79
24
0
17 Sep 2022
Belief Revision based Caption Re-ranker with Visual Semantic Information
Belief Revision based Caption Re-ranker with Visual Semantic Information
Ahmed Sabir
Francesc Moreno-Noguer
Pranava Madhyastha
Lluís Padró
BDL
74
2
0
16 Sep 2022
Panoptic Scene Graph Generation
Panoptic Scene Graph Generation
Jingkang Yang
Yi Zhe Ang
Zujin Guo
Kaiyang Zhou
Wayne Zhang
Ziwei Liu
152
114
0
22 Jul 2022
Rethinking the Reference-based Distinctive Image Captioning
Rethinking the Reference-based Distinctive Image Captioning
Yangjun Mao
Long Chen
Zhihong Jiang
Dong Zhang
Zhimeng Zhang
Jian Shao
Jun Xiao
DiffM
89
22
0
22 Jul 2022
Improving Image Captioning with Control Signal of Sentence Quality
Improving Image Captioning with Control Signal of Sentence Quality
Zhangzi Zhu
Hong Qu
83
0
0
07 Jun 2022
Cross-modal Clinical Graph Transformer for Ophthalmic Report Generation
Cross-modal Clinical Graph Transformer for Ophthalmic Report Generation
Mingjie Li
Wenjia Cai
Karin Verspoor
Shirui Pan
Xiaodan Liang
Xiaojun Chang
MedIm
88
38
0
04 Jun 2022
12
Next