Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs

1 March 2020
Shizhe Chen
Qin Jin
Peng Wang
Qi Wu
    DiffM

Papers citing "Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs"

50 / 86 papers shown
Detailed Object Description with Controllable Dimensions
Xinran Wang
Hao Zhang
Baoteng Li
Kongming Liang
Hao Sun
Zhongjiang He
Zejun Ma
Jun Guo
81
1
0
28 Nov 2024
ChartKG: A Knowledge-Graph-Based Representation for Chart Images
Zhiguang Zhou
Haoxuan Wang
Zhengqing Zhao
Fengling Zheng
Yongheng Wang
Wei Chen
Yong Wang
37
0
0
13 Oct 2024
TROPE: TRaining-Free Object-Part Enhancement for Seamlessly Improving Fine-Grained Zero-Shot Image Captioning
Joshua Forster Feinglass
Yezhou Yang
38
0
0
30 Sep 2024
Pixels to Prose: Understanding the art of Image Captioning
Hrishikesh Singh
Aarti Sharma
Millie Pant
3DV
VLM
30
0
0
28 Aug 2024
Fine-grained length controllable video captioning with ordinal embeddings
Tomoya Nitta
Takumi Fukuzawa
Toru Tamaki
48
0
0
27 Aug 2024
Ensemble Predicate Decoding for Unbiased Scene Graph Generation
Jiasong Feng
Lichun Wang
Hongbo Xu
Kai Xu
Baocai Yin
50
0
0
26 Aug 2024
Surveying the Landscape of Image Captioning Evaluation: A Comprehensive Taxonomy, Trends and Metrics Analysis
Uri Berger
Gabriel Stanovsky
Omri Abend
Lea Frermann
45
0
0
09 Aug 2024
OpenPSG: Open-set Panoptic Scene Graph Generation via Large Multimodal Models
Zijian Zhou
Zheng Zhu
Holger Caesar
Miaojing Shi
VLM
43
2
0
15 Jul 2024
ACTRESS: Active Retraining for Semi-supervised Visual Grounding
Weitai Kang
Mengxue Qu
Yunchao Wei
Yan Yan
46
6
0
03 Jul 2024
Visual Grounding with Attention-Driven Constraint Balancing
Weitai Kang
Luowei Zhou
Junyi Wu
Changchang Sun
Yan Yan
45
4
0
03 Jul 2024
SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding
Weitai Kang
Gaowen Liu
Mubarak Shah
Yan Yan
ObjD
41
9
0
03 Jul 2024
A look under the hood of the Interactive Deep Learning Enterprise (No-IDLE)
Daniel Sonntag
Michael Barz
Thiago S. Gouvêa
VLM
52
4
0
27 Jun 2024
Exploring the Distinctiveness and Fidelity of the Descriptions Generated by Large Vision-Language Models
Yuhang Huang
Zihan Wu
Chongyang Gao
Jiawei Peng
Xu Yang
45
2
0
26 Apr 2024
Cycle-Consistency Learning for Captioning and Grounding
Ning Wang
Jiajun Deng
Mingbo Jia
ObjD
45
7
0
23 Dec 2023
Leveraging VLM-Based Pipelines to Annotate 3D Objects
Rishabh Kabra
Loic Matthey
Alexander Lerchner
Niloy J. Mitra
34
6
0
29 Nov 2023
VLPrompt: Vision-Language Prompting for Panoptic Scene Graph Generation
Zijian Zhou
Miaojing Shi
Holger Caesar
VLM
42
12
0
27 Nov 2023
Generating Human-Centric Visual Cues for Human-Object Interaction Detection via Large Vision-Language Models
Yu-Wei Zhan
Fan Liu
Xin Luo
Liqiang Nie
Xin-Shun Xu
Mohan S. Kankanhalli
VLM
40
2
0
26 Nov 2023
Expanding Scene Graph Boundaries: Fully Open-vocabulary Scene Graph Generation via Visual-Concept Alignment and Retention
Zuyao Chen
Jinlin Wu
Zhen Lei
Zhaoxiang Zhang
Changwen Chen
37
11
0
18 Nov 2023
Improving Image Captioning via Predicting Structured Concepts
Ting Wang
Weidong Chen
Yuanhe Tian
Yan Song
Zhendong Mao
42
8
0
14 Nov 2023
Act As You Wish: Fine-Grained Control of Motion Diffusion Model with Hierarchical Semantic Graphs
Peng Jin
Yang Wu
Yanbo Fan
Zhongqian Sun
Yang Wei
Li-ming Yuan
DiffM
36
28
0
02 Nov 2023
Towards Complex-query Referring Image Segmentation: A Novel Benchmark
Wei Ji
Li Li
Marco Pleines
Xiangyan Liu
Xu Yang
Juncheng Billy Li
Roger Zimmermann
37
8
0
29 Sep 2023
Zero-Shot Scene Graph Generation via Triplet Calibration and Reduction
Jiankai Li
Yunhong Wang
Weixin Li
38
2
0
07 Sep 2023
Head-Tail Cooperative Learning Network for Unbiased Scene Graph Generation
Lei Wang
Zejian Yuan
Yao Lu
Badong Chen
40
0
0
23 Aug 2023
Explore and Tell: Embodied Visual Captioning in 3D Environments
Anwen Hu
Shizhe Chen
Liang Zhang
Qin Jin
LM&Ro
45
2
0
21 Aug 2023
Constructing Holistic Spatio-Temporal Scene Graph for Video Semantic Role Labeling
Yu Zhao
Hao Fei
Yixin Cao
Bobo Li
Meishan Zhang
Jianguo Wei
Hao Fei
Tat-Seng Chua
29
13
0
09 Aug 2023
Environment-Invariant Curriculum Relation Learning for Fine-Grained Scene Graph Generation
Yu Min
Aming Wu
Cheng Deng
35
6
0
07 Aug 2023
Panoptic Scene Graph Generation with Semantics-Prototype Learning
Li Li
Wei Ji
Yiming Wu
Meng Li
Youxuan Qin
Lina Wei
Roger Zimmermann
39
35
0
28 Jul 2023
Pair then Relation: Pair-Net for Panoptic Scene Graph Generation
Jinghao Wang
Zhengyu Wen
Xiangtai Li
Zujin Guo
Jingkang Yang
Ziwei Liu
51
17
0
17 Jul 2023
Complementary Frequency-Varying Awareness Network for Open-Set Fine-Grained Image Recognition
Qiulei Dong
Hong Wang
Qiulei Dong
30
0
0
14 Jul 2023
Improving Reference-based Distinctive Image Captioning with Contrastive Rewards
Yangjun Mao
Jun Xiao
Dong Zhang
Meng Cao
Jian Shao
Yueting Zhuang
Long Chen
EGVM
32
9
0
25 Jun 2023
Interactive and Explainable Region-guided Radiology Report Generation
Tim Tanida
Philip Muller
Georgios Kaissis
Daniel Rueckert
MedIm
43
110
0
17 Apr 2023
HiLo: Exploiting High Low Frequency Relations for Unbiased Panoptic Scene Graph Generation
Zijian Zhou
Miaojing Shi
Holger Caesar
34
18
0
28 Mar 2023
Multi-modal reward for visual relationships-based image captioning
Ali Abedi
Hossein Karshenas
Peyman Adibi
44
2
0
19 Mar 2023
Learning Combinatorial Prompts for Universal Controllable Image Captioning
Zhen Wang
Jun Xiao
Yueting Zhuang
Fei Gao
Jian Shao
Long Chen
60
5
0
11 Mar 2023
ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing
Zequn Zeng
Hao Zhang
Zhengjue Wang
Ruiying Lu
Dongsheng Wang
Bo Chen
BDL
DiffM
24
33
0
04 Mar 2023
HGAN: Hierarchical Graph Alignment Network for Image-Text Retrieval
Jie Guo
Meiting Wang
Yan Zhou
Bin Song
Yuhao Chi
Wei-liang Fan
Jianglong Chang
45
15
0
16 Dec 2022
Controllable Image Captioning via Prompting
Ning Wang
Jiahao Xie
Jihao Wu
Mingbo Jia
Linlin Li
24
23
0
04 Dec 2022
SGDraw: Scene Graph Drawing Interface Using Object-Oriented Representation
Tianyu Zhang
Xu Du
Chia-Ming Chang
Xi Yang
H. Xie
30
0
0
30 Nov 2022
CLID: Controlled-Length Image Descriptions with Limited Data
Elad Hirsch
A. Tal
VLM
3DV
22
4
0
27 Nov 2022
Visual Semantic Parsing: From Images to Abstract Meaning Representation
M. A. Abdelsalam
Zhan Shi
Federico Fancellu
Kalliopi Basioti
Dhaivat Bhatt
Vladimir Pavlovic
Afsaneh Fazly
GNN
44
4
0
26 Oct 2022
Visual Spatial Description: Controlled Spatial-Oriented Image-to-Text Generation
Yu Zhao
Jianguo Wei
Zhichao Lin
Yueheng Sun
Meishan Zhang
Hao Fei
27
16
0
20 Oct 2022
What Should the System Do Next?: Operative Action Captioning for Estimating System Actions
Taiki Nakamura
Seiya Kawano
Akishige Yuguchi
Yasutomo Kawanishi
Koichiro Yoshino
19
0
0
06 Oct 2022
A Survey on Graph Neural Networks and Graph Transformers in Computer Vision: A Task-Oriented Perspective
Chaoqi Chen
Yushuang Wu
Qiyuan Dai
Hong-Yu Zhou
Mutian Xu
Sibei Yang
Xiaoguang Han
Yizhou Yu
ViT
MedIm
AI4CE
32
74
0
27 Sep 2022
Learning Distinct and Representative Styles for Image Captioning
Qi Chen
Chaorui Deng
Qi Wu
VLM
45
23
0
17 Sep 2022
Belief Revision based Caption Re-ranker with Visual Semantic Information
Ahmed Sabir
Francesc Moreno-Noguer
Pranava Madhyastha
Lluís Padró
BDL
32
2
0
16 Sep 2022
Panoptic Scene Graph Generation
Jingkang Yang
Yi Zhe Ang
Zujin Guo
Kaiyang Zhou
Wayne Zhang
Ziwei Liu
54
106
0
22 Jul 2022
Rethinking the Reference-based Distinctive Image Captioning
Yangjun Mao
Long Chen
Zhihong Jiang
Dong Zhang
Zhimeng Zhang
Jian Shao
Jun Xiao
DiffM
30
22
0
22 Jul 2022
Improving Image Captioning with Control Signal of Sentence Quality
Zhangzi Zhu
Hong Qu
15
0
0
07 Jun 2022
Cross-modal Clinical Graph Transformer for Ophthalmic Report Generation
Mingjie Li
Wenjia Cai
Karin Verspoor
Shirui Pan
Xiaodan Liang
Xiaojun Chang
MedIm
41
35
0
04 Jun 2022
From Easy to Hard: Learning Language-guided Curriculum for Visual Question Answering on Remote Sensing Data
Zhenghang Yuan
Lichao Mou
Q. Wang
Xiao Xiang Zhu
27
62
0
06 May 2022