ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1411.4952
  4. Cited By
From Captions to Visual Concepts and Back

From Captions to Visual Concepts and Back

18 November 2014
Hao Fang
Saurabh Gupta
F. Iandola
R. Srivastava
Li Deng
Piotr Dollár
Jianfeng Gao
Xiaodong He
Margaret Mitchell
John C. Platt
C. L. Zitnick
Geoffrey Zweig
    VLM
ArXivPDFHTML

Papers citing "From Captions to Visual Concepts and Back"

50 / 213 papers shown
Title
On the Value of Cross-Modal Misalignment in Multimodal Representation Learning
On the Value of Cross-Modal Misalignment in Multimodal Representation Learning
Yichao Cai
Yuhang Liu
Erdun Gao
Tianjiao Jiang
Zhen Zhang
Anton van den Hengel
Javen Qinfeng Shi
62
0
0
14 Apr 2025
Lie Detector: Unified Backdoor Detection via Cross-Examination Framework
Lie Detector: Unified Backdoor Detection via Cross-Examination Framework
Xiaobei Wang
Siyuan Liang
Dongping Liao
Han Fang
Aishan Liu
Xiaochun Cao
Yu-liang Lu
E. Chang
X. Gao
AAML
50
1
0
21 Mar 2025
A Language Anchor-Guided Method for Robust Noisy Domain Generalization
A Language Anchor-Guided Method for Robust Noisy Domain Generalization
Zilin Dai
Lehong Wang
Fangzhou Lin
Yidong Wang
Zhigang Li
Kazunori D Yamada
Ziming Zhang
Wang Lu
152
0
0
21 Mar 2025
THRONE: An Object-based Hallucination Benchmark for the Free-form Generations of Large Vision-Language Models
THRONE: An Object-based Hallucination Benchmark for the Free-form Generations of Large Vision-Language Models
Prannay Kaul
Zhizhong Li
Hao Yang
Yonatan Dukler
Ashwin Swaminathan
C. Taylor
Stefano Soatto
HILM
63
16
0
08 May 2024
Large Language Models: A Survey
Large Language Models: A Survey
Shervin Minaee
Tomáš Mikolov
Narjes Nikzad
M. Asgari-Chenaghlu
R. Socher
Xavier Amatriain
Jianfeng Gao
ALM
LM&MA
ELM
134
371
0
09 Feb 2024
See, Say, and Segment: Teaching LMMs to Overcome False Premises
See, Say, and Segment: Teaching LMMs to Overcome False Premises
Tsung-Han Wu
Giscard Biamby
David M. Chan
Lisa Dunlap
Ritwik Gupta
Xudong Wang
Joseph E. Gonzalez
Trevor Darrell
VLM
MLLM
39
18
0
13 Dec 2023
Reverse Stable Diffusion: What prompt was used to generate this image?
Reverse Stable Diffusion: What prompt was used to generate this image?
Florinel-Alin Croitoru
Vlad Hondru
Radu Tudor Ionescu
M. Shah
VLM
DiffM
42
6
0
02 Aug 2023
Guided Focal Stack Refinement Network for Light Field Salient Object
  Detection
Guided Focal Stack Refinement Network for Light Field Salient Object Detection
B. Yuan
Yao Jiang
Keren Fu
Qijun Zhao
34
9
0
09 May 2023
Towards Diverse Binary Segmentation via A Simple yet General Gated
  Network
Towards Diverse Binary Segmentation via A Simple yet General Gated Network
Xiaoqi Zhao
Youwei Pang
Lihe Zhang
Huchuan Lu
Lei Zhang
28
14
0
18 Mar 2023
Stacked Cross-modal Feature Consolidation Attention Networks for Image
  Captioning
Stacked Cross-modal Feature Consolidation Attention Networks for Image Captioning
Mozhgan Pourkeshavarz
Shahabedin Nabavi
Mohsen Moghaddam
M. Shamsfard
31
4
0
08 Feb 2023
Using Multiple Instance Learning to Build Multimodal Representations
Using Multiple Instance Learning to Build Multimodal Representations
Peiqi Wang
W. Wells
Seth Berkowitz
Steven Horng
Polina Golland
SSL
24
6
0
11 Dec 2022
Weakly Supervised Face Naming with Symmetry-Enhanced Contrastive Loss
Weakly Supervised Face Naming with Symmetry-Enhanced Contrastive Loss
Tingyu Qu
Tinne Tuytelaars
Marie-Francine Moens
CVBM
21
4
0
17 Oct 2022
Cross-modal Semantic Enhanced Interaction for Image-Sentence Retrieval
Cross-modal Semantic Enhanced Interaction for Image-Sentence Retrieval
Xuri Ge
Fuhai Chen
Songpei Xu
Fuxiang Tao
J. Jose
30
26
0
17 Oct 2022
Word to Sentence Visual Semantic Similarity for Caption Generation:
  Lessons Learned
Word to Sentence Visual Semantic Similarity for Caption Generation: Lessons Learned
Ahmed Sabir
19
0
0
26 Sep 2022
Belief Revision based Caption Re-ranker with Visual Semantic Information
Belief Revision based Caption Re-ranker with Visual Semantic Information
Ahmed Sabir
Francesc Moreno-Noguer
Pranava Madhyastha
Lluís Padró
BDL
29
2
0
16 Sep 2022
Every picture tells a story: Image-grounded controllable stylistic story
  generation
Every picture tells a story: Image-grounded controllable stylistic story generation
Holy Lovenia
Bryan Wilie
Romain Barraud
Samuel Cahyawijaya
Willy Chung
Pascale Fung
26
8
0
04 Sep 2022
CODER: Coupled Diversity-Sensitive Momentum Contrastive Learning for
  Image-Text Retrieval
CODER: Coupled Diversity-Sensitive Momentum Contrastive Learning for Image-Text Retrieval
Haoran Wang
Dongliang He
Wenhao Wu
Boyang Xia
Min Yang
Fu Li
YunLong Yu
Zhong Ji
Errui Ding
Jingdong Wang
30
23
0
21 Aug 2022
Image Captioning based on Feature Refinement and Reflective Decoding
Image Captioning based on Feature Refinement and Reflective Decoding
G. Alabduljabbar
Hafida Benhidour
Said Kerrache
3DV
22
3
0
16 Jun 2022
Comprehending and Ordering Semantics for Image Captioning
Comprehending and Ordering Semantics for Image Captioning
Yehao Li
Yingwei Pan
Ting Yao
Tao Mei
26
88
0
14 Jun 2022
SelfReformer: Self-Refined Network with Transformer for Salient Object
  Detection
SelfReformer: Self-Refined Network with Transformer for Salient Object Detection
Y. Yun
Weisi Lin
ViT
60
28
0
23 May 2022
ACORT: A Compact Object Relation Transformer for Parameter Efficient
  Image Captioning
ACORT: A Compact Object Relation Transformer for Parameter Efficient Image Captioning
J. Tan
Y. Tan
C. Chan
Joon Huang Chuah
VLM
ViT
29
15
0
11 Feb 2022
Deep Learning Approaches on Image Captioning: A Review
Deep Learning Approaches on Image Captioning: A Review
Taraneh Ghandi
H. Pourreza
H. Mahyar
VLM
19
89
0
31 Jan 2022
Scaling Open-Vocabulary Image Segmentation with Image-Level Labels
Scaling Open-Vocabulary Image Segmentation with Image-Level Labels
Golnaz Ghiasi
Xiuye Gu
Huayu Chen
Nayeon Lee
VLM
41
371
0
22 Dec 2021
Syntax Customized Video Captioning by Imitating Exemplar Sentences
Syntax Customized Video Captioning by Imitating Exemplar Sentences
Yitian Yuan
Lin Ma
Wenwu Zhu
22
6
0
02 Dec 2021
Neural Attention for Image Captioning: Review of Outstanding Methods
Neural Attention for Image Captioning: Review of Outstanding Methods
Zanyar Zohourianshahzadi
Jugal Kalita
VLM
32
45
0
29 Nov 2021
R$^3$Net:Relation-embedded Representation Reconstruction Network for
  Change Captioning
R3^33Net:Relation-embedded Representation Reconstruction Network for Change Captioning
Yunbin Tu
Liang Li
C. Yan
Shengxiang Gao
Zhengtao Yu
30
22
0
20 Oct 2021
Improving Joint Learning of Chest X-Ray and Radiology Report by Word
  Region Alignment
Improving Joint Learning of Chest X-Ray and Radiology Report by Word Region Alignment
Zhanghexuan Ji
Mohammad Abuzar Shaikh
Dana Moukheiber
S. Srihari
Yifan Peng
Mingchen Gao
SSL
16
20
0
04 Sep 2021
From Show to Tell: A Survey on Deep Learning-based Image Captioning
From Show to Tell: A Survey on Deep Learning-based Image Captioning
Matteo Stefanini
Marcella Cornia
Lorenzo Baraldi
S. Cascianelli
G. Fiameni
Rita Cucchiara
3DV
VLM
MLLM
67
254
0
14 Jul 2021
UMIC: An Unreferenced Metric for Image Captioning via Contrastive
  Learning
UMIC: An Unreferenced Metric for Image Captioning via Contrastive Learning
Hwanhee Lee
Seunghyun Yoon
Franck Dernoncourt
Trung Bui
Kyomin Jung
VLM
21
44
0
26 Jun 2021
Visual Question Rewriting for Increasing Response Rate
Visual Question Rewriting for Increasing Response Rate
Jiayi Wei
Xilian Li
Yi Zhang
Xin Eric Wang
28
2
0
04 Jun 2021
Longer Version for "Deep Context-Encoding Network for Retinal Image
  Captioning"
Longer Version for "Deep Context-Encoding Network for Retinal Image Captioning"
Jia-Hong Huang
Ting-Wei Wu
Chao-Han Huck Yang
M. Worring
MedIm
20
28
0
30 May 2021
Recursive Contour Saliency Blending Network for Accurate Salient Object
  Detection
Recursive Contour Saliency Blending Network for Accurate Salient Object Detection
Y. Yun
Takahiro Tsubono
48
58
0
28 May 2021
CAGAN: Text-To-Image Generation with Combined Attention GANs
CAGAN: Text-To-Image Generation with Combined Attention GANs
Henning Schulze
Dogucan Yaman
Alexander Waibel
GAN
29
3
0
26 Apr 2021
MobileSal: Extremely Efficient RGB-D Salient Object Detection
MobileSal: Extremely Efficient RGB-D Salient Object Detection
Yu-Huan Wu
Yun-Hai Liu
Jun Xu
Jiawang Bian
Yuchao Gu
Ming-Ming Cheng
31
104
0
24 Dec 2020
Image Captioning with Context-Aware Auxiliary Guidance
Image Captioning with Context-Aware Auxiliary Guidance
Zeliang Song
Xiaofei Zhou
Zhendong Mao
Jianlong Tan
36
31
0
10 Dec 2020
MAF: Multimodal Alignment Framework for Weakly-Supervised Phrase
  Grounding
MAF: Multimodal Alignment Framework for Weakly-Supervised Phrase Grounding
Qinxin Wang
Hao Tan
Sheng Shen
Michael W. Mahoney
Z. Yao
ObjD
47
11
0
12 Oct 2020
Teacher-Critical Training Strategies for Image Captioning
Teacher-Critical Training Strategies for Image Captioning
Yiqing Huang
Jiansheng Chen
VLM
29
8
0
30 Sep 2020
Where is the Model Looking At?--Concentrate and Explain the Network
  Attention
Where is the Model Looking At?--Concentrate and Explain the Network Attention
Wenjia Xu
Jiuniu Wang
Yang Wang
Guangluan Xu
Wei Dai
Yirong Wu
XAI
29
17
0
29 Sep 2020
Efficient Urdu Caption Generation using Attention based LSTM
Efficient Urdu Caption Generation using Attention based LSTM
Inaam Ilahi
Hafiz Muhammad Abdullah Zia
Ahtazaz Ehsan
Rauf Tabassam
Armaghan Ahmed
VLM
16
2
0
02 Aug 2020
Suppress and Balance: A Simple Gated Network for Salient Object
  Detection
Suppress and Balance: A Simple Gated Network for Salient Object Detection
Xiaoqi Zhao
Youwei Pang
Lihe Zhang
Huchuan Lu
Lei Zhang
19
414
0
16 Jul 2020
A Single Stream Network for Robust and Real-time RGB-D Salient Object
  Detection
A Single Stream Network for Robust and Real-time RGB-D Salient Object Detection
Xiaoqi Zhao
Lihe Zhang
Youwei Pang
Huchuan Lu
Lei Zhang
ObjD
3DH
MDE
30
192
0
14 Jul 2020
Improving Image Captioning with Better Use of Captions
Improving Image Captioning with Better Use of Captions
Zhan Shi
Xu Zhou
Xipeng Qiu
Xiao-Dan Zhu
30
122
0
21 Jun 2020
Mitigating Gender Bias in Captioning Systems
Mitigating Gender Bias in Captioning Systems
Ruixiang Tang
Mengnan Du
Yuening Li
Zirui Liu
Na Zou
Xia Hu
FaML
11
64
0
15 Jun 2020
Multimodal Generative Learning Utilizing Jensen-Shannon-Divergence
Multimodal Generative Learning Utilizing Jensen-Shannon-Divergence
Thomas M. Sutter
Imant Daunhawer
Julia E. Vogt
36
67
0
15 Jun 2020
BLEURT: Learning Robust Metrics for Text Generation
BLEURT: Learning Robust Metrics for Text Generation
Thibault Sellam
Dipanjan Das
Ankur P. Parikh
46
1,446
0
09 Apr 2020
Gaussian Smoothen Semantic Features (GSSF) -- Exploring the Linguistic
  Aspects of Visual Captioning in Indian Languages (Bengali) Using MSCOCO
  Framework
Gaussian Smoothen Semantic Features (GSSF) -- Exploring the Linguistic Aspects of Visual Captioning in Indian Languages (Bengali) Using MSCOCO Framework
C. Sur
27
7
0
16 Feb 2020
MRRC: Multiple Role Representation Crossover Interpretation for Image
  Captioning With R-CNN Feature Distribution Composition (FDC)
MRRC: Multiple Role Representation Crossover Interpretation for Image Captioning With R-CNN Feature Distribution Composition (FDC)
C. Sur
25
16
0
15 Feb 2020
Robust Explanations for Visual Question Answering
Robust Explanations for Visual Question Answering
Badri N. Patro
Shivansh Pate
Vinay P. Namboodiri
OOD
AAML
25
20
0
23 Jan 2020
Personalizing Fast-Forward Videos Based on Visual and Textual Features
  from Social Network
Personalizing Fast-Forward Videos Based on Visual and Textual Features from Social Network
W. Ramos
M. Silva
Edson Roteia Araujo Junior
Alan C. Neves
Erickson R. Nascimento
22
6
0
29 Dec 2019
Fast Image Caption Generation with Position Alignment
Fast Image Caption Generation with Position Alignment
Z. Fei
25
37
0
13 Dec 2019
12345
Next