Object Relational Graph with Teacher-Recommended Learning for Video Captioning

26 February 2020
Ziqi Zhang
Yaya Shi
Chunfeng Yuan
Bing Li
Peijin Wang
Weiming Hu
Zhengjun Zha
    VLM

Papers citing "Object Relational Graph with Teacher-Recommended Learning for Video Captioning"

50 / 115 papers shown
Thinking Hallucination for Video Captioning
Nasib Ullah
Partha Pratim Mohanta
VLM
36
4
0
28 Sep 2022
A Survey on Graph Neural Networks and Graph Transformers in Computer Vision: A Task-Oriented Perspective
Chaoqi Chen
Yushuang Wu
Qiyuan Dai
Hong-Yu Zhou
Mutian Xu
Sibei Yang
Xiaoguang Han
Yizhou Yu
ViT
MedIm
AI4CE
27
73
0
27 Sep 2022
Distribution Aware Metrics for Conditional Natural Language Generation
David M. Chan
Yiming Ni
David A. Ross
Sudheendra Vijayanarasimhan
Austin Myers
John F. Canny
45
4
0
15 Sep 2022
Diverse Video Captioning by Adaptive Spatio-temporal Attention
Zohreh Ghaderi
Leonard Salewski
Hendrik P. A. Lensch
13
8
0
19 Aug 2022
Sports Video Analysis on Large-Scale Data
Dekun Wu
Henghui Zhao
Xingce Bao
Richard P. Wildes
21
13
0
09 Aug 2022
Rethinking Data Augmentation for Robust Visual Question Answering
Long Chen
Yuhang Zheng
Jun Xiao
OOD
27
42
0
18 Jul 2022
Uni-Perceiver-MoE: Learning Sparse Generalist Models with Conditional MoEs
Jinguo Zhu
Xizhou Zhu
Wenhai Wang
Xiaohua Wang
Hongsheng Li
Xiaogang Wang
Jifeng Dai
MoMe
MoE
21
66
0
09 Jun 2022
GIT: A Generative Image-to-text Transformer for Vision and Language
Jianfeng Wang
Zhengyuan Yang
Xiaowei Hu
Linjie Li
Kevin Qinghong Lin
Zhe Gan
Zicheng Liu
Ce Liu
Lijuan Wang
VLM
41
528
0
27 May 2022
A Survey on Long-Tailed Visual Recognition
Lu Yang
He Jiang
Q. Song
Jun Guo
13
123
0
27 May 2022
GL-RG: Global-Local Representation Granularity for Video Captioning
Liqi Yan
Qifan Wang
Yiming Cui
Fuli Feng
Xiaojun Quan
Xinming Zhang
Dongfang Liu
25
59
0
22 May 2022
Support-set based Multi-modal Representation Enhancement for Video Captioning
Xiaoya Chen
Jingkuan Song
Pengpeng Zeng
Lianli Gao
Hengtao Shen
24
4
0
19 May 2022
What's in a Caption? Dataset-Specific Linguistic Diversity and Its Effect on Visual Description Models and Metrics
David M. Chan
Austin Myers
Sudheendra Vijayanarasimhan
David A. Ross
Bryan Seybold
John F. Canny
28
6
0
12 May 2022
Tragedy Plus Time: Capturing Unintended Human Activities from Weakly-labeled Videos
Arnav Chakravarthy
Zhiyuan Fang
Yezhou Yang
32
2
0
28 Apr 2022
Self-Supervised Learning of Object Parts for Semantic Segmentation
A. Ziegler
Yuki M. Asano
SSL
OCL
26
101
0
27 Apr 2022
Video Captioning: a comparative review of where we are and which could be the route
Daniela Moctezuma
Tania A. Ramirez-delreal
Guillermo Ruiz
Othón González-Chávez
21
11
0
12 Apr 2022
Deep Non-rigid Structure-from-Motion: A Sequence-to-Sequence Translation Perspective
Huizhong Deng
Tong Zhang
Yuchao Dai
Jiawei Shi
Yiran Zhong
Hongdong Li
28
7
0
10 Apr 2022
Learning Audio-Video Modalities from Image Captions
Arsha Nagrani
Paul Hongsuck Seo
Bryan Seybold
Anja Hauth
Santiago Manén
Chen Sun
Cordelia Schmid
CLIP
16
82
0
01 Apr 2022
CREATE: A Benchmark for Chinese Short Video Retrieval and Title Generation
Ziqi Zhang
Yuxin Chen
Zongyang Ma
Zhongang Qi
Chunfeng Yuan
Bing Li
Ying Shan
Weiming Hu
VGen
26
8
0
31 Mar 2022
Visual Abductive Reasoning
Chen Liang
Wenguan Wang
Tianfei Zhou
Yi Yang
LRM
26
38
0
26 Mar 2022
ABN: Agent-Aware Boundary Networks for Temporal Action Proposal Generation
Khoa T. Vo
Kashu Yamazaki
Sang Truong
M. Tran
Akihiro Sugimoto
Ngan Le
EgoV
21
9
0
16 Mar 2022
RCL: Recurrent Continuous Localization for Temporal Action Detection
Qiang Wang
Yanhao Zhang
Yun Zheng
Pan Pan
ObjD
29
38
0
14 Mar 2022
Taking an Emotional Look at Video Paragraph Captioning
Qinyu Li
Tengpeng Li
Hanli Wang
Changan Chen
19
4
0
12 Mar 2022
End-to-end Generative Pretraining for Multimodal Video Captioning
Paul Hongsuck Seo
Arsha Nagrani
Anurag Arnab
Cordelia Schmid
27
164
0
20 Jan 2022
Cross-modal Contrastive Distillation for Instructional Activity Anticipation
Zhengyuan Yang
Jingen Liu
Jing-ling Huang
Xiaodong He
Tao Mei
Chenliang Xu
Jiebo Luo
31
6
0
18 Jan 2022
Boosting Video Representation Learning with Multi-Faceted Integration
Zhaofan Qiu
Ting Yao
Chong-Wah Ngo
Xiaoping Zhang
Dong Wu
Tao Mei
31
8
0
11 Jan 2022
Synchronized Audio-Visual Frames with Fractional Positional Encoding for Transformers in Video-to-Text Translation
Philipp Harzig
Moritz Einfalt
Rainer Lienhart
ViT
34
2
0
28 Dec 2021
CoCo-BERT: Improving Video-Language Pre-training with Contrastive Cross-modal Matching and Denoising
Jianjie Luo
Yehao Li
Yingwei Pan
Ting Yao
Hongyang Chao
Tao Mei
VLM
18
41
0
14 Dec 2021
Uni-Perceiver: Pre-training Unified Architecture for Generic Perception for Zero-shot and Few-shot Tasks
Xizhou Zhu
Jinguo Zhu
Hao Li
Xiaoshi Wu
Xiaogang Wang
Hongsheng Li
Xiaohua Wang
Jifeng Dai
53
129
0
02 Dec 2021
CLIP Meets Video Captioning: Concept-Aware Representation Learning Does Matter
Bang-ju Yang
Tong Zhang
Yuexian Zou
CLIP
25
20
0
30 Nov 2021
SwinBERT: End-to-End Transformers with Sparse Attention for Video Captioning
Kevin Qinghong Lin
Linjie Li
Chung-Ching Lin
Faisal Ahmed
Zhe Gan
Zicheng Liu
Yumao Lu
Lijuan Wang
ViT
19
235
0
25 Nov 2021
Hierarchical Modular Network for Video Captioning
Hanhua Ye
Guorong Li
Yuankai Qi
Shuhui Wang
Qingming Huang
Ming-Hsuan Yang
19
67
0
24 Nov 2021
DVCFlow: Modeling Information Flow Towards Human-like Video Captioning
Xu Yan
Zhengcong Fei
Shuhui Wang
Qingming Huang
Qi Tian
VGen
40
4
0
19 Nov 2021
EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching
Yaya Shi
Xu Yang
Haiyang Xu
Chunfeng Yuan
Bing Li
Weiming Hu
Zhengjun Zha
39
33
0
17 Nov 2021
Co-segmentation Inspired Attention Module for Video-based Computer Vision Tasks
Arulkumar Subramaniam
Jayesh Vaidya
Muhammed Ameen
Athira M. Nambiar
Anurag Mittal
19
7
0
14 Nov 2021
CLIP4Caption: CLIP for Video Caption
Mingkang Tang
Zhanyu Wang
Zhenhua Liu
Fengyun Rao
Dian Li
Xiu Li
CLIP
VLM
35
149
0
13 Oct 2021
Sensor-Augmented Egocentric-Video Captioning with Dynamic Modal Attention
Katsuyuki Nakamura
Hiroki Ohashi
Mitsuhiro Okada
EgoV
31
12
0
07 Sep 2021
Self-Supervised Visual Representations Learning by Contrastive Mask Prediction
Yucheng Zhao
Guangting Wang
Chong Luo
Wenjun Zeng
Zhengjun Zha
ISeg
SSL
27
46
0
18 Aug 2021
Learning Conditional Knowledge Distillation for Degraded-Reference Image Quality Assessment
Heliang Zheng
Huan Yang
Jianlong Fu
Zhengjun Zha
Jiebo Luo
23
41
0
18 Aug 2021
Cross-Modal Graph with Meta Concepts for Video Captioning
Hao Wang
Guosheng Lin
S. Hoi
C. Miao
20
6
0
14 Aug 2021
Joint Inductive and Transductive Learning for Video Object Segmentation
Yunyao Mao
Ning Wang
Wen-gang Zhou
Houqiang Li
VOS
19
98
0
08 Aug 2021
Discriminative Latent Semantic Graph for Video Captioning
Yang Bai
Junyan Wang
Yang Long
Bingzhang Hu
Yang Song
M. Pagnucco
Yu Guan
46
31
0
08 Aug 2021
O2NA: An Object-Oriented Non-Autoregressive Approach for Controllable Video Captioning
Fenglin Liu
Xuancheng Ren
Xian Wu
Bang-ju Yang
Shen Ge
Yuexian Zou
Xu Sun
21
32
0
05 Aug 2021
Exploring Sequence Feature Alignment for Domain Adaptive Detection Transformers
Wen Wang
Yang Cao
Jing Zhang
Fengxiang He
Zhengjun Zha
Yonggang Wen
Dacheng Tao
ViT
29
94
0
27 Jul 2021
Boosting Video Captioning with Dynamic Loss Network
Nasib Ullah
Partha Pratim Mohanta
22
1
0
25 Jul 2021
Disentangle Your Dense Object Detector
Zehui Chen
Chenhongyi Yang
Qiaofei Li
Feng Zhao
Zhengjun Zha
Feng Wu
3DV
22
147
0
07 Jul 2021
DnS: Distill-and-Select for Efficient and Accurate Video Indexing and Retrieval
Giorgos Kordopatis-Zilos
Christos Tzelepis
Symeon Papadopoulos
I. Kompatsiaris
Ioannis Patras
27
33
0
24 Jun 2021
Towards Diverse Paragraph Captioning for Untrimmed Videos
Yuqing Song
Shizhe Chen
Qin Jin
13
37
0
30 May 2021
TransVG: End-to-End Visual Grounding with Transformers
Jiajun Deng
Zhengyuan Yang
Tianlang Chen
Wen-gang Zhou
Houqiang Li
ViT
28
330
0
17 Apr 2021
A Comprehensive Review of the Video-to-Text Problem
Jesus Perez-Martin
B. Bustos
S. Guimarães
I. Sipiran
Jorge A. Pérez
Grethel Coello Said
13
17
0
27 Mar 2021
Open-book Video Captioning with Retrieve-Copy-Generate Network
Ziqi Zhang
Zhongang Qi
C. Yuan
Ying Shan
Bing Li
Ying Deng
Weiming Hu
28
92
0
09 Mar 2021