SACT: Self-Aware Multi-Space Feature Composition Transformer for Multinomial Attention for Video Captioning

25 June 2020
C. Sur

Papers citing "SACT: Self-Aware Multi-Space Feature Composition Transformer for Multinomial Attention for Video Captioning"

25 / 25 papers shown
The Evolved Transformer
David R. So
Chen Liang
Quoc V. Le
ViT
66
461
0
30 Jan 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
753
93,936
0
11 Oct 2018
End-to-End Dense Video Captioning with Masked Transformer
Luowei Zhou
Yingbo Zhou
Jason J. Corso
R. Socher
Caiming Xiong
59
527
0
03 Apr 2018
ActivityNet Challenge 2017 Summary
Guohao Li
Juan Carlos Niebles
Cees G. M. Snoek
Fabian Caba Heilbron
Humam Alwassel
Ranjay Krishna
Victor Escorcia
Kenji Hata
S. Buch
61
48
0
22 Oct 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
304
129,831
0
12 Jun 2017
A Deep Reinforced Model for Abstractive Summarization
Romain Paulus
Caiming Xiong
R. Socher
AI4TS
118
1,551
0
11 May 2017
Dense-Captioning Events in Videos
Ranjay Krishna
Kenji Hata
F. Ren
Li Fei-Fei
Juan Carlos Niebles
113
1,225
0
02 May 2017
Towards Automatic Learning of Procedures from Web Instructional Videos
Luowei Zhou
Chenliang Xu
Jason J. Corso
EgoV
49
812
0
28 Mar 2017
TURN TAP: Temporal Unit Regression Network for Temporal Action Proposals
J. Gao
Zhenheng Yang
Chen Sun
Kan Chen
Ram Nevatia
ViT
AI4TS
38
461
0
17 Mar 2017
A Structured Self-attentive Sentence Embedding
Zhouhan Lin
Minwei Feng
Cicero Nogueira dos Santos
Mo Yu
Bing Xiang
Bowen Zhou
Yoshua Bengio
87
2,132
0
09 Mar 2017
Semantic Compositional Networks for Visual Captioning
Zhe Gan
Chuang Gan
Xiaodong He
Yunchen Pu
Kenneth Tran
Jianfeng Gao
Lawrence Carin
Li Deng
CoGe
66
426
0
23 Nov 2016
Video Captioning with Transferred Semantic Attributes
Yingwei Pan
Ting Yao
Houqiang Li
Tao Mei
38
328
0
23 Nov 2016
Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge
Oriol Vinyals
Alexander Toshev
Samy Bengio
D. Erhan
54
852
0
21 Sep 2016
Video Summarization with Long Short-term Memory
Ke Zhang
Wei-Lun Chao
Fei Sha
Kristen Grauman
53
684
0
26 May 2016
Video Paragraph Captioning Using Hierarchical Recurrent Neural Networks
Haonan Yu
Jiang Wang
Zhiheng Huang
Yi Yang
Wenyuan Xu
62
560
0
26 Oct 2015
A Multi-scale Multiple Instance Video Description Network
Huijuan Xu
Subhashini Venugopalan
Vasili Ramanishka
Marcus Rohrbach
Kate Saenko
42
64
0
21 May 2015
Sequence to Sequence -- Video to Text
Subhashini Venugopalan
Marcus Rohrbach
Jeff Donahue
Raymond J. Mooney
Trevor Darrell
Kate Saenko
82
1,417
0
03 May 2015
Describing Videos by Exploiting Temporal Structure
L. Yao
Atousa Torabi
Kyunghyun Cho
Nicolas Ballas
C. Pal
Hugo Larochelle
Aaron Courville
105
1,063
0
27 Feb 2015
Translating Videos to Natural Language Using Deep Recurrent Neural Networks
Subhashini Venugopalan
Huijuan Xu
Jeff Donahue
Marcus Rohrbach
Raymond J. Mooney
Kate Saenko
74
951
0
15 Dec 2014
Show and Tell: A Neural Image Caption Generator
Oriol Vinyals
Alexander Toshev
Samy Bengio
D. Erhan
3DV
173
6,009
0
17 Nov 2014
Sequence to Sequence Learning with Neural Networks
Ilya Sutskever
Oriol Vinyals
Quoc V. Le
AIMat
240
20,467
0
10 Sep 2014
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan
Andrew Zisserman
FAtt
MDE
655
99,991
0
04 Sep 2014
Neural Machine Translation by Jointly Learning to Align and Translate
Dzmitry Bahdanau
Kyunghyun Cho
Yoshua Bengio
AIMat
313
27,205
0
01 Sep 2014
Video In Sentences Out
Andrei Barbu
Alexander Bridge
Zachary Burchill
D. Coroian
Sven J. Dickinson
...
Jarrell W. Waggoner
Song Wang
Jinlian Wei
Yifan Yin
Zhiqi Zhang
23
155
0
09 Aug 2014
Coherent Multi-Sentence Video Description with Variable Level of Detail
Anna Rohrbach
Marcus Rohrbach
Weijian Qiu
Annemarie Friedrich
Sikandar Amin
Mykhaylo Andriluka
Manfred Pinkal
Bernt Schiele
40
217
0
24 Mar 2014