Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2003.02436
Cited By
Talking-Heads Attention
5 March 2020
Noam M. Shazeer
Zhenzhong Lan
Youlong Cheng
Nan Ding
L. Hou
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Talking-Heads Attention"
16 / 16 papers shown
Title
Depth-Wise Convolutions in Vision Transformers for Efficient Training on Small Datasets
Tianxiao Zhang
Wenju Xu
Bo Luo
Guanghui Wang
ViT
MDE
40
7
0
28 Jul 2024
Finding Stakeholder-Material Information from 10-K Reports using Fine-Tuned BERT and LSTM Models
V. Z. Chen
24
0
0
15 Aug 2023
Semantic Feature Integration network for Fine-grained Visual Classification
Haibo Wang
Yueyang Li
Haichi Luo
42
0
0
13 Feb 2023
EIT: Enhanced Interactive Transformer
Tong Zheng
Bei Li
Huiwen Bao
Tong Xiao
Jingbo Zhu
29
2
0
20 Dec 2022
Rethinking Vision Transformers for MobileNet Size and Speed
Yanyu Li
Ju Hu
Yang Wen
Georgios Evangelidis
Kamyar Salahi
Yanzhi Wang
Sergey Tulyakov
Jian Ren
ViT
30
159
0
15 Dec 2022
BJTU-WeChat's Systems for the WMT22 Chat Translation Task
Yunlong Liang
Fandong Meng
Jinan Xu
Yufeng Chen
Jie Zhou
16
2
0
28 Nov 2022
FF2: A Feature Fusion Two-Stream Framework for Punctuation Restoration
Yangjun Wu
Kebin Fang
Yao Zhao
Hao Zhang
Lifeng Shi
Mengqi Zhang
34
0
0
09 Nov 2022
MiniViT: Compressing Vision Transformers with Weight Multiplexing
Jinnian Zhang
Houwen Peng
Kan Wu
Mengchen Liu
Bin Xiao
Jianlong Fu
Lu Yuan
ViT
23
123
0
14 Apr 2022
Streaming Transformer Transducer Based Speech Recognition Using Non-Causal Convolution
Yangyang Shi
Chunyang Wu
Dilin Wang
Alex Xiao
Jay Mahadeokar
...
Ke Li
Yuan Shangguan
Varun K. Nagaraja
Ozlem Kalinli
M. Seltzer
28
15
0
07 Oct 2021
WeChat Neural Machine Translation Systems for WMT21
Xianfeng Zeng
Yanjun Liu
Ernan Li
Qiu Ran
Fandong Meng
Peng Li
Jinan Xu
Jie Zhou
25
20
0
05 Aug 2021
MedGPT: Medical Concept Prediction from Clinical Narratives
Z. Kraljevic
Anthony Shek
D. Bean
R. Bendayan
J. Teo
Richard J. B. Dobson
LM&MA
AI4TS
MedIm
23
39
0
07 Jul 2021
A Survey of Transformers
Tianyang Lin
Yuxin Wang
Xiangyang Liu
Xipeng Qiu
ViT
32
1,087
0
08 Jun 2021
Vision Transformers with Patch Diversification
Chengyue Gong
Dilin Wang
Meng Li
Vikas Chandra
Qiang Liu
ViT
37
62
0
26 Apr 2021
Going deeper with Image Transformers
Hugo Touvron
Matthieu Cord
Alexandre Sablayrolles
Gabriel Synnaeve
Hervé Jégou
ViT
25
986
0
31 Mar 2021
Global Attention based Graph Convolutional Neural Networks for Improved Materials Property Prediction
Steph-Yves M. Louis
Yong Zhao
Alireza Nasiri
Xiran Wong
Yuqi Song
Fei Liu
Jianjun Hu
AI4CE
17
15
0
11 Mar 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,956
0
20 Apr 2018
1