Are Transformers universal approximators of sequence-to-sequence functions?
Chulhee Yun, Srinadh Bhojanapalli, A. S. Rawat, Sashank J. Reddi, Sanjiv Kumar
arXiv:1912.10077 · 20 December 2019
Papers citing "Are Transformers universal approximators of sequence-to-sequence functions?" (50 of 246 papers shown)
Exphormer: Sparse Transformers for Graphs · Hamed Shirzad, A. Velingker, B. Venkatachalam, Danica J. Sutherland, A. Sinop · 10 Mar 2023
How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding · Yuchen Li, Yuan-Fang Li, Andrej Risteski · 07 Mar 2023
Ultra-High-Resolution Detector Simulation with Intra-Event Aware GAN and Self-Supervised Relational Reasoning · H. Hashemi, Nikolai Hartmann, Sahand Sharifzadeh, James Kahn, T. Kuhr · 07 Mar 2023
Sampled Transformer for Point Sets · Shidi Li, Christian J. Walder, Alexander Soen, Lexing Xie, Miaomiao Liu · 3DPC · 28 Feb 2023
A Brief Survey on the Approximation Theory for Sequence Modelling · Hao Jiang, Qianxiao Li, Zhong Li, Shida Wang · AI4TS · 27 Feb 2023
Testing AI on language comprehension tasks reveals insensitivity to underlying meaning · Vittoria Dentella, Fritz Guenther, Elliot Murphy, G. Marcus, Evelina Leivada · ELM · 23 Feb 2023
Language Model Crossover: Variation through Few-Shot Prompting · Elliot Meyerson, M. Nelson, Herbie Bradley, Adam Gaier, Arash Moradi, Amy K. Hoover, Joel Lehman · VLM · 23 Feb 2023
A Theoretical Understanding of Shallow Vision Transformers: Learning, Generalization, and Sample Complexity · Hongkang Li, Ming Wang, Sijia Liu, Pin-Yu Chen · ViT, MLT · 12 Feb 2023
Knowledge Distillation in Vision Transformers: A Critical Review · Gousia Habib, Tausifa Jan Saleem, Brejesh Lall · 04 Feb 2023
REaLTabFormer: Generating Realistic Relational and Tabular Data using Transformers · Aivin V. Solatorio, Olivier Dupriez · LMTD · 04 Feb 2023
Attention Link: An Efficient Attention-Based Low Resource Machine Translation Architecture · Zeping Min · 01 Feb 2023
Looped Transformers as Programmable Computers · Angeliki Giannou, Shashank Rajput, Jy-yong Sohn, Kangwook Lee, Jason D. Lee, Dimitris Papailiopoulos · 30 Jan 2023
Tighter Bounds on the Expressivity of Transformer Encoders · David Chiang, Peter A. Cholak, A. Pillay · 25 Jan 2023
Improving Faithfulness of Abstractive Summarization by Controlling Confounding Effect of Irrelevant Sentences · Asish Ghoshal, Arash Einolghozati, A. Arun, Haoran Li, L. Yu, Vera Gor, Yashar Mehdad, Scott Yih, Asli Celikyilmaz · HILM · 19 Dec 2022
PulseImpute: A Novel Benchmark Task for Pulsative Physiological Signal Imputation · Maxwell A. Xu, Alexander Moreno, Supriya Nagesh, V. Aydemir, D. Wetter, Santosh Kumar, James M. Rehg · AI4TS · 14 Dec 2022
What learning algorithm is in-context learning? Investigations with linear models · Ekin Akyürek, Dale Schuurmans, Jacob Andreas, Tengyu Ma, Denny Zhou · 28 Nov 2022
Minimal Width for Universal Property of Deep RNN · Changhoon Song, Geonho Hwang, Jun ho Lee, Myung-joo Kang · 25 Nov 2022
Transformers meet Stochastic Block Models: Attention with Data-Adaptive Sparsity and Cost · Sungjun Cho, Seonwoo Min, Jinwoo Kim, Moontae Lee, Honglak Lee, Seunghoon Hong · 27 Oct 2022
Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models · Hong Liu, Sang Michael Xie, Zhiyuan Li, Tengyu Ma · AI4CE · 25 Oct 2022
Diffuser: Efficient Transformers with Multi-hop Attention Diffusion for Long Sequences · Aosong Feng, Irene Z Li, Yuang Jiang, Rex Ying · 21 Oct 2022
How Mask Matters: Towards Theoretical Understandings of Masked Autoencoders · Qi Zhang, Yifei Wang, Yisen Wang · 15 Oct 2022
Why self-attention is Natural for Sequence-to-Sequence Problems? A Perspective from Symmetries · Chao Ma, Lexing Ying · 13 Oct 2022
WavSpA: Wavelet Space Attention for Boosting Transformers' Long Sequence Learning Ability · Yufan Zhuang, Zihan Wang, Fangbo Tao, Jingbo Shang · ViT, AI4TS · 05 Oct 2022
Neural Integral Equations · E. Zappala, Antonio H. O. Fonseca, J. O. Caro, David van Dijk · 30 Sep 2022
On The Computational Complexity of Self-Attention · Feyza Duman Keles, Pruthuvi Maheshakya Wijewardena, C. Hegde · 11 Sep 2022
Recurrent Convolutional Neural Networks Learn Succinct Learning Algorithms · Surbhi Goel, Sham Kakade, Adam Tauman Kalai, Cyril Zhang · 01 Sep 2022
What Can Transformers Learn In-Context? A Case Study of Simple Function Classes · Shivam Garg, Dimitris Tsipras, Percy Liang, Gregory Valiant · 01 Aug 2022
Pure Transformers are Powerful Graph Learners · Jinwoo Kim, Tien Dat Nguyen, Seonwoo Min, Sungjun Cho, Moontae Lee, Honglak Lee, Seunghoon Hong · 06 Jul 2022
Learning Functions on Multiple Sets using Multi-Set Transformers · Kira A. Selby, Ahmad Rashid, I. Kobyzev, Mehdi Rezagholizadeh, Pascal Poupart · ViT · 30 Jun 2022
VReBERT: A Simple and Flexible Transformer for Visual Relationship Detection · Yunbo Cui, M. Farazi · ViT · 18 Jun 2022
Your Transformer May Not be as Powerful as You Expect · Shengjie Luo, Shanda Li, Shuxin Zheng, Tie-Yan Liu, Liwei Wang, Di He · 26 May 2022
Fix Bugs with Transformer through a Neural-Symbolic Edit Grammar · Yaojie Hu, Xingjian Shi, Qiang Zhou, Lee Pike · KELM · 13 Apr 2022
Formal Language Recognition by Hard Attention Transformers: Perspectives from Circuit Complexity · Sophie Hao, Dana Angluin, Robert Frank · 13 Apr 2022
Overcoming a Theoretical Limitation of Self-Attention · David Chiang, Peter A. Cholak · 24 Feb 2022
Attention Enables Zero Approximation Error · Zhiying Fang, Yidong Ouyang, Ding-Xuan Zhou, Guang Cheng · 24 Feb 2022
Spherical Transformer · Sungmin Cho, Raehyuk Jung, Junseok Kwon · ViT · 10 Feb 2022
How Expressive are Transformers in Spectral Domain for Graphs? · Anson Bastos, Abhishek Nadgeri, Kuldeep Singh, H. Kanezashi, Toyotaro Suzumura, I. Mulang' · 23 Jan 2022
Self-attention Presents Low-dimensional Knowledge Graph Embeddings for Link Prediction · Peyman Baghershahi, Reshad Hosseini, H. Moradi · 20 Dec 2021
Trees in transformers: a theoretical analysis of the Transformer's ability to represent trees · Qi He, João Sedoc, J. Rodu · 16 Dec 2021
Large Language Models are not Models of Natural Language: they are Corpus Models · Csaba Veres · 13 Dec 2021
Programming with Neural Surrogates of Programs · Alex Renda, Yi Ding, Michael Carbin · 12 Dec 2021
On the rate of convergence of a classifier based on a Transformer encoder · Iryna Gurevych, Michael Kohler, Gözde Gül Sahin · 29 Nov 2021
CpT: Convolutional Point Transformer for 3D Point Cloud Processing · Chaitanya Kaul, Joshua Mitton, H. Dai, Roderick Murray-Smith · 3DPC · 21 Nov 2021
Can Vision Transformers Perform Convolution? · Shanda Li, Xiangning Chen, Di He, Cho-Jui Hsieh · ViT · 02 Nov 2021
Transformers Generalize DeepSets and Can be Extended to Graphs and Hypergraphs · Jinwoo Kim, Saeyoon Oh, Seunghoon Hong · AI4CE · 27 Oct 2021
Sinkformers: Transformers with Doubly Stochastic Attention · Michael E. Sander, Pierre Ablin, Mathieu Blondel, Gabriel Peyré · 22 Oct 2021
Inductive Biases and Variable Creation in Self-Attention Mechanisms · Benjamin L. Edelman, Surbhi Goel, Sham Kakade, Cyril Zhang · 19 Oct 2021
Iterative Decoding for Compositional Generalization in Transformers · Luana Ruiz, Joshua Ainslie, Santiago Ontañón · 08 Oct 2021
Pretrained Language Models are Symbolic Mathematics Solvers too! · Kimia Noorbakhsh, Modar Sulaiman, M. Sharifi, Kallol Roy, Pooyan Jamshidi · LRM · 07 Oct 2021
Universal Approximation Under Constraints is Possible with Transformers · Anastasis Kratsios, Behnoosh Zamanlooy, Tianlin Liu, Ivan Dokmanić · 07 Oct 2021