Are Transformers universal approximators of sequence-to-sequence functions? (arXiv:1912.10077)
20 December 2019
Chulhee Yun, Srinadh Bhojanapalli, A. S. Rawat, Sashank J. Reddi, Sanjiv Kumar
Papers citing "Are Transformers universal approximators of sequence-to-sequence functions?"
50 / 246 papers shown
Reasoning by Superposition: A Theoretical Perspective on Chain of Continuous Thought [OffRL, LRM] (18 May 2025)
Hanlin Zhu, Shibo Hao, Zhiting Hu, Jiantao Jiao, Stuart Russell, Yuandong Tian

Block-Biased Mamba for Long-Range Sequence Processing [Mamba] (13 May 2025)
Annan Yu, N. Benjamin Erichson

Transformers for Learning on Noisy and Task-Level Manifolds: Approximation and Generalization Insights (06 May 2025)
Zhaiming Shen, Alex Havrilla, Rongjie Lai, A. Cloninger, Wenjing Liao

Multimodal and Multiview Deep Fusion for Autonomous Marine Navigation [3DPC] (02 May 2025)
Dimitrios Dagdilelis, Panagiotis Grigoriadis, R. Galeazzi

Dual Filter: A Mathematical Framework for Inference using Transformer-like Architectures (01 May 2025)
Heng-Sheng Chang, P. Mehta

Embedding Empirical Distributions for Computing Optimal Transport Maps [OT] (24 Apr 2025)
Mingchen Jiang, Peng Xu, Xichen Ye, Xiaohui Chen, Yun Yang, Yifan Chen

Transformers Can Overcome the Curse of Dimensionality: A Theoretical Study from an Approximation Perspective (18 Apr 2025)
Yuling Jiao, Yanming Lai, Yang Wang, Bokai Yan

Approximation Bounds for Transformer Networks with Application to Regression (16 Apr 2025)
Yuling Jiao, Yanming Lai, Defeng Sun, Yang Wang, Bokai Yan

Concise One-Layer Transformers Can Do Function Evaluation (Sometimes) (28 Mar 2025)
Lena Strobl, Dana Angluin, Robert Frank

Theoretical Foundation of Flow-Based Time Series Generation: Provable Approximation, Generalization, and Efficiency [AI4TS] (18 Mar 2025)
Jiangxuan Long, Zhao Song, Chiwun Yang

Learning on LLM Output Signatures for gray-box LLM Behavior Analysis (18 Mar 2025)
Guy Bar-Shalom, Fabrizio Frasca, Derek Lim, Yoav Gelberg, Yftah Ziser, Ran El-Yaniv, Gal Chechik, Haggai Maron

When Do Transformers Outperform Feedforward and Recurrent Networks? A Statistical Perspective (14 Mar 2025)
Alireza Mousavi-Hosseini, Clayton Sanford, Denny Wu, Murat A. Erdogdu

Language Models, Graph Searching, and Supervision Adulteration: When More Supervision is Less and How to Make More More [LRM] (13 Mar 2025)
Arvid Frydenlund

Context-aware Biases for Length Extrapolation (11 Mar 2025)
Ali Veisi, Amir Mansourian

Towards Understanding Multi-Round Large Language Model Reasoning: Approximability, Learnability and Generalizability [LRM] (05 Mar 2025)
Chenhui Xu, Dancheng Liu, Jiajie Li, Amir Nassereldine, Zhaohui Li, Jinjun Xiong

Deep Causal Behavioral Policy Learning: Applications to Healthcare [CML] (05 Mar 2025)
Jonas Knecht, Anna Zink, Jonathan Kolstad, Maya Petersen

Depth-Width Tradeoffs in Algorithmic Reasoning of Graph Tasks with Transformers (03 Mar 2025)
Gilad Yehudai, Clayton Sanford, Maya Bechler-Speicher, Orr Fischer, Ran Gilad-Bachrach, Amir Globerson

Compositional Reasoning with Transformers, RNNs, and Chain of Thought [LRM] (03 Mar 2025)
Gilad Yehudai, Noah Amsel, Joan Bruna

Real-Time Personalization with Simple Transformers (01 Mar 2025)
Lin An, Andrew A. Li, Vaisnavi Nemala, Gabriel Visotsky

On the Robustness of Transformers against Context Hijacking for Linear Classification (24 Feb 2025)
Tianle Li, Chenyang Zhang, Xingwu Chen, Yuan Cao, Difan Zou

Looped ReLU MLPs May Be All You Need as Practical Programmable Computers (21 Feb 2025)
Yingyu Liang, Zhizhou Sha, Zhenmei Shi, Zhao Song, Yufa Zhou

Cross-Domain Continual Learning for Edge Intelligence in Wireless ISAC Networks (18 Feb 2025)
Jingzhi Hu, Xin Li, Zhou Su, Jun Luo

Solving Empirical Bayes via Transformers (17 Feb 2025)
Anzo Teh, Mark Jabbour, Yury Polyanskiy

Transformers versus the EM Algorithm in Multi-class Clustering (09 Feb 2025)
Yihan He, Hong-Yu Chen, Yuan Cao, Jianqing Fan, Han Liu

Exact Sequence Classification with Hardmax Transformers (04 Feb 2025)
Albert Alcalde, Giovanni Fantuzzi, Enrique Zuazua

Distribution Transformers: Fast Approximate Bayesian Inference With On-The-Fly Prior Adaptation (04 Feb 2025)
George Whittle, Juliusz Ziomek, Jacob Rawling, Michael A. Osborne

Emergent Stack Representations in Modeling Counter Languages Using Transformers (03 Feb 2025)
Utkarsh Tiwari, Aviral Gupta, Michael Hahn

Video Latent Flow Matching: Optimal Polynomial Projections for Video Interpolation and Extrapolation [VGen] (01 Feb 2025)
Yang Cao, Zhao Song, Chiwun Yang

Token Democracy: The Architectural Limits of Alignment in Transformer-Based Language Models (28 Jan 2025)
Robin Young

Are Transformers Able to Reason by Connecting Separated Knowledge in Training Data? [LRM, ReLM] (27 Jan 2025)
Yutong Yin, Zhaoran Wang

Approximation Rate of the Transformer Architecture for Sequence Modeling (03 Jan 2025)
Hao Jiang, Qianxiao Li

Learning Elementary Cellular Automata with Transformers (02 Dec 2024)
Mikhail Burtsev

Understanding Scaling Laws with Statistical and Approximation Theory for Transformer Neural Networks on Intrinsically Low-dimensional Data (11 Nov 2024)
Alex Havrilla, Wenjing Liao

How Numerical Precision Affects Mathematical Reasoning Capabilities of LLMs [LRM] (17 Oct 2024)
Guhao Feng, Kai-Bo Yang, Yuntian Gu, Xinyue Ai, Shengjie Luo, Jiacheng Sun, Di He, Zechao Li, Liwei Wang

TULIP: Token-length Upgraded CLIP [VLM] (13 Oct 2024)
Ivona Najdenkoska, Mohammad Mahdi Derakhshani, Yuki M. Asano, Nanne van Noord, Marcel Worring, Cees G. M. Snoek

Generalizable autoregressive modeling of time series through functional narratives [AI4TS] (10 Oct 2024)
Ran Liu, Wenrui Ma, Ellen L. Zippi, Hadi Pouransari, Jingyun Xiao, ..., Behrooz Mahasseni, Juri Minxha, Erdrin Azemi, Eva L. Dyer, Ali Moin

Identification of Mean-Field Dynamics using Transformers [AI4CE] (06 Oct 2024)
Shiba Biswal, Karthik Elamvazhuthi, Rishi Sonthalia

Towards Understanding the Universality of Transformers for Next-Token Prediction [CML] (03 Oct 2024)
Michael E. Sander, Gabriel Peyré

ENTP: Encoder-only Next Token Prediction (02 Oct 2024)
Ethan Ewer, Daewon Chae, Thomas Zeng, Jinkyu Kim, Kangwook Lee

On Expressive Power of Looped Transformers: Theoretical Analysis and Enhancement via Timestep Encoding (02 Oct 2024)
Kevin Xu, Issei Sato

Efficient Data Subset Selection to Generalize Training Across Models: Transductive and Inductive Networks (18 Sep 2024)
Eeshaan Jain, Tushar Nandy, Gaurav Aggarwal, Ashish Tendulkar, Rishabh K. Iyer, A. De

D2Vformer: A Flexible Time Series Prediction Model Based on Time Position Embedding [AI4TS] (17 Sep 2024)
Xiaobao Song, Hao Wang, Liwei Deng, Yuxin He, Wenming Cao, Chi-Sing Leung

Differentially Private Kernel Density Estimation (03 Sep 2024)
Erzhi Liu, Jerry Yao-Chieh Hu, Alex Reneau, Zhao Song, Han Liu

Universal Approximation of Operators with Transformers and Neural Integral Operators (01 Sep 2024)
E. Zappala, Maryam Bagherian

Partial-Multivariate Model for Forecasting [AI4TS] (19 Aug 2024)
Jaehoon Lee, Hankook Lee, Sungik Choi, Sungjun Cho, Moontae Lee

Sampling Foundational Transformer: A Theoretical Perspective (11 Aug 2024)
Viet Anh Nguyen, Minh Lenhat, Khoa Nguyen, Duong Duc Hieu, Dao Huu Hung, Truong-Son Hy

Transformers are Universal In-context Learners (02 Aug 2024)
Takashi Furuya, Maarten V. de Hoop, Gabriel Peyré

What Are Good Positional Encodings for Directed Graphs? (30 Jul 2024)
Yinan Huang, Haoyu Wang, Pan Li

Relating the Seemingly Unrelated: Principled Understanding of Generalization for Generative Models in Arithmetic Reasoning Tasks [LRM] (25 Jul 2024)
Xingcheng Xu, Zibo Zhao, Haipeng Zhang, Yanqing Yang

Transformers on Markov Data: Constant Depth Suffices (25 Jul 2024)
Nived Rajaraman, Marco Bondaschi, Kannan Ramchandran, Michael C. Gastpar, Ashok Vardhan Makkuva