Transformers Are Universally Consistent

30 May 2025
Sagar Ghosh, Kushal Bose, Swagatam Das
ArXiv · PDF · HTML

Papers citing "Transformers Are Universally Consistent"

24 papers shown

On the Universal Statistical Consistency of Expansive Hyperbolic Deep Convolutional Neural Networks
Sagar Ghosh, Kushal Bose, Swagatam Das
Citations: 1 · 15 Nov 2024

A Survey of Controllable Text Generation using Transformer-based Pre-trained Language Models
Hanqing Zhang, Haolin Song, Shaoyu Li, Ming Zhou, Dawei Song
Citations: 219 · 14 Jan 2022

Universal Consistency of Deep Convolutional Neural Networks
Shao-Bo Lin, Kaidong Wang, Yao Wang, Ding-Xuan Zhou
Citations: 23 · 23 Jun 2021

A Survey of Transformers
Tianyang Lin, Yuxin Wang, Xiangyang Liu, Xipeng Qiu
ViT · Citations: 1,101 · 08 Jun 2021

Decision Transformer: Reinforcement Learning via Sequence Modeling
Lili Chen, Kevin Lu, Aravind Rajeswaran, Kimin Lee, Aditya Grover, Michael Laskin, Pieter Abbeel, A. Srinivas, Igor Mordatch
OffRL · Citations: 1,608 · 02 Jun 2021

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, ..., Matthias Minderer, G. Heigold, Sylvain Gelly, Jakob Uszkoreit, N. Houlsby
ViT · Citations: 40,217 · 22 Oct 2020

Big Bird: Transformers for Longer Sequences
Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Joshua Ainslie, Chris Alberti, ..., Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed
VLM · Citations: 2,051 · 28 Jul 2020

Conformer: Convolution-augmented Transformer for Speech Recognition
Anmol Gulati, James Qin, Chung-Cheng Chiu, Niki Parmar, Yu Zhang, ..., Wei Han, Shibo Wang, Zhengdong Zhang, Yonghui Wu, Ruoming Pang
Citations: 3,082 · 16 May 2020

Longformer: The Long-Document Transformer
Iz Beltagy, Matthew E. Peters, Arman Cohan
RALM, VLM · Citations: 3,996 · 10 Apr 2020

Sparse Sinkhorn Attention
Yi Tay, Dara Bahri, Liu Yang, Donald Metzler, Da-Cheng Juan
Citations: 336 · 26 Feb 2020

Are Transformers universal approximators of sequence-to-sequence functions?
Chulhee Yun, Srinadh Bhojanapalli, A. S. Rawat, Sashank J. Reddi, Sanjiv Kumar
Citations: 347 · 20 Dec 2019

Axial Attention in Multidimensional Transformers
Jonathan Ho, Nal Kalchbrenner, Dirk Weissenborn, Tim Salimans
Citations: 525 · 20 Dec 2019

Improving Multi-Head Attention with Capsule Networks
Shuhao Gu, Yang Feng
Citations: 13 · 31 Aug 2019

What Does BERT Look At? An Analysis of BERT's Attention
Kevin Clark, Urvashi Khandelwal, Omer Levy, Christopher D. Manning
MILM · Citations: 1,586 · 11 Jun 2019

Learning Deep Transformer Models for Machine Translation
Qiang Wang, Bei Li, Tong Xiao, Jingbo Zhu, Changliang Li, Derek F. Wong, Lidia S. Chao
Citations: 666 · 05 Jun 2019

Generating Long Sequences with Sparse Transformers
R. Child, Scott Gray, Alec Radford, Ilya Sutskever
Citations: 1,880 · 23 Apr 2019

Universal approximations of permutation invariant/equivariant functions by deep neural networks
Akiyoshi Sannai, Yuuki Takai, Matthieu Cordonnier
Citations: 68 · 05 Mar 2019

Star-Transformer
Qipeng Guo, Xipeng Qiu, Pengfei Liu, Yunfan Shao, Xiangyang Xue, Zheng Zhang
Citations: 264 · 25 Feb 2019

Molecular Transformer - A Model for Uncertainty-Calibrated Chemical Reaction Prediction
P. Schwaller, Teodoro Laino, John McGuinness, A. Horváth, Constantine Bekas, A. Lee
Citations: 730 · 06 Nov 2018

ResNet with one-neuron hidden layers is a Universal Approximator
Hongzhou Lin, Stefanie Jegelka
Citations: 227 · 28 Jun 2018

Universality of Deep Convolutional Neural Networks
Ding-Xuan Zhou
HAI, PINN · Citations: 514 · 28 May 2018

The Expressive Power of Neural Networks: A View from the Width
Zhou Lu, Hongming Pu, Feicheng Wang, Zhiqiang Hu, Liwei Wang
Citations: 886 · 08 Sep 2017

Attention Is All You Need
Ashish Vaswani, Noam M. Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Gomez, Lukasz Kaiser, Illia Polosukhin
3DV · Citations: 129,831 · 12 Jun 2017

Nearly-tight VC-dimension and pseudodimension bounds for piecewise linear neural networks
Peter L. Bartlett, Nick Harvey, Christopher Liaw, Abbas Mehrabian
Citations: 427 · 08 Mar 2017