AttentionSmithy: A Modular Framework for Rapid Transformer Development and Customization
Caleb Cranney, Jesse G. Meyer
arXiv:2502.09503 · 13 February 2025
Papers citing "AttentionSmithy: A Modular Framework for Rapid Transformer Development and Customization" (14 of 14 papers shown)
A Review of Transformer-Based Models for Computer Vision Tasks: Capturing Global Context and Spatial Relationships
Gracile Astlin Pereira, Muhammad Hussain [ViT] · 59 / 9 / 0 · 27 Aug 2024

Exploring ChatGPT and its Impact on Society
Md. Asraful Haque, Shuai Li [SILM] · 68 / 27 / 0 · 21 Feb 2024

Transformer models: an introduction and catalog
X. Amatriain, Ananth Sankar, Jie Bing, Praveen Kumar Bodigutla, Timothy J. Hazen, Michaeel Kazi · 62 / 53 / 0 · 12 Feb 2023

Neural Architecture Search on Efficient Transformers and Beyond
Zexiang Liu, Dong Li, Kaiyue Lu, Zhen Qin, Weixuan Sun, Jiacheng Xu, Yiran Zhong · 57 / 20 / 0 · 28 Jul 2022

Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed, Hung-yi Lee, Lasse Borgholt, Jakob Drachmann Havtorn, Joakim Edin, ..., Shang-Wen Li, Karen Livescu, Lars Maaløe, Tara N. Sainath, Shinji Watanabe [SSL, AI4TS] · 181 / 365 / 0 · 21 May 2022

Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
Ofir Press, Noah A. Smith, M. Lewis · 324 / 759 / 0 · 27 Aug 2021

RoFormer: Enhanced Transformer with Rotary Position Embedding
Jianlin Su, Yu Lu, Shengfeng Pan, Ahmed Murtadha, Bo Wen, Yunfeng Liu · 275 / 2,453 / 0 · 20 Apr 2021

What Do Position Embeddings Learn? An Empirical Study of Pre-Trained Language Model Positional Encoding
Yu-An Wang, Yun-Nung Chen [SSL] · 46 / 95 / 0 · 10 Oct 2020

Big Bird: Transformers for Longer Sequences
Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Joshua Ainslie, Chris Alberti, ..., Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed [VLM] · 543 / 2,086 / 0 · 28 Jul 2020

Linformer: Self-Attention with Linear Complexity
Sinong Wang, Belinda Z. Li, Madian Khabsa, Han Fang, Hao Ma · 210 / 1,706 / 0 · 08 Jun 2020

Longformer: The Long-Document Transformer
Iz Beltagy, Matthew E. Peters, Arman Cohan [RALM, VLM] · 168 / 4,071 / 0 · 10 Apr 2020

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel, Noam M. Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu [AIMat] · 419 / 20,181 / 0 · 23 Oct 2019

Self-Attention with Relative Position Representations
Peter Shaw, Jakob Uszkoreit, Ashish Vaswani · 174 / 2,290 / 0 · 06 Mar 2018

Attention Is All You Need
Ashish Vaswani, Noam M. Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Gomez, Lukasz Kaiser, Illia Polosukhin [3DV] · 701 / 131,652 / 0 · 12 Jun 2017