A Survey of Transformers

8 June 2021

Tianyang Lin

Yuxin Wang

Xiangyang Liu

Xipeng Qiu

ViT

Papers citing "A Survey of Transformers"

50 / 347 papers shown

Title
Towards smaller, faster decoder-only transformers: Architectural variants and their implications Sathya Krishnan Suresh P. Shunmugapriya 24 0 0 22 Apr 2024
Explainable Generative AI (GenXAI): A Survey, Conceptualization, and Research Agenda Johannes Schneider 83 26 0 15 Apr 2024
Diffusion-RWKV: Scaling RWKV-Like Architectures for Diffusion Models Zhengcong Fei Mingyuan Fan Changqian Yu Debang Li Junshi Huang 40 24 0 06 Apr 2024
Vision Transformers in Domain Adaptation and Generalization: A Study of Robustness Shadi Alijani Jamil Fayyad H. Najjaran OOD 32 1 0 05 Apr 2024
FlightScope: A Deep Comprehensive Assessment of Aircraft Detection Algorithms in Satellite Imagery Safouane El Ghazouali Arnaud Gucciardi Nicola Venturi Michael Rueegsegger Umberto Michelucci 20 1 0 03 Apr 2024
Optimizing the Deployment of Tiny Transformers on Low-Power MCUs Victor J. B. Jung Alessio Burrello Moritz Scherer Francesco Conti Luca Benini 30 4 0 03 Apr 2024
Using Large Language Models to Understand Telecom Standards Athanasios Karapantelakis Mukesh Shakur Alexandros Nikou Farnaz Moradi Christian Orlog Fitsum Gaim Henrik Holm Doumitrou Daniil Nimara Vincent Huang 33 13 0 02 Apr 2024
On permutation-invariant neural networks Masanari Kimura Ryotaro Shimizu Yuki Hirakawa Ryosuke Goto Yuki Saito OOD AAML 41 12 0 26 Mar 2024
VCR-Graphormer: A Mini-batch Graph Transformer via Virtual Connections Dongqi Fu Zhigang Hua Yan Xie Jin Fang Si Zhang Kaan Sancak Hao Wu Andrey Malevich Jingrui He Bo Long 42 19 0 24 Mar 2024
Training point-based deep learning networks for forest segmentation with synthetic data Francisco Raverta Capua Juan Schandin Pablo De Cristoforis 3DPC 38 3 0 21 Mar 2024
Multi-Dimensional Machine Translation Evaluation: Model Evaluation and Resource for Korean Dojun Park Sebastian Padó 37 1 0 19 Mar 2024
Securing Large Language Models: Threats, Vulnerabilities and Responsible Practices Sara Abdali Richard Anarfi C. Barberan Jia He PILM 70 24 0 19 Mar 2024
From Explainable to Interpretable Deep Learning for Natural Language Processing in Healthcare: How Far from Reality? Guangming Huang Yingya Li Shoaib Jameel Yunfei Long G. Papanastasiou 36 16 0 18 Mar 2024
Crystalformer: Infinitely Connected Attention for Periodic Structure Encoding Tatsunori Taniai Ryo Igarashi Yuta Suzuki Naoya Chiba Kotaro Saito Yoshitaka Ushiku K. Ono 46 8 0 18 Mar 2024
VIHE: Virtual In-Hand Eye Transformer for 3D Robotic Manipulation Weiyao Wang Yutian Lei Shiyu Jin Gregory D. Hager Liangjun Zhang 36 2 0 18 Mar 2024
TaylorShift: Shifting the Complexity of Self-Attention from Squared to Linear (and Back) using Taylor-Softmax Tobias Christian Nauen Sebastián M. Palacio Andreas Dengel 54 3 0 05 Mar 2024
NiNformer: A Network in Network Transformer with Token Mixing Generated Gating Function Abdullah Nazhat Abdullah Tarkan Aydin 39 0 0 04 Mar 2024
Boosting gets full Attention for Relational Learning Mathieu Guillame-Bert Richard Nock LMTD 33 0 0 22 Feb 2024
MC-DBN: A Deep Belief Network-Based Model for Modality Completion Zihong Luo Zheng Tao Yuxuan Huang Kexin He Chengzhi Liu 11 2 0 15 Feb 2024
Model Compression and Efficient Inference for Large Language Models: A Survey Wenxiao Wang Wei Chen Yicong Luo Yongliu Long Zhengkai Lin Liye Zhang Binbin Lin Deng Cai Xiaofei He MQ 41 48 0 15 Feb 2024
Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference Harry Dong Xinyu Yang Zhenyu (Allen) Zhang Zhangyang Wang Yuejie Chi Beidi Chen 32 49 0 14 Feb 2024
Depth-aware Volume Attention for Texture-less Stereo Matching Tong Zhao Mingyu Ding Wei Zhan Masayoshi Tomizuka Y. Wei 3DV MDE 36 4 0 14 Feb 2024
Spiking-PhysFormer: Camera-Based Remote Photoplethysmography with Parallel Spike-driven Transformer Mingxuan Liu Jiankai Tang Haoxiang Li Jiahao Qi Siwei Li Kegang Wang Yuntao Wang Hong Chen 91 14 0 07 Feb 2024
Swin-UMamba: Mamba-based UNet with ImageNet-based pretraining Jiarun Liu Hao Yang Hong-Yu Zhou Yan Xi Lequan Yu ... Yong Liang Guangming Shi Shaoting Zhang Hairong Zheng Shanshan Wang Mamba 53 143 0 05 Feb 2024
Exploring the Synergies of Hybrid CNNs and ViTs Architectures for Computer Vision: A survey Haruna Yunusa Shiyin Qin Abdulrahman Hamman Adama Chukkol Abdulganiyu Abdu Yusuf Isah Bello A. Lawan ViT 43 13 0 05 Feb 2024
CascadedGaze: Efficiency in Global Context Extraction for Image Restoration Amirhosein Ghasemabadi Muhammad Kamran Janjua Mohammad Salameh Chunhua Zhou Fengyu Sun Di Niu 35 11 0 26 Jan 2024
Do deep neural networks utilize the weight space efficiently? Onur Can Koyun B. U. Toreyin 18 0 0 26 Jan 2024
Evaluation of LLM Chatbots for OSINT-based Cyber Threat Awareness Samaneh Shafee A. Bessani Pedro M. Ferreira 31 19 0 26 Jan 2024
RefreshNet: Learning Multiscale Dynamics through Hierarchical Refreshing Junaid Farooq Danish Rafiq Pantelis R. Vlachas M. A. Bazaz 29 0 0 24 Jan 2024
Towards Trustable Language Models: Investigating Information Quality of Large Language Models Rick Rejeleene Xiaowei Xu John R. Talburt HILM 26 2 0 23 Jan 2024
SymTC: A Symbiotic Transformer-CNN Net for Instance Segmentation of Lumbar Spine MRI Jiasong Chen Linchen Qian Linhai Ma Timur Urakov Weiyong Gu Liang Liang MedIm 39 4 0 17 Jan 2024
Representation Learning of Multivariate Time Series using Attention and Adversarial Training Leon Scharwächter Sebastian Otte OOD AI4TS 22 0 0 03 Jan 2024
Boosting Transformer's Robustness and Efficacy in PPG Signal Artifact Detection with Self-Supervised Learning Thanh-Dung Le 34 1 0 02 Jan 2024
Early and Accurate Detection of Tomato Leaf Diseases Using TomFormer Asim Khan Umair Nawaz K. Lochan Lakmal D. Seneviratne Irfan Hussain MedIm 25 4 0 26 Dec 2023
The Media Bias Taxonomy: A Systematic Literature Review on the Forms and Automated Detection of Media Bias Timo Spinde Smilla Hinterreiter Fabian Haak Terry Ruas Helge Giese Norman Meuschke Bela Gipp 19 12 0 26 Dec 2023
A Heterogeneous Chiplet Architecture for Accelerating End-to-End Transformer Models Harsh Sharma Pratyush Dhingra J. Doppa Ümit Y. Ogras P. Pande 34 7 0 18 Dec 2023
A mathematical perspective on Transformers Borjan Geshkovski Cyril Letrouit Yury Polyanskiy Philippe Rigollet EDL AI4CE 42 36 0 17 Dec 2023
Empowering ChatGPT-Like Large-Scale Language Models with Local Knowledge Base for Industrial Prognostics and Health Management Huan Wang Yan-Fu Li Min Xie LM&MA AI4MH AI4CE 19 4 0 06 Dec 2023
The Landscape of Modern Machine Learning: A Review of Machine, Distributed and Federated Learning Omer Subasi Oceane Bel Joseph Manzano Kevin J. Barker FedML OOD PINN 25 2 0 05 Dec 2023
Exploring the Temperature-Dependent Phase Transition in Modern Hopfield Networks Felix Koulischer Cédric Goemaere Tom van der Meersch Johannes Deleu Thomas Demeester 26 0 0 30 Nov 2023
Improving the Robustness of Transformer-based Large Language Models with Dynamic Attention Lujia Shen Yuwen Pu Shouling Ji Changjiang Li Xuhong Zhang Chunpeng Ge Ting Wang AAML 26 3 0 29 Nov 2023
An Empirical Investigation into Benchmarking Model Multiplicity for Trustworthy Machine Learning: A Case Study on Image Classification Prakhar Ganesh 41 5 0 24 Nov 2023
Bitformer: An efficient Transformer with bitwise operation-based attention for Big Data Analytics at low-cost low-precision devices Gaoxiang Duan Junkai Zhang Xiaoying Zheng Yongxin Zhu 36 2 0 22 Nov 2023
Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey Yunpeng Huang Jingwei Xu Junyu Lai Zixu Jiang Taolue Chen ... Xiaoxing Ma Lijuan Yang Zhou Xin Shupeng Li Penghao Zhao LLMAG KELM 36 54 0 21 Nov 2023
Efficient Neural Networks for Tiny Machine Learning: A Comprehensive Review M. Lê Pierre Wolinski Julyan Arbel 34 8 0 20 Nov 2023
Advancements in Generative AI: A Comprehensive Review of GANs, GPT, Autoencoders, Diffusion Model, and Transformers Staphord Bengesi Hoda El-Sayed Md Kamruzzaman Sarker Yao Houkpati John Irungu T. Oladunni 50 74 0 17 Nov 2023
To Transformers and Beyond: Large Language Models for the Genome Micaela Elisa Consens Cameron Dufault Michael Wainberg Duncan Forster Mehran Karimzadeh Hani Goodarzi Fabian J. Theis Alan Moses Bo Wang LM&MA MedIm 23 26 0 13 Nov 2023
Learning Human Action Recognition Representations Without Real Humans Howard Zhong Samarth Mishra Donghyun Kim SouYoung Jin Rameswar Panda Hildegard Kuehne Leonid Karlinsky Venkatesh Saligrama Aude Oliva Rogerio Feris 24 3 0 10 Nov 2023
UAV Trajectory Planning for AoI-Minimal Data Collection in UAV-Aided IoT Networks by Transformer Botao Zhu E. Bedeer Ha H. Nguyen Robert Barton Zhen Gao 35 79 0 08 Nov 2023
OmniVec: Learning robust representations with cross modal sharing Siddharth Srivastava Gaurav Sharma SSL 29 64 0 07 Nov 2023