ResearchTrend.AI
Attention Is All You Need

12 June 2017
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
    3DV

Papers citing "Attention Is All You Need"

50 / 2,193 papers shown
Guardians of the Agentic System: Preventing Many Shots Jailbreak with Agentic System
Saikat Barua
Mostafizur Rahman
Md Jafor Sadek
Rafiul Islam
Shehnaz Khaled
Ahmedul Kabir
LLMAG
141
1
0
23 Feb 2025
GS-TransUNet: Integrated 2D Gaussian Splatting and Transformer UNet for Accurate Skin Lesion Analysis
Anand Kumar
Kavinder Roghit Kanthen
Josna John
3DGS
167
0
0
23 Feb 2025
Energy-Efficient Transformer Inference: Optimization Strategies for Time Series Classification
Arshia Kermani
Ehsan Zeraatkar
Habib Irani
119
3
0
23 Feb 2025
TimePFN: Effective Multivariate Time Series Forecasting with Synthetic Data
Ege Onur Taga
M. E. Ildiz
Samet Oymak
AI4TS
127
3
0
22 Feb 2025
Single-Channel EEG Tokenization Through Time-Frequency Modeling
Jathurshan Pradeepkumar
Xihao Piao
Zheng Chen
Jimeng Sun
100
2
0
22 Feb 2025
Int2Int: a framework for mathematics with transformers
François Charton
ViT
147
0
0
22 Feb 2025
Large Language Model for Lossless Image Compression with Visual Prompts
Junhao Du
Chuqin Zhou
Ning Cao
Gang Chen
Yunuo Chen
Zhengxue Cheng
Li Song
Guo Lu
Wenjun Zhang
VLM
86
2
0
22 Feb 2025
Sparsity May Be All You Need: Sparse Random Parameter Adaptation
Jesus Rios
Pierre Dognin
Ronny Luss
Karthikeyan N. Ramamurthy
189
1
0
21 Feb 2025
Enhancing RWKV-based Language Models for Long-Sequence Text Generation
Xinghan Pan
119
0
0
21 Feb 2025
Surface Vision Mamba: Leveraging Bidirectional State Space Model for Efficient Spherical Manifold Representation
Rongzhao He
Weihao Zheng
Leilei Zhao
Ying Wang
Dalin Zhu
Dan Wu
Bin Hu
Mamba
150
0
0
21 Feb 2025
DReSD: Dense Retrieval for Speculative Decoding
Milan Gritta
Huiyin Xue
Gerasimos Lampouras
RALM
192
0
0
21 Feb 2025
Neural Attention Search
Difan Deng
Marius Lindauer
137
0
0
21 Feb 2025
Hyperspherical Normalization for Scalable Deep Reinforcement Learning
Hojoon Lee
Youngdo Lee
Takuma Seno
Donghu Kim
Peter Stone
Jaegul Choo
171
4
0
21 Feb 2025
Data Attribution for Text-to-Image Models by Unlearning Synthesized Images
Sheng-Yu Wang
Aaron Hertzmann
Alexei A. Efros
Jun-Yan Zhu
Richard Zhang
TDI
193
3
0
21 Feb 2025
MambaLiteSR: Image Super-Resolution with Low-Rank Mamba using Knowledge Distillation
Romina Aalishah
Mozhgan Navardi
T. Mohsenin
Mamba
135
0
0
21 Feb 2025
Multi-Agent Stock Prediction Systems: Machine Learning Models, Simulations, and Real-Time Trading Strategies
Daksh Dave
Gauransh Sawhney
Vikhyat Chauhan
AIFin
107
0
0
21 Feb 2025
Deterministic Reversible Data Augmentation for Neural Machine Translation
Jiashu Yao
Heyan Huang
Zeming Liu
Yuhang Guo
153
0
0
21 Feb 2025
Looped ReLU MLPs May Be All You Need as Practical Programmable Computers
Yingyu Liang
Zhizhou Sha
Zhenmei Shi
Zhao Song
Yufa Zhou
160
19
0
21 Feb 2025
Lightweight yet Efficient: An External Attentive Graph Convolutional Network with Positional Prompts for Sequential Recommendation
Jinyu Zhang
Chao Li
Zhongying Zhao
132
1
0
21 Feb 2025
Optimal word order for non-causal text generation with Large Language Models: the Spanish case
Andrea Busto-Castiñeira
Silvia García-Méndez
Francisco de Arriba-Pérez
Francisco J. González Castaño
71
0
0
21 Feb 2025
On Memorization in Diffusion Models
Xiangming Gu
Chao Du
Tianyu Pang
Chongxuan Li
Min Lin
Ye Wang
DiffM
TDI
337
55
0
21 Feb 2025
Utilizing Sequential Information of General Lab-test Results and Diagnoses History for Differential Diagnosis of Dementia
Yizong Xing
Dhita Putri Pratama
Yuke Wang
Yufan Zhang
Brian E. Chapman
109
0
0
21 Feb 2025
A Survey of Model Architectures in Information Retrieval
Zhichao Xu
Fengran Mo
Zhiqi Huang
Crystina Zhang
Puxuan Yu
Bei Wang
Jimmy J. Lin
Vivek Srikumar
KELM
3DV
162
2
0
21 Feb 2025
Repetition Neurons: How Do Language Models Produce Repetitions?
Tatsuya Hiraoka
Kentaro Inui
MILM
118
9
0
21 Feb 2025
YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection
Yuming Chen
Xinbin Yuan
Ruiqi Wu
Jiabao Wang
Qibin Hou
Ming-Ming Cheng
ObjD
273
52
0
21 Feb 2025
SegAug: CTC-Aligned Segmented Augmentation For Robust RNN-Transducer Based Speech Recognition
Khanh Le
Tuan Vu Ho
Dung Tran
Duc Thanh Chau
96
0
0
20 Feb 2025
Uncertainty Representations in State-Space Layers for Deep Reinforcement Learning under Partial Observability
Carlos E. Luis
A. Bottero
Julia Vinogradska
Felix Berkenkamp
Jan Peters
219
1
0
20 Feb 2025
NEAR: A Training-Free Pre-Estimator of Machine Learning Model Performance
Raphael T. Husistein
Markus Reiher
Marco Eckhoff
244
1
0
20 Feb 2025
Stacking as Accelerated Gradient Descent
Naman Agarwal
Pranjal Awasthi
Satyen Kale
Eric Zhao
ODL
123
3
0
20 Feb 2025
Synthetic Tabular Data Generation for Imbalanced Classification: The Surprising Effectiveness of an Overlap Class
Annie D'souza
Swetha M
Sunita Sarawagi
136
1
0
20 Feb 2025
Exploring Mutual Cross-Modal Attention for Context-Aware Human Affordance Generation
Prasun Roy
Saumik Bhattacharya
Subhankar Ghosh
Umapada Pal
Michael Blumenstein
107
0
0
20 Feb 2025
Myna: Masking-Based Contrastive Learning of Musical Representations
Ori Yonay
Tracy Hammond
Tianbao Yang
AAML
214
0
0
20 Feb 2025
Swarm Characteristics Classification Using Neural Networks
Donald W. Peltier
Isaac Kaminer
Abram H. Clark
Marko Orescanin
61
1
0
20 Feb 2025
LESA: Learnable LLM Layer Scaling-Up
Yifei Yang
Zouying Cao
Xinbei Ma
Yao Yao
L. Qin
Zhongfu Chen
Hai Zhao
146
0
0
20 Feb 2025
LabTOP: A Unified Model for Lab Test Outcome Prediction on Electronic Health Records
Sujeong Im
Jungwoo Oh
Edward Choi
BDL
LM&MA
92
0
0
20 Feb 2025
Synthetic generation of 2D data records based on Autoencoders
Darius Couchard
Oscar Olarte
Rob Haelterman
86
0
0
20 Feb 2025
Personalized Instance-based Navigation Toward User-Specific Objects in Realistic Environments
Luca Barsellotti
Roberto Bigazzi
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
205
1
0
20 Feb 2025
A Survey on Bridging EEG Signals and Generative AI: From Image and Text to Beyond
Shreya Shukla
Jose Torres
Abhijit Mishra
Jacek Gwizdka
Shounak Roychowdhury
102
0
0
20 Feb 2025
Animate Your Thoughts: Decoupled Reconstruction of Dynamic Natural Vision from Slow Brain Activity
Yizhuo Lu
Changde Du
Chong Wang
Xuanliu Zhu
Liuyun Jiang
Xujin Li
Huiguang He
VGen
222
4
0
20 Feb 2025
Autograding Mathematical Induction Proofs with Natural Language Processing
Chenyan Zhao
Mariana Silva
Seth Poulsen
AIMat
126
2
0
20 Feb 2025
Towards Physics-Guided Foundation Models
Majid Farhadloo
Arun Sharma
Mingzhou Yang
B. Jayaprakash
W. Northrop
Shashi Shekhar
AI4CE
83
0
0
20 Feb 2025
X-IL: Exploring the Design Space of Imitation Learning Policies
Xiaogang Jia
Atalay Donat
Xi Huang
Xuan Zhao
Denis Blessing
...
Han A. Wang
Hanyi Zhang
Qian Wang
Rudolf Lioutikov
Gerhard Neumann
141
1
0
20 Feb 2025
Tabular Embeddings for Tables with Bi-Dimensional Hierarchical Metadata and Nesting
Gyanendra Shrestha
Chutain Jiang
Sai Akula
Vivek Yannam
Anna Pyayt
Michael Gubanov
LMTD
143
0
0
20 Feb 2025
TabSD: Large Free-Form Table Question Answering with SQL-Based Table Decomposition
Yuxiang Wang
Junhao Gan
Jianzhong Qi
LMTD
140
0
0
20 Feb 2025
Event-Based Video Frame Interpolation With Cross-Modal Asymmetric Bidirectional Motion Fields
Taewoo Kim
Yujeong Chae
Hyun-Kurl Jang
Kuk-Jin Yoon
146
34
0
20 Feb 2025
Towards Active Participant Centric Vertical Federated Learning: Some Representations May Be All You Need
Jon Irureta
Jon Imaz
Aizea Lojo
Javier Fernandez-Marques
Marco González
Iñigo Perona
FedML
123
1
0
20 Feb 2025
BFA: Best-Feature-Aware Fusion for Multi-View Fine-grained Manipulation
Zihan Lan
Weixin Mao
Haoyang Li
Le Wang
Tiancai Wang
Haoqiang Fan
Osamu Yoshie
EgoV
105
2
0
20 Feb 2025
FairKV: Balancing Per-Head KV Cache for Fast Multi-GPU Inference
Bingzhe Zhao
Ke Cheng
Aomufei Yuan
Yuxuan Tian
Ruiguang Zhong
Chengchen Hu
Tong Yang
Lian Yu
111
0
0
19 Feb 2025
Quantifying Memorization and Parametric Response Rates in Retrieval-Augmented Vision-Language Models
Peter Carragher
Abhinand Jha
R Raghav
Kathleen M. Carley
RALM
116
0
0
19 Feb 2025
MoM: Linear Sequence Modeling with Mixture-of-Memories
Jusen Du
Weigao Sun
Disen Lan
Jiaxi Hu
Yu Cheng
KELM
145
5
0
19 Feb 2025