ResearchTrend.AI
Attention Is All You Need

12 June 2017
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
    3DV

Papers citing "Attention Is All You Need"

50 / 27,180 papers shown
Detecting Hard-Coded Credentials in Software Repositories via LLMs
Chidera Biringa
Gökhan Kul
23
0
0
16 Jun 2025
Prefix-Tuning+: Modernizing Prefix-Tuning by Decoupling the Prefix from Attention
Haonan Wang
Brian K Chen
Siquan Li
Xinhe Liang
Hwee Kuan Lee
Kenji Kawaguchi
Tianyang Hu
21
0
0
16 Jun 2025
Gated Rotary-Enhanced Linear Attention for Long-term Sequential Recommendation
Juntao Hu
Wei Zhou
Huayi Shen
Xiao Du
Jie Liao
Junhao Wen
Min Gao
20
0
0
16 Jun 2025
Forecast-Then-Optimize Deep Learning Methods
Jinhang Jiang
Nan Wu
Ben Liu
Mei Feng
Xin Ji
Karthik Srinivasan
AI4TS
17
0
0
16 Jun 2025
Mixture of Weight-shared Heterogeneous Group Attention Experts for Dynamic Token-wise KV Optimization
Guanghui Song
Dongping Liao
Yiren Zhao
Kejiang Ye
Cheng-zhong Xu
X. Gao
MoE
14
0
0
16 Jun 2025
Symmetry in Neural Network Parameter Spaces
Bo Zhao
Robin Walters
Rose Yu
22
0
0
16 Jun 2025
HELENA: High-Efficiency Learning-based channel Estimation using dual Neural Attention
Miguel Camelo Botero
Esra Aycan Beyazit
Nina Slamnik-Kriještorac
Johann M. Marquez-Barja
7
0
0
16 Jun 2025
Action Dubber: Timing Audible Actions via Inflectional Flow
Wenlong Wan
Weiying Zheng
Tianyi Xiang
Guiqing Li
Shengfeng He
22
0
0
16 Jun 2025
Dynamic Acoustic Model Architecture Optimization in Training for ASR
Jingjing Xu
Zijian Yang
Albert Zeyer
Eugen Beck
Ralf Schlueter
Hermann Ney
9
0
0
16 Jun 2025
HierVL: Semi-Supervised Segmentation leveraging Hierarchical Vision-Language Synergy with Dynamic Text-Spatial Query Alignment
Numair Nadeem
Saeed Anwar
Muhammad Asad
Abdul Bais
VLM
22
0
0
16 Jun 2025
Equitable Electronic Health Record Prediction with FAME: Fairness-Aware Multimodal Embedding
Nikkie Hooman
Zhongjie Wu
Eric C. Larson
Mehak Gupta
20
0
0
16 Jun 2025
EmbodiedPlace: Learning Mixture-of-Features with Embodied Constraints for Visual Place Recognition
Bingxi Liu
Hao Chen
Shiyi Guo
Yihong Wu
Jinqiang Cui
Hong Zhang
7
0
0
16 Jun 2025
Mitigating loss of variance in ensemble data assimilation: machine learning-based and distance-free localizations for better covariance estimation
Vinicius L. S. Silva
Gabriel S. Seabra
Alexandre A. Emerick
15
0
0
16 Jun 2025
Probing Deep into Temporal Profile Makes the Infrared Small Target Detector Much Better
Ruojing Li
Wei An
Xinyi Ying
Yingqian Wang
Yimian Dai
Longguang Wang
Miao Li
Y. Guo
Li Liu
19
0
0
15 Jun 2025
iDiT-HOI: Inpainting-based Hand Object Interaction Reenactment via Video Diffusion Transformer
Zhelun Shen
Chenming Wu
Junsheng Zhou
Chen Zhao
Kaisiyuan Wang
Hang Zhou
Yingying Li
Haocheng Feng
Wei He
Jingdong Wang
DiffM
20
0
0
15 Jun 2025
SC-SOT: Conditioning the Decoder on Diarized Speaker Information for End-to-End Overlapped Speech Recognition
Yuta Hirano
Sakriani Sakti
9
0
0
15 Jun 2025
Assessing the Role of Data Quality in Training Bilingual Language Models
Skyler Seto
Maartje ter Hoeve
Maureen de Seyssel
David Grangier
15
0
0
15 Jun 2025
Mastering Da Vinci Code: A Comparative Study of Transformer, LLM, and PPO-based Agents
LeCheng Zhang
Yuanshi Wang
Haotian Shen
Xujie Wang
LLMAG
18
0
0
15 Jun 2025
Boundary-Aware Vision Transformer for Angiography Vascular Network Segmentation
Nabil Hezil
Suraj Singh
Vita V. Vlasova
Oleg Y. Rogov
Ahmed Bouridane
R. Hamoudi
ViT, MedIm
11
0
0
15 Jun 2025
Unleashing Diffusion and State Space Models for Medical Image Segmentation
Rong Wu
Ziqi Chen
Liming Zhong
Heng Li
Hai Shu
MedIm
23
0
0
15 Jun 2025
TrojanTO: Action-Level Backdoor Attacks against Trajectory Optimization Models
Yang Dai
Oubo Ma
Longfei Zhang
Xingxing Liang
Xiaochun Cao
Shouling Ji
J. Zhang
Jincai Huang
Li Shen
19
0
0
15 Jun 2025
Large Language Models Enhanced by Plug and Play Syntactic Knowledge for Aspect-based Sentiment Analysis
Yuanhe Tian
Xu Li
Wei Wang
Guoqing Jin
Pengsen Cheng
Yan Song
KELM
28
0
0
15 Jun 2025
Universal Jailbreak Suffixes Are Strong Attention Hijackers
Matan Ben-Tov
Mor Geva
Mahmood Sharif
15
0
0
15 Jun 2025
Combining Self-attention and Dilation Convolutional for Semantic Segmentation of Coal Maceral Groups
Zhenghao Xi
Zhengnan Lv
Yang Zheng
Xiang Liu
Zhuang Yu
Junran Chen
Jing Hu
Yaqi Liu
DiffM
9
0
0
15 Jun 2025
A Review of the Long Horizon Forecasting Problem in Time Series Analysis
Hans Krupakar
Kandappan V A
AI4TS
12
0
0
15 Jun 2025
MetaEformer: Unveiling and Leveraging Meta-patterns for Complex and Dynamic Systems Load Forecasting
Shaoyuan Huang
Tiancheng Zhang
Zhongtian Zhang
Xiaofei Wang
Lanjun Wang
Xin Wang
AI4TS
10
0
0
15 Jun 2025
The Synthetic Mirror -- Synthetic Data at the Age of Agentic AI
Marcelle Momha
15
0
0
15 Jun 2025
Complexity Scaling Laws for Neural Models using Combinatorial Optimization
Lowell Weissman
Michael Krumdick
A. Lynn Abbott
35
0
0
15 Jun 2025
Learning Mappings in Mesh-based Simulations
Shirin Hosseinmardi
Ramin Bostanabad
AI4CE
10
0
0
14 Jun 2025
Advances in LLMs with Focus on Reasoning, Adaptability, Efficiency and Ethics
Asifullah Khan
Muhammad Zaeem Khan
Saleha Jamshed
Sadia Ahmad
Aleesha Zainab
Kaynat Khatib
Faria Bibi
Abdul Rehman
OffRL, LRM
18
0
0
14 Jun 2025
Is your batch size the problem? Revisiting the Adam-SGD gap in language modeling
Teodora Srećković
Jonas Geiping
Antonio Orvieto
MoE
19
0
0
14 Jun 2025
INTERPOS: Interaction Rhythm Guided Positional Morphing for Mobile App Recommender Systems
M. H. Maqbool
Moghis Fereidouni
Umar Farooq
A.B. Siddique
H. Foroosh
AI4TS
12
0
0
14 Jun 2025
Bridging the Digital Divide: Small Language Models as a Pathway for Physics and Photonics Education in Underdeveloped Regions
Asghar Ghorbani
Hanieh Fattahi
9
0
0
14 Jun 2025
ConsistencyChecker: Tree-based Evaluation of LLM Generalization Capabilities
Zhaochen Hong
Haofei Yu
Jiaxuan You
11
0
0
14 Jun 2025
Between Predictability and Randomness: Seeking Artistic Inspiration from AI Generative Models
Olga Vechtomova
15
0
0
14 Jun 2025
BSA: Ball Sparse Attention for Large-scale Geometries
Catalin E. Brita
Hieu Nguyen
Lohithsai Yadala Chanchu
Domonkos Nagy
Maksim Zhdanov
18
0
0
14 Jun 2025
Feature Complementation Architecture for Visual Place Recognition
Weiwei Wang
Meijia Wang
Haoyi Wang
Wenqiang Guo
Jiapan Guo
Changming Sun
Lingkun Ma
Weichuan Zhang
16
0
0
14 Jun 2025
GrokAlign: Geometric Characterisation and Acceleration of Grokking
Thomas Walker
Ahmed Imtiaz Humayun
Randall Balestriero
Richard G. Baraniuk
27
0
0
14 Jun 2025
Exploring Cultural Variations in Moral Judgments with Large Language Models
Hadi Mohammadi
Efthymia Papadopoulou
Yasmeen F.S.S. Meijer
Ayoub Bagheri
17
0
0
14 Jun 2025
QiMeng-Attention: SOTA Attention Operator is generated by SOTA Attention Algorithm
Qirui Zhou
Shaohui Peng
Weiqiang Xiong
Haixin Chen
Yuanbo Wen
...
Ke Gao
Ruizhi Chen
Yanjun Wu
Chen Zhao
Y. Chen
LRM
14
0
0
14 Jun 2025
Similarity as Reward Alignment: Robust and Versatile Preference-based Reinforcement Learning
Sara Rajaram
R. J. Cotton
Fabian H. Sinz
12
0
0
14 Jun 2025
Information fusion strategy integrating pre-trained language model and contrastive learning for materials knowledge mining
Yongqian Peng
Zhouran Zhang
Longhui Zhang
Fengyuan Zhao
Yahao Li
Yicong Ye
Shuxin Bai
AI4CE
15
0
0
14 Jun 2025
PROTOCOL: Partial Optimal Transport-enhanced Contrastive Learning for Imbalanced Multi-view Clustering
Xuqian Xue
Yiming Lei
Qi Cai
Hongming Shan
Junping Zhang
20
0
0
14 Jun 2025
Efficient Star Distillation Attention Network for Lightweight Image Super-Resolution
Fangwei Hao
Ji Du
Desheng Kong
Jiesheng Wu
Jing Xu
Ping Li
18
0
0
14 Jun 2025
Addressing Bias in LLMs: Strategies and Application to Fair AI-based Recruitment
Alejandro Peña
Julian Fierrez
Aythami Morales
Gonzalo Mancera
Miguel Lopez
Ruben Tolosana
20
0
0
13 Jun 2025
pLSTM: parallelizable Linear Source Transition Mark networks
Korbinian Poppel
Richard Freinschlag
Thomas Schmied
Wei Lin
Sepp Hochreiter
18
0
0
13 Jun 2025
Resolve Highway Conflict in Multi-Autonomous Vehicle Controls with Local State Attention
Xuan Duy Ta
Bang Giang Le
Thanh Ha Le
Viet-Cuong Ta
15
0
0
13 Jun 2025
Foundation Models in Autonomous Driving: A Survey on Scenario Generation and Scenario Analysis
Yuan Gao
Mattia Piccinini
Yuchen Zhang
Dingrui Wang
Korbinian Moller
...
Steven Peters
Andrea Stocco
Bassam Alrifaee
Marco Pavone
Johannes Betz
17
0
0
13 Jun 2025
RollingQ: Reviving the Cooperation Dynamics in Multimodal Transformer
Haotian Ni
Yake Wei
Hang Liu
Gong Chen
Chong Peng
Hao Lin
Di Hu
OffRL
66
0
0
13 Jun 2025
A Watermark for Auto-Regressive Image Generation Models
Yihan Wu
Xuehao Cui
Ruibo Chen
Georgios Milis
Heng Huang
WIGM
31
0
0
13 Jun 2025