1706.03762
Attention Is All You Need
12 June 2017
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
Papers citing "Attention Is All You Need"
50 / 27,143 papers shown
Vision Transformers in Precision Agriculture: A Comprehensive Survey
Saber Mehdipour
Seyed Abolghasem Mirroshandel
Seyed Amirhossein Tabatabaei
87
0
0
30 Apr 2025
Neuroevolution of Self-Attention Over Proto-Objects
Rafael C. Pinto
Anderson R. Tavares
OCL
459
0
0
30 Apr 2025
GPU Performance Portability needs Autotuning
Burkhard Ringlein
Thomas Parnell
Radu Stoica
449
0
0
30 Apr 2025
Investigating Zero-Shot Diagnostic Pathology in Vision-Language Models with Efficient Prompt Design
Vasudev Sharma
Ahmed Alagha
Abdelhakim Khellaf
Vincent Quoc-Huy Trinh
Mahdi S. Hosseini
141
0
0
30 Apr 2025
Revisiting Diffusion Autoencoder Training for Image Reconstruction Quality
Pramook Khungurn
Sukit Seripanitkarn
Phonphrm Thawatdamrongkit
Supasorn Suwajanakorn
DiffM
117
0
0
30 Apr 2025
Can We Achieve Efficient Diffusion without Self-Attention? Distilling Self-Attention into Convolutions
Ziyi Dong
Chengxing Zhou
Weijian Deng
Pengxu Wei
Xiangyang Ji
Liang Lin
MQ
87
0
0
30 Apr 2025
Scalable Multi-Task Learning for Particle Collision Event Reconstruction with Heterogeneous Graph Neural Networks
William Sutcliffe
Marta Calvi
Simone Capelli
Jonas Eschle
J. G. Pardiñas
Abhijit Mathad
Azusa Uzuki
N. Serra
66
0
0
30 Apr 2025
RWKV-X: A Linear Complexity Hybrid Language Model
Haowen Hou
Zhiyi Huang
Kaifeng Tan
Rongchang Lu
Fei Richard Yu
VLM
172
1
0
30 Apr 2025
Fast2comm: Collaborative perception combined with prior knowledge
Zhengbin Zhang
Yan Wu
Hongkun Zhang
427
0
0
30 Apr 2025
Differentiable Room Acoustic Rendering with Multi-View Vision Priors
Derong Jin
Ruohan Gao
79
0
0
30 Apr 2025
Learning Universal User Representations Leveraging Cross-domain User Intent at Snapchat
Clark Mingxuan Ju
Leonardo Neves
Bhuvesh Kumar
Liam Collins
Tong Zhao
...
Rengim Ozturk
Yang Liu
Sen Yang
Manish Malik
Neil Shah
72
1
0
30 Apr 2025
Direct Motion Models for Assessing Generated Videos
Kelsey R. Allen
Carl Doersch
Guangyao Zhou
Mohammed Suhail
Danny Driess
...
Thomas Kipf
Mehdi S. M. Sajjadi
Kevin P. Murphy
João Carreira
Sjoerd van Steenkiste
EGVM
DiffM
VGen
154
0
0
30 Apr 2025
PAPN: Proximity Attention Encoder and Pointer Network Decoder for Parcel Pickup Route Prediction
Hansi Denis
Siegfried Mercelis
Ngoc-Quang Luong
28
0
0
30 Apr 2025
Improving Retrieval-Augmented Neural Machine Translation with Monolingual Data
Maxime Bouthors
Josep Crego
François Yvon
RALM
LRM
80
0
0
30 Apr 2025
Multi-modal Transfer Learning for Dynamic Facial Emotion Recognition in the Wild
Ezra Engel
Lishan Li
Chris Hudy
Robert Schleusner
54
0
0
30 Apr 2025
A simple and effective approach for body part recognition on CT scans based on projection estimation
Franko Hrzic
Mohammadreza Movahhedi
Ophelie Lavoie-Gagne
Ata Kiapour
109
0
0
30 Apr 2025
LLM-based Interactive Imitation Learning for Robotic Manipulation
Jonas Werner
Kun-Mo Chu
C. Weber
S. Wermter
164
1
0
30 Apr 2025
Sparse-to-Sparse Training of Diffusion Models
Inês Cardoso Oliveira
Decebal Constantin Mocanu
Luis A. Leiva
DiffM
163
0
0
30 Apr 2025
AdSight: Scalable and Accurate Quantification of User Attention in Multi-Slot Sponsored Search
Mario Villaizán-Vallelado
Matteo Salvatori
Kayhan Latifzadeh
Antonio Penta
Luis A. Leiva
Ioannis Arapakis
171
0
0
30 Apr 2025
Polysemy of Synthetic Neurons Towards a New Type of Explanatory Categorical Vector Spaces
Michael Pichat
William Pogrund
Paloma Pichat
Judicael Poumay
Armanouche Gasparian
Samuel Demarchi
Martin Corbet
Alois Georgeon
Michael Veillet-Guillem
MILM
86
0
0
30 Apr 2025
Retrieval-Enhanced Few-Shot Prompting for Speech Event Extraction
Máté Gedeon
RALM
88
0
0
30 Apr 2025
DOPE: Dual Object Perception-Enhancement Network for Vision-and-Language Navigation
Yinfeng Yu
Dongsheng Yang
94
0
0
30 Apr 2025
Galvatron: An Automatic Distributed System for Efficient Foundation Model Training
Xinyi Liu
Yijiao Wang
Shenhan Zhu
Fangcheng Fu
Qingshuo Liu
Guangming Lin
Tengjiao Wang
GNN
296
0
0
30 Apr 2025
ReVision: High-Quality, Low-Cost Video Generation with Explicit 3D Physics Modeling for Complex Motion and Interaction
Qihao Liu
Ju He
Qihang Yu
Liang-Chieh Chen
Alan Yuille
DiffM
VGen
158
1
0
30 Apr 2025
Generative Machine Learning in Adaptive Control of Dynamic Manufacturing Processes: A Review
Suk Ki Lee
Hyunwoong Ko
AI4CE
101
0
0
30 Apr 2025
The Coral Protocol: Open Infrastructure Connecting The Internet of Agents
Roman J. Georgio
Caelum Forder
Suman Deb
Peter Carroll
Önder Gürcan
140
0
0
30 Apr 2025
AdaptMI: Adaptive Skill-based In-context Math Instruction for Small Language Models
Yinghui He
A. Panigrahi
Yong Lin
Sanjeev Arora
103
0
0
30 Apr 2025
Efficient LLMs with AMP: Attention Heads and MLP Pruning
Leandro Giusti Mugnaini
Bruno Yamamoto
Lucas Lauton de Alcantara
Victor Zacarias
Edson Bollis
Lucas Pellicer
A. H. R. Costa
Artur Jordao
86
1
0
29 Apr 2025
Frequency Feature Fusion Graph Network For Depression Diagnosis Via fNIRS
Chengkai Yang
Xingping Dong
Xiaofen Zong
73
0
0
29 Apr 2025
Unlocking User-oriented Pages: Intention-driven Black-box Scanner for Real-world Web Applications
Weizhe Wang
Yao Zhang
Kaitai Liang
Guangquan Xu
Hongpeng Bai
Qingyang Yan
Xi Zheng
Bin Wu
65
0
0
29 Apr 2025
Multimodal Large Language Models for Medicine: A Comprehensive Survey
Jiarui Ye
Hao Tang
LM&MA
178
0
0
29 Apr 2025
SFi-Former: Sparse Flow Induced Attention for Graph Transformer
Zechao Li
J. Q. Shi
Xinming Zhang
Miao Zhang
B. Li
118
0
0
29 Apr 2025
MicarVLMoE: A Modern Gated Cross-Aligned Vision-Language Mixture of Experts Model for Medical Image Captioning and Report Generation
Amaan Izhar
Nurul Japar
Norisma Idris
Ting Dang
MoE
112
0
0
29 Apr 2025
Evaluating Effects of Augmented SELFIES for Molecular Understanding Using QK-LSTM
Collin Beaudoin
Swaroop Ghosh
114
0
0
29 Apr 2025
DYNAMAX: Dynamic computing for Transformers and Mamba based architectures
Miguel Nogales
Matteo Gambella
Manuel Roveri
102
0
0
29 Apr 2025
ReCIT: Reconstructing Full Private Data from Gradient in Parameter-Efficient Fine-Tuning of Large Language Models
Jin Xie
Ruishi He
Songze Li
Xiaojun Jia
Shouling Ji
SILM
AAML
94
0
0
29 Apr 2025
Pretraining Large Brain Language Model for Active BCI: Silent Speech
Jinzhao Zhou
Zehong Cao
Yiqun Duan
Connor Barkley
Daniel Leong
...
Ziyi Zhao
T. Do
Yu-Cheng Chang
Sheng-Fu Liang
Chin-Teng Lin
104
1
0
29 Apr 2025
SteelBlastQC: Shot-blasted Steel Surface Dataset with Interpretable Detection of Surface Defects
Irina Ruzavina
Lisa Sophie Theis
Jesse Lemeer
Rutger de Groen
Leo Ebeling
Andrej Hulak
Jouaria Ali
Guangzhi Tang
Rico Mockel
95
0
0
29 Apr 2025
BrAIcht, a theatrical agent that speaks like Bertolt Brecht's characters
Baz Roland
Kristina Malyseva
Anna Pappa
Tristan Cazenave
115
0
0
29 Apr 2025
From Attention to Atoms: Spectral Dictionary Learning for Fast, Interpretable Language Models
Andrew Kiruluta
62
0
0
29 Apr 2025
SoccerDiffusion: Toward Learning End-to-End Humanoid Robot Soccer from Gameplay Recordings
Florian Vahl
Jörn Griepenburg
Jan Gutsche
Jasper Güldenstein
Jianwei Zhang
VGen
104
0
0
29 Apr 2025
TAMO: Fine-Grained Root Cause Analysis via Tool-Assisted LLM Agent with Multi-Modality Observation Data in Cloud-Native Systems
Qi. Wang
Xiao Zhang
Mingyi Li
Yuan Yuan
Mengbai Xiao
Fuzhen Zhuang
Dongxiao Yu
81
0
0
29 Apr 2025
MambaMoE: Mixture-of-Spectral-Spatial-Experts State Space Model for Hyperspectral Image Classification
Yichu Xu
Di Wang
Hongzan Jiao
Li Zhang
Lefei Zhang
Mamba
131
0
0
29 Apr 2025
FedMVP: Federated Multi-modal Visual Prompt Tuning for Vision-Language Models
Mainak Singha
Subhankar Roy
Sarthak Mehrotra
Ankit Jha
Moloud Abdar
Biplab Banerjee
Elisa Ricci
VLM
VPVLM
180
0
0
29 Apr 2025
Style-Adaptive Detection Transformer for Single-Source Domain Generalized Object Detection
Jianhong Han
Yupei Wang
Liang Chen
ViT
105
0
0
29 Apr 2025
Improving Phishing Email Detection Performance of Small Large Language Models
Zijie Lin
Zikang Liu
Hanbo Fan
199
0
0
29 Apr 2025
Enhancing Leakage Attacks on Searchable Symmetric Encryption Using LLM-Based Synthetic Data Generation
Joshua Chiu
Partha Protim Paul
Zahin Wahab
AAML
61
0
0
29 Apr 2025
CachePrune: Neural-Based Attribution Defense Against Indirect Prompt Injection Attacks
Rui Wang
Junda Wu
Yu Xia
Tong Yu
Ruiyi Zhang
Ryan Rossi
Lina Yao
Julian McAuley
AAML
SILM
81
0
0
29 Apr 2025
TriniMark: A Robust Generative Speech Watermarking Method for Trinity-Level Attribution
Yue Li
Wen Liu
Dongdong Lin
81
0
0
29 Apr 2025
JTreeformer: Graph-Transformer via Latent-Diffusion Model for Molecular Generation
J. Q. Shi
Chengxun Xie
Zhonghao Li
Xinming Zhang
Miao Zhang
MedIm
57
0
0
29 Apr 2025