Attention Is All You Need

12 June 2017

Papers citing "Attention Is All You Need"

50 / 27,143 papers shown

Title
Vision Transformers in Precision Agriculture: A Comprehensive Survey Saber Mehdipour Seyed Abolghasem Mirroshandel Seyed Amirhossein Tabatabaei 87 0 0 30 Apr 2025
Neuroevolution of Self-Attention Over Proto-Objects Rafael C. Pinto Anderson R. Tavares OCL 459 0 0 30 Apr 2025
GPU Performance Portability needs Autotuning Burkhard Ringlein Thomas Parnell Radu Stoica 449 0 0 30 Apr 2025
Investigating Zero-Shot Diagnostic Pathology in Vision-Language Models with Efficient Prompt Design Vasudev Sharma Ahmed Alagha Abdelhakim Khellaf Vincent Quoc-Huy Trinh Mahdi S. Hosseini 141 0 0 30 Apr 2025
Revisiting Diffusion Autoencoder Training for Image Reconstruction Quality Pramook Khungurn Sukit Seripanitkarn Phonphrm Thawatdamrongkit Supasorn Suwajanakorn DiffM 117 0 0 30 Apr 2025
Can We Achieve Efficient Diffusion without Self-Attention? Distilling Self-Attention into Convolutions Ziyi Dong Chengxing Zhou Weijian Deng Pengxu Wei Xiangyang Ji Liang Lin MQ 87 0 0 30 Apr 2025
Scalable Multi-Task Learning for Particle Collision Event Reconstruction with Heterogeneous Graph Neural Networks William Sutcliffe Marta Calvi Simone Capelli Jonas Eschle J. G. Pardiñas Abhijit Mathad Azusa Uzuki N. Serra 66 0 0 30 Apr 2025
RWKV-X: A Linear Complexity Hybrid Language Model Haowen Hou Zhiyi Huang Kaifeng Tan Rongchang Lu Fei Richard Yu VLM 172 1 0 30 Apr 2025
Fast2comm:Collaborative perception combined with prior knowledge Zhengbin Zhang Yan Wu Hongkun Zhang 427 0 0 30 Apr 2025
Differentiable Room Acoustic Rendering with Multi-View Vision Priors Derong Jin Ruohan Gao 79 0 0 30 Apr 2025
Learning Universal User Representations Leveraging Cross-domain User Intent at Snapchat Clark Mingxuan Ju Leonardo Neves Bhuvesh Kumar Liam Collins Tong Zhao ... Rengim Ozturk Yang Liu Sen Yang Manish Malik Neil Shah 72 1 0 30 Apr 2025
Direct Motion Models for Assessing Generated Videos Kelsey R. Allen Carl Doersch Guangyao Zhou Mohammed Suhail Danny Driess ... Thomas Kipf Mehdi S. M. Sajjadi Kevin P. Murphy João Carreira Sjoerd van Steenkiste EGVM DiffM VGen 154 0 0 30 Apr 2025
PAPN: Proximity Attention Encoder and Pointer Network Decoder for Parcel Pickup Route Prediction Hansi Denis Siegfried Mercelis Ngoc-Quang Luong 28 0 0 30 Apr 2025
Improving Retrieval-Augmented Neural Machine Translation with Monolingual Data Maxime Bouthors Josep Crego François Yvon RALM LRM 80 0 0 30 Apr 2025
Multi-modal Transfer Learning for Dynamic Facial Emotion Recognition in the Wild Ezra Engel Lishan Li Chris Hudy Robert Schleusner 54 0 0 30 Apr 2025
A simple and effective approach for body part recognition on CT scans based on projection estimation Franko Hrzic Mohammadreza Movahhedi Ophelie Lavoie-Gagne Ata Kiapour 109 0 0 30 Apr 2025
LLM-based Interactive Imitation Learning for Robotic Manipulation Jonas Werner Kun-Mo Chu C. Weber S. Wermter 164 1 0 30 Apr 2025
Sparse-to-Sparse Training of Diffusion Models Inês Cardoso Oliveira Decebal Constantin Mocanu Luis A. Leiva DiffM 163 0 0 30 Apr 2025
AdSight: Scalable and Accurate Quantification of User Attention in Multi-Slot Sponsored Search Mario Villaizán-Vallelado Matteo Salvatori Kayhan Latifzadeh Antonio Penta Luis A. Leiva Ioannis Arapakis 171 0 0 30 Apr 2025
Polysemy of Synthetic Neurons Towards a New Type of Explanatory Categorical Vector Spaces Michael Pichat William Pogrund Paloma Pichat Judicael Poumay Armanouche Gasparian Samuel Demarchi Martin Corbet Alois Georgeon Michael Veillet-Guillem MILM 86 0 0 30 Apr 2025
Retrieval-Enhanced Few-Shot Prompting for Speech Event Extraction Máté Gedeon RALM 88 0 0 30 Apr 2025
DOPE: Dual Object Perception-Enhancement Network for Vision-and-Language Navigation Yinfeng Yu Dongsheng Yang 94 0 0 30 Apr 2025
Galvatron: An Automatic Distributed System for Efficient Foundation Model Training Xinyi Liu Yijiao Wang Shenhan Zhu Fangcheng Fu Qingshuo Liu Guangming Lin Tengjiao Wang GNN 296 0 0 30 Apr 2025
ReVision: High-Quality, Low-Cost Video Generation with Explicit 3D Physics Modeling for Complex Motion and Interaction Qihao Liu Ju He Qihang Yu Liang-Chieh Chen Alan Yuille DiffM VGen 158 1 0 30 Apr 2025
Generative Machine Learning in Adaptive Control of Dynamic Manufacturing Processes: A Review Suk Ki Lee Hyunwoong Ko AI4CE 101 0 0 30 Apr 2025
The Coral Protocol: Open Infrastructure Connecting The Internet of Agents Roman J. Georgio Caelum Forder Suman Deb Peter Carroll Önder Gürcan 140 0 0 30 Apr 2025
AdaptMI: Adaptive Skill-based In-context Math Instruction for Small Language Models Yinghui He A. Panigrahi Yong Lin Sanjeev Arora 103 0 0 30 Apr 2025
Efficient LLMs with AMP: Attention Heads and MLP Pruning Leandro Giusti Mugnaini Bruno Yamamoto Lucas Lauton de Alcantara Victor Zacarias Edson Bollis Lucas Pellicer A. H. R. Costa Artur Jordao 86 1 0 29 Apr 2025
Frequency Feature Fusion Graph Network For Depression Diagnosis Via fNIRS Chengkai Yang Xingping Dong Xiaofen Zong 73 0 0 29 Apr 2025
Unlocking User-oriented Pages: Intention-driven Black-box Scanner for Real-world Web Applications Weizhe Wang Yao Zhang Kaitai Liang Guangquan Xu Hongpeng Bai Qingyang Yan Xi Zheng Bin Wu 65 0 0 29 Apr 2025
Multimodal Large Language Models for Medicine: A Comprehensive Survey Jiarui Ye Hao Tang LM&MA 178 0 0 29 Apr 2025
SFi-Former: Sparse Flow Induced Attention for Graph Transformer Zechao Li J. Q. Shi Xinming Zhang Miao Zhang B. Li 118 0 0 29 Apr 2025
MicarVLMoE: A Modern Gated Cross-Aligned Vision-Language Mixture of Experts Model for Medical Image Captioning and Report Generation Amaan Izhar Nurul Japar Norisma Idris Ting Dang MoE 112 0 0 29 Apr 2025
Evaluating Effects of Augmented SELFIES for Molecular Understanding Using QK-LSTM Collin Beaudoin Swaroop Ghosh 114 0 0 29 Apr 2025
DYNAMAX: Dynamic computing for Transformers and Mamba based architectures Miguel Nogales Matteo Gambella Manuel Roveri 102 0 0 29 Apr 2025
ReCIT: Reconstructing Full Private Data from Gradient in Parameter-Efficient Fine-Tuning of Large Language Models Jin Xie Ruishi He Songze Li Xiaojun Jia Shouling Ji SILM AAML 94 0 0 29 Apr 2025
Pretraining Large Brain Language Model for Active BCI: Silent Speech Jinzhao Zhou Zehong Cao Yiqun Duan Connor Barkley Daniel Leong ... Ziyi Zhao T. Do Yu-Cheng Chang Sheng-Fu Liang Chin-Teng Lin 104 1 0 29 Apr 2025
SteelBlastQC: Shot-blasted Steel Surface Dataset with Interpretable Detection of Surface Defects Irina Ruzavina Lisa Sophie Theis Jesse Lemeer Rutger de Groen Leo Ebeling Andrej Hulak Jouaria Ali Guangzhi Tang Rico Mockel 95 0 0 29 Apr 2025
BrAIcht, a theatrical agent that speaks like Bertolt Brecht's characters Baz Roland Kristina Malyseva Anna Pappa Tristan Cazenave 115 0 0 29 Apr 2025
From Attention to Atoms: Spectral Dictionary Learning for Fast, Interpretable Language Models Andrew Kiruluta 62 0 0 29 Apr 2025
SoccerDiffusion: Toward Learning End-to-End Humanoid Robot Soccer from Gameplay Recordings Florian Vahl Jörn Griepenburg Jan Gutsche Jasper Güldenstein Jianwei Zhang VGen 104 0 0 29 Apr 2025
TAMO:Fine-Grained Root Cause Analysis via Tool-Assisted LLM Agent with Multi-Modality Observation Data in Cloud-Native Systems Qi. Wang Xiao Zhang Mingyi Li Yuan Yuan Mengbai Xiao Fuzhen Zhuang Dongxiao Yu 81 0 0 29 Apr 2025
MambaMoE: Mixture-of-Spectral-Spatial-Experts State Space Model for Hyperspectral Image Classification Yichu Xu Di Wang Hongzan Jiao Li Zhang Lefei Zhang Mamba 131 0 0 29 Apr 2025
FedMVP: Federated Multi-modal Visual Prompt Tuning for Vision-Language Models Mainak Singha Subhankar Roy Sarthak Mehrotra Ankit Jha Moloud Abdar Biplab Banerjee Elisa Ricci VLM VPVLM 180 0 0 29 Apr 2025
Style-Adaptive Detection Transformer for Single-Source Domain Generalized Object Detection Jianhong Han Yupei Wang Liang Chen ViT 105 0 0 29 Apr 2025
Improving Phishing Email Detection Performance of Small Large Language Models Zijie Lin Zikang Liu Hanbo Fan 199 0 0 29 Apr 2025
Enhancing Leakage Attacks on Searchable Symmetric Encryption Using LLM-Based Synthetic Data Generation Joshua Chiu Partha Protim Paul Zahin Wahab AAML 61 0 0 29 Apr 2025
CachePrune: Neural-Based Attribution Defense Against Indirect Prompt Injection Attacks Rui Wang Junda Wu Yu Xia Tong Yu Ruiyi Zhang Ryan Rossi Lina Yao Julian McAuley AAML SILM 81 0 0 29 Apr 2025
TriniMark: A Robust Generative Speech Watermarking Method for Trinity-Level Attribution Yue Li Wen Liu Dongdong Lin 81 0 0 29 Apr 2025
JTreeformer: Graph-Transformer via Latent-Diffusion Model for Molecular Generation J. Q. Shi Chengxun Xie Zhonghao Li Xinming Zhang Miao Zhang MedIm 57 0 0 29 Apr 2025