Attention Is All You Need

12 June 2017
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
    3DV

Papers citing "Attention Is All You Need"

50 / 18,459 papers shown
The Application of Deep Learning for Lymph Node Segmentation: A Systematic Review
Jingguo Qu
Xinyang Han
Man-Lik Chui
Yao Pu
Simon Takadiyi Gunda
...
Jing Qin
Ann Dorothy King
Winnie Chiu-Wing Chu
J. Cai
Michael Tin-Cheung Ying
31
0
0
09 May 2025
Learn to Think: Bootstrapping LLM Reasoning Capability Through Graph Representation Learning
Hang Gao
Chenhao Zhang
Tie Wang
Junsuo Zhao
Fengge Wu
Changwen Zheng
Huaping Liu
LRM
34
0
0
09 May 2025
Anymate: A Dataset and Baselines for Learning 3D Object Rigging
Yufan Deng
Yuhao Zhang
Chen Geng
Shangzhe Wu
Jiajun Wu
3DH
57
0
0
09 May 2025
Achieving 3D Attention via Triplet Squeeze and Excitation Block
Maan Alhazmi
Abdulrahman Altahhan
30
0
0
09 May 2025
Register and CLS tokens yield a decoupling of local and global features in large ViTs
Alexander Lappe
M. Giese
24
0
0
09 May 2025
Efficient Fairness Testing in Large Language Models: Prioritizing Metamorphic Relations for Bias Detection
Suavis Giramata
Madhusudan Srinivasan
Venkat Naidu Gudivada
Upulee Kanewala
34
0
0
09 May 2025
Graph Laplacian Wavelet Transformer via Learnable Spectral Decomposition
Andrew Kiruluta
Eric Lundy
Priscilla Burity
29
0
0
09 May 2025
Generative Discovery of Partial Differential Equations by Learning from Math Handbooks
Hao Xu
Y. Chen
Rui Cao
Tianning Tang
Mengge Du
Jiacheng Li
Adrian H. Callaghan
Dongxiao Zhang
34
0
0
09 May 2025
CGTrack: Cascade Gating Network with Hierarchical Feature Aggregation for UAV Tracking
Weihong Li
Xiaoqiong Liu
Heng Fan
L. Zhang
31
0
0
09 May 2025
Learning Sequential Kinematic Models from Demonstrations for Multi-Jointed Articulated Objects
Anmol Gupta
Weiwei Gu
Omkar Patil
Jun Ki Lee
N. Gopalan
29
0
0
09 May 2025
CellVerse: Do Large Language Models Really Understand Cell Biology?
Fan Zhang
Tianyu Liu
Zhihong Zhu
Yu Wang
Haoyu Wang
Donghao Zhou
Yefeng Zheng
Kun Wang
X. Wu
Pheng-Ann Heng
ELM
41
0
0
09 May 2025
Accurate and Efficient Multivariate Time Series Forecasting via Offline Clustering
Yiming Niu
Jinliang Deng
L. Zhang
Zimu Zhou
Yongxin Tong
AI4TS
33
0
0
09 May 2025
UniSymNet: A Unified Symbolic Network Guided by Transformer
Xinxin Li
Juan Zhang
Da Li
Xingyu Liu
Jin Xu
Junping Yin
34
0
0
09 May 2025
Prompting Large Language Models for Training-Free Non-Intrusive Load Monitoring
Junyu Xue
Xudong Wang
Xiaoling He
Shicheng Liu
Yi Wang
Guoming Tang
24
0
0
09 May 2025
Improving Generalizability of Kolmogorov-Arnold Networks via Error-Correcting Output Codes
Youngjoon Lee
J. Gong
Joonhyuk Kang
26
0
0
09 May 2025
Physics-informed Temporal Difference Metric Learning for Robot Motion Planning
Ruiqi Ni
Zherong Pan
A. H. Qureshi
SSL
48
0
0
09 May 2025
A Simple Detector with Frame Dynamics is a Strong Tracker
Chenxu Peng
Changbo Wang
Minrui Zou
Danyang Li
Zhiyong Yang
Yimian Dai
Ming-Ming Cheng
Xiang Li
62
0
0
08 May 2025
Unpacking Robustness in Inflectional Languages: Adversarial Evaluation and Mechanistic Insights
Paweł Walkowiak
Marek Klonowski
Marcin Oleksy
Arkadiusz Janz
AAML
39
0
0
08 May 2025
Generative Models for Long Time Series: Approximately Equivariant Recurrent Network Structures for an Adjusted Training Scheme
Ruwen Fulek
Markus Lange-Hegermann
AI4TS
45
0
0
08 May 2025
InstanceGen: Image Generation with Instance-level Instructions
Etai Sella
Yanir Kleiman
Hadar Averbuch-Elor
36
0
0
08 May 2025
Learning to Drive Anywhere with Model-Based Reannotation
Noriaki Hirose
Lydia Ignatova
Kyle Stachowicz
Catherine Glossop
Sergey Levine
Dhruv Shah
26
0
0
08 May 2025
Enhancing Satellite Object Localization with Dilated Convolutions and Attention-aided Spatial Pooling
S. A. Mostafa
Chenxi Wang
Jia Yue
Yuta Hozumi
Jianwu Wang
29
0
0
08 May 2025
X-Driver: Explainable Autonomous Driving with Vision-Language Models
Wei Liu
Jingyun Zhang
Binxiong Zheng
Yufeng Hu
Yingzhan Lin
Zengfeng Zeng
VLM
LRM
67
0
0
08 May 2025
FF-PNet: A Pyramid Network Based on Feature and Field for Brain Image Registration
Ying Zhang
Shuai Guo
Chenxi Sun
Yuchen Zhu
Jinhai Xiang
MedIm
47
0
0
08 May 2025
CCL: Collaborative Curriculum Learning for Sparse-Reward Multi-Agent Reinforcement Learning via Co-evolutionary Task Evolution
Yufei Lin
Chengwei Ye
Jun Wang
Kangsheng Wang
Linuo Xu
Shuyan Liu
Zeyu Zhang
40
1
0
08 May 2025
DiffusionSfM: Predicting Structure and Motion via Ray Origin and Endpoint Diffusion
Qitao Zhao
Amy Lin
Jeff Tan
Jason Y. Zhang
Deva Ramanan
Shubham Tulsiani
VGen
51
0
0
08 May 2025
Multi-agent Embodied AI: Advances and Future Directions
Zhaohan Feng
Ruiqi Xue
Lei Yuan
Yang Yu
Ning Ding
M. Liu
Bingzhao Gao
Jian Sun
Gang Wang
AI4CE
60
1
0
08 May 2025
Normalize Everything: A Preconditioned Magnitude-Preserving Architecture for Diffusion-Based Speech Enhancement
Julius Richter
Danilo de Oliveira
Timo Gerkmann
DiffM
55
0
0
08 May 2025
Scalable LLM Math Reasoning Acceleration with Low-rank Distillation
Harry Dong
Bilge Acun
Beidi Chen
Yuejie Chi
LRM
34
0
0
08 May 2025
PlaceIt3D: Language-Guided Object Placement in Real 3D Scenes
Ahmed Abdelreheem
Filippo Aleotti
Jamie Watson
Z. Qureshi
Abdelrahman Eldesokey
Peter Wonka
Gabriel J. Brostow
Sara Vicente
Guillermo Garcia-Hernando
DiffM
59
0
0
08 May 2025
Defending against Indirect Prompt Injection by Instruction Detection
Tongyu Wen
Chenglong Wang
Xiyuan Yang
Haoyu Tang
Yueqi Xie
Lingjuan Lyu
Zhicheng Dou
Fangzhao Wu
AAML
34
0
0
08 May 2025
The Evolution of Embedding Table Optimization and Multi-Epoch Training in Pinterest Ads Conversion
Andrew Qiu
Shubham Barhate
Hin Wai Lui
Runze Su
Rafael Rios Müller
Kungang Li
Ling Leng
Han Sun
Shayan Ehsani
Zhifang Liu
36
0
0
08 May 2025
OWT: A Foundational Organ-Wise Tokenization Framework for Medical Imaging
Sifan Song
Siyeop Yoon
Pengfei Jin
Sekeun Kim
Matthew Tivnan
...
Zhiliang Lyu
Dufan Wu
Ning Guo
Xiang Li
Quanzheng Li
OOD
ViT
64
0
0
08 May 2025
Privacy-Preserving Transformers: SwiftKey's Differential Privacy Implementation
Abdelrahman Abouelenin
M. Abdelrehim
Raffy Fahim
Amr Hendy
Mohamed Afify
36
0
0
08 May 2025
Learning Item Representations Directly from Multimodal Features for Effective Recommendation
Xin Zhou
Xiaoxiong Zhang
Dusit Niyato
Zhiqi Shen
61
0
0
08 May 2025
Nonlinear Motion-Guided and Spatio-Temporal Aware Network for Unsupervised Event-Based Optical Flow
Zuntao Liu
Hao Zhuang
Junjie Jiang
Yuhang Song
Zheng Fang
50
0
0
08 May 2025
VaCDA: Variational Contrastive Alignment-based Scalable Human Activity Recognition
Soham Khisa
Avijoy Chakma
51
0
0
08 May 2025
GroverGPT-2: Simulating Grover's Algorithm via Chain-of-Thought Reasoning and Quantum-Native Tokenization
Min Chen
Jinglei Cheng
Pingzhi Li
Haoran Wang
Tianlong Chen
Junyu Liu
LRM
51
0
0
08 May 2025
ItDPDM: Information-Theoretic Discrete Poisson Diffusion Model
Sagnik Bhattacharya
Abhiram Gorle
Ahmed Mohsin
Ahsan Bilal
Connor Ding
Amit Kumar Singh Yadav
Tsachy Weissman
DiffM
47
0
0
08 May 2025
Diffusion Model Quantization: A Review
Qian Zeng
Chenggong Hu
Mingli Song
Jie Song
MQ
48
0
0
08 May 2025
Trading Under Uncertainty: A Distribution-Based Strategy for Futures Markets Using FutureQuant Transformer
Wenhao Guo
Yuda Wang
Zeqiao Huang
Changjiang Zhang
Shumin Ma
AIFin
29
0
0
08 May 2025
The Moon's Many Faces: A Single Unified Transformer for Multimodal Lunar Reconstruction
Tom Sander
Moritz Tenthoff
Kay Wohlfarth
Christian Wöhler
31
0
0
08 May 2025
T-T: Table Transformer for Tagging-based Aspect Sentiment Triplet Extraction
Kun Peng
Chaodong Tong
Cong Cao
Hao Peng
Yue Liu
Guanlin Wu
Lei Jiang
Yanbing Liu
Philip S. Yu
LMTD
50
0
0
08 May 2025
Rethinking Invariance in In-context Learning
Lizhe Fang
Yifei Wang
Khashayar Gatmiry
Lei Fang
Yishuo Wang
56
3
0
08 May 2025
M2Rec: Multi-scale Mamba for Efficient Sequential Recommendation
Qianru Zhang
Liang Qu
Honggang Wen
Dong Huang
Siu-Ming Yiu
Nguyen Quoc Viet Hung
Hongzhi Yin
Mamba
34
0
0
07 May 2025
Retrieval Augmented Time Series Forecasting
Sungwon Han
Seungeon Lee
M. Cha
Sercan Ö. Arik
Jinsung Yoon
AI4TS
38
0
0
07 May 2025
ORXE: Orchestrating Experts for Dynamically Configurable Efficiency
Qingyuan Wang
Guoxin Wang
B. Cardiff
Deepu John
38
0
0
07 May 2025
ABKD: Pursuing a Proper Allocation of the Probability Mass in Knowledge Distillation via $α$-$β$-Divergence
Guanghui Wang
Zhiyong Yang
Zihan Wang
Shi Wang
Qianqian Xu
Qingming Huang
42
0
0
07 May 2025
DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception
Junjie Wang
Bin Chen
Yulin Li
Bin Kang
Yulin Chen
Zhuotao Tian
VLM
38
0
0
07 May 2025
Fine-Tuning Large Language Models and Evaluating Retrieval Methods for Improved Question Answering on Building Codes
Mohammad Aqib
Mohd Hamza
Qipei Mei
Ying Hei Chui
RALM
ELM
52
0
0
07 May 2025