1706.03762
v7 (latest)
Attention Is All You Need
12 June 2017
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
Papers citing "Attention Is All You Need"
50 / 27,180 papers shown
Theoretical Guarantees for LT-TTD: A Unified Transformer-based Architecture for Two-Level Ranking Systems
Ayoub Abraich
47
0
0
07 May 2025
DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception
Junjie Wang
Bin Chen
Yulin Li
Bin Kang
Yulin Chen
Zhuotao Tian
VLM
102
0
0
07 May 2025
BuildingBlock: A Hybrid Approach for Structured Building Generation
Junming Huang
Chi-Yin Wang
Letian Li
Changxin Huang
Qiang Dai
W. Xu
130
0
0
07 May 2025
Retrieval Augmented Time Series Forecasting
Sungwon Han
Seungeon Lee
M. Cha
Sercan O. Arik
Jinsung Yoon
AI4TS
65
0
0
07 May 2025
UniCO: Towards a Unified Model for Combinatorial Optimization Problems
Zefang Zong
Xiaochen Wei
Guozhen Zhang
Chen Gao
Huandong Wang
Yong Li
57
0
0
07 May 2025
DiffPattern-Flex: Efficient Layout Pattern Generation via Discrete Diffusion
Zixiao Wang
Wenqian Zhao
Yunheng Shen
Yang Bai
Guojin Chen
Farzan Farnia
Bei Yu
91
0
0
07 May 2025
Adaptive and Robust DBSCAN with Multi-agent Reinforcement Learning
Hao Peng
Xiang Huang
Shuo Sun
Ruitong Zhang
Philip S. Yu
80
0
0
07 May 2025
Overcoming Data Scarcity in Generative Language Modelling for Low-Resource Languages: A Systematic Review
Josh McGiff
Nikola S. Nikolov
122
1
0
07 May 2025
Retrieval Augmented Generation Evaluation for Health Documents
Mario Ceresa
Lorenzo Bertolini
Valentin Comte
Nicholas Spadaro
Barbara Raffael
...
Sergio Consoli
Amalia Muñoz Piñeiro
Alex Patak
Maddalena Querci
Tobias Wiesenthal
RALM
3DV
98
0
1
07 May 2025
DyGEnc: Encoding a Sequence of Textual Scene Graphs to Reason and Answer Questions in Dynamic Scenes
S. Linok
Vadim Semenov
Anastasia Trunova
Oleg Bulichev
Dmitry A. Yudin
114
0
0
06 May 2025
Recall with Reasoning: Chain-of-Thought Distillation for Mamba's Long-Context Memory and Extrapolation
Junyu Ma
Tianqing Fang
Zizhuo Zhang
Hongming Zhang
Haitao Mi
Dong Yu
ReLM
RALM
LRM
489
1
0
06 May 2025
SD-VSum: A Method and Dataset for Script-Driven Video Summarization
Manolis Mylonas
Evlampios Apostolidis
Vasileios Mezaris
91
0
0
06 May 2025
CaRaFFusion: Improving 2D Semantic Segmentation with Camera-Radar Point Cloud Fusion and Zero-Shot Image Inpainting
Huawei Sun
Bora Kunter Sahin
Georg Stettinger
Maximilian Bernhard
Matthias Schubert
Robert Wille
145
0
0
06 May 2025
Plug-and-Play AMC: Context Is King in Training-Free, Open-Set Modulation with LLMs
Mohammad Rostami
Atik Faysal
Reihaneh Gh. Roshan
Huaxia Wang
Nikhil Muralidhar
Yu-dong Yao
72
0
0
06 May 2025
Null Counterfactual Factor Interactions for Goal-Conditioned Reinforcement Learning
Caleb Chuck
Fan Feng
Carl Qi
Chang Shi
Siddhant Agarwal
Amy Zhang
S. Niekum
86
0
0
06 May 2025
seq-JEPA: Autoregressive Predictive Learning of Invariant-Equivariant World Models
Hafez Ghaemi
Eilif Muller
Shahab Bakhtiari
158
0
0
06 May 2025
Transformers for Learning on Noisy and Task-Level Manifolds: Approximation and Generalization Insights
Zhaiming Shen
Alex Havrilla
Rongjie Lai
A. Cloninger
Wenjing Liao
99
1
0
06 May 2025
Improving Failure Prediction in Aircraft Fastener Assembly Using Synthetic Data in Imbalanced Datasets
G. J. G. Lahr
Ricardo V. Godoy
Thiago H. Segreto
Jose O. Savazzi
Arash Ajoudani
Thiago Boaventura
G. Caurin
AI4CE
50
0
0
06 May 2025
Blending 3D Geometry and Machine Learning for Multi-View Stereopsis
Vibhas Kumar Vats
Md. Alimoor Reza
David J. Crandall
Soon-Heung Jung
3DV
84
0
0
06 May 2025
Faster MoE LLM Inference for Extremely Large Models
Haoqi Yang
Luohe Shi
Qiwei Li
Zuchao Li
Ping Wang
Bo Du
Mengjia Shen
Hai Zhao
MoE
133
1
0
06 May 2025
Mamba-Diffusion Model with Learnable Wavelet for Controllable Symbolic Music Generation
Jincheng Zhang
Gyorgy Fazekas
C. Saitis
84
0
0
06 May 2025
Action Spotting and Precise Event Detection in Sports: Datasets, Methods, and Challenges
Hao Xu
Arbind Agrahari Baniya
Sam Well
Mohamed Reda Bouadjenek
Richard Dazeley
S. Aryal
AI4TS
56
0
0
06 May 2025
TimeTracker: Event-based Continuous Point Tracking for Video Frame Interpolation with Non-linear Motion
Haoyue Liu
Jinghan Xu
Yi Chang
Hanyu Zhou
Haozhi Zhao
Lin Wang
Luxin Yan
66
0
0
06 May 2025
Rethinking Boundary Detection in Deep Learning-Based Medical Image Segmentation
Yi Lin
Dong Zhang
X. B. Fang
Yufan Chen
K.-T. Cheng
Hao Chen
55
0
0
06 May 2025
Sentence Embeddings as an intermediate target in end-to-end summarisation
Maciej Zembrzuski
Saad Mahamood
67
0
0
06 May 2025
Physics-inspired Energy Transition Neural Network for Sequence Learning
Zhou Wu
Junyi An
Baile Xu
Furao Shen
Jian Zhao
PINN
65
0
0
06 May 2025
Fixed-Length Dense Fingerprint Representation
Zhiyu Pan
Xiongjun Guan
Yongjie Duan
Jianjiang Feng
Jie Zhou
44
0
0
06 May 2025
Mitigating Image Captioning Hallucinations in Vision-Language Models
Fei Zhao
Chenyi Zhang
Runlin Zhang
Tianyang Wang
Xi Li
VLM
150
0
0
06 May 2025
Latent Adaptive Planner for Dynamic Manipulation
Donghun Noh
Deqian Kong
Minglu Zhao
Andrew Lizarraga
Jianwen Xie
Ying Nian Wu
Dennis W. Hong
407
1
0
06 May 2025
Procedural Memory Is Not All You Need: Bridging Cognitive Gaps in LLM-Based Agents
Schaun Wheeler
Olivier Jeunen
LLMAG
70
2
0
06 May 2025
Enhancing Target-unspecific Tasks through a Features Matrix
Fangming Cui
Yonggang Zhang
Xuan Wang
Xinmei Tian
Jun Yu
AAML
118
1
0
06 May 2025
Rainbow Delay Compensation: A Multi-Agent Reinforcement Learning Framework for Mitigating Delayed Observation
Songchen Fu
Siang Chen
Shaojing Zhao
Letian Bai
Ta Li
Yonghong Yan
181
0
0
06 May 2025
Geospatial Mechanistic Interpretability of Large Language Models
Stef De Sabbata
Stefano Mizzaro
Kevin Roitero
AI4CE
129
0
0
06 May 2025
Assessing and Enhancing the Robustness of LLM-based Multi-Agent Systems Through Chaos Engineering
Joshua Owotogbe
LLMAG
114
0
0
06 May 2025
A Unit Enhancement and Guidance Framework for Audio-Driven Avatar Video Generation
Y.B. Wang
S.Z. Zhou
J.F. Wu
T. Hu
J.N. Zhang
DiffM
VGen
128
0
0
06 May 2025
Robust Understanding of Human-Robot Social Interactions through Multimodal Distillation
Tongfei Bian
Mathieu Chollet
T. Guha
86
0
0
06 May 2025
Prediction-powered estimators for finite population statistics in highly imbalanced textual data: Public hate crime estimation
Hannes Waldetoft
Jakob Torgander
Måns Magnusson
59
1
0
05 May 2025
Data Augmentation With Back translation for Low Resource languages: A case of English and Luganda
Richard Kimera
DongNyeong Heo
Daniela N. Rim
Heeyoul Choi
449
0
0
05 May 2025
DPNet: Dynamic Pooling Network for Tiny Object Detection
Luqi Gong
Haotian Chen
Yushen Chen
Tianliang Yao
Chao Li
Shuai Zhao
Guangjie Han
ObjD
444
0
0
05 May 2025
LLM4FTS: Enhancing Large Language Models for Financial Time Series Prediction
Zian Liu
Renjun Jia
AI4TS
AIFin
80
1
0
05 May 2025
DELTA: Dense Depth from Events and LiDAR using Transformer's Attention
Vincent Brebion
Julien Moreau
Franck Davoine
137
0
0
05 May 2025
Database-Agnostic Gait Enrollment using SetTransformers
Nicoleta Basoc
Adrian Cosma
Andy Catruna
Emilian Radoi
SLR
92
0
0
05 May 2025
Advancing Constrained Monotonic Neural Networks: Achieving Universal Approximation Beyond Bounded Activations
Davide Sartor
Alberto Sinigaglia
Gian Antonio Susto
204
0
0
05 May 2025
MSFNet-CPD: Multi-Scale Cross-Modal Fusion Network for Crop Pest Detection
Jiaqi Zhang
Zhuodong Liu
Kejian Yu
79
0
0
05 May 2025
Large Language Model Partitioning for Low-Latency Inference at the Edge
Dimitrios Kafetzis
Ramin Khalili
Iordanis Koutsopoulos
69
0
0
05 May 2025
Variational diffusion transformers for conditional sampling of supernovae spectra
Yunyi Shen
Alexander T. Gagliano
DiffM
37
0
0
05 May 2025
Rethinking Multimodal Sentiment Analysis: A High-Accuracy, Simplified Fusion Architecture
Nischal Mandal
Yang Li
53
0
0
05 May 2025
A Survey on Progress in LLM Alignment from the Perspective of Reward Design
Miaomiao Ji
Yanqiu Wu
Zhibin Wu
Shoujin Wang
Jian Yang
Mark Dras
Usman Naseem
76
2
0
05 May 2025
Sharpness-Aware Minimization with Z-Score Gradient Filtering for Neural Networks
Juyoung Yun
201
0
0
05 May 2025
A Theoretical Analysis of Compositional Generalization in Neural Networks: A Necessary and Sufficient Condition
Yuanpeng Li
CoGe
445
0
0
05 May 2025