1706.03762
Attention Is All You Need
12 June 2017
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
Papers citing
"Attention Is All You Need"
50 / 18,458 papers shown
DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception
Junjie Wang
Bin Chen
Yulin Li
Bin Kang
Yulin Chen
Zhuotao Tian
VLM
38
0
0
07 May 2025
ORXE: Orchestrating Experts for Dynamically Configurable Efficiency
Qingyuan Wang
Guoxin Wang
B. Cardiff
Deepu John
38
0
0
07 May 2025
Adaptive and Robust DBSCAN with Multi-agent Reinforcement Learning
Hao Peng
Xiang Huang
Shuo Sun
Ruitong Zhang
Philip S. Yu
48
0
0
07 May 2025
UniCO: Towards a Unified Model for Combinatorial Optimization Problems
Zefang Zong
Xiaochen Wei
Guozhen Zhang
Chen Gao
Huandong Wang
Yong Li
39
0
0
07 May 2025
Retrieval Augmented Generation Evaluation for Health Documents
Mario Ceresa
Lorenzo Bertolini
Valentin Comte
Nicholas Spadaro
Barbara Raffael
...
Sergio Consoli
Amalia Muñoz Piñeiro
Alex Patak
Maddalena Querci
Tobias Wiesenthal
RALM
3DV
39
0
1
07 May 2025
Red Teaming the Mind of the Machine: A Systematic Evaluation of Prompt Injection and Jailbreak Vulnerabilities in LLMs
Chetan Pathade
AAML
SILM
59
1
0
07 May 2025
Lightweight RGB-D Salient Object Detection from a Speed-Accuracy Tradeoff Perspective
Songsong Duan
Xi Yang
Nannan Wang
Xinbo Gao
55
0
0
07 May 2025
In-Context Adaptation to Concept Drift for Learned Database Operations
Jiaqi Zhu
Shaofeng Cai
Yanyan Shen
Gang Chen
Fang Deng
Beng Chin Ooi
VLM
52
0
0
07 May 2025
AS3D: 2D-Assisted Cross-Modal Understanding with Semantic-Spatial Scene Graphs for 3D Visual Grounding
Feng Xiao
Hongbin Xu
Guocan Zhao
Wenxiong Kang
53
0
0
07 May 2025
Multi-Granular Attention based Heterogeneous Hypergraph Neural Network
Hong Jin
Kaicheng Zhou
Jie Yin
Lan You
Zhifeng Zhou
41
0
0
07 May 2025
CountDiffusion: Text-to-Image Synthesis with Training-Free Counting-Guidance Diffusion
Yong Li
Pencheng Wan
Liang Han
Yaowei Wang
Liqiang Nie
Min Zhang
43
0
0
07 May 2025
FRAIN to Train: A Fast-and-Reliable Solution for Decentralized Federated Learning
Sanghyeon Park
Soo-Mook Moon
47
0
0
07 May 2025
Theoretical Guarantees for LT-TTD: A Unified Transformer-based Architecture for Two-Level Ranking Systems
Ayoub Abraich
45
0
0
07 May 2025
DiffPattern-Flex: Efficient Layout Pattern Generation via Discrete Diffusion
Zixiao Wang
Wenqian Zhao
Yunheng Shen
Yang Bai
Guojin Chen
Farzan Farnia
Bei Yu
33
0
0
07 May 2025
PAHA: Parts-Aware Audio-Driven Human Animation with Diffusion Model
Y.B. Wang
S.Z. Zhou
J.F. Wu
T. Hu
J.N. Zhang
Zerui Li
Y. Liu
DiffM
VGen
69
0
0
06 May 2025
Physics-inspired Energy Transition Neural Network for Sequence Learning
Zhou Wu
Junyi An
Baile Xu
Furao Shen
Jian Zhao
PINN
27
0
0
06 May 2025
Enhancing Target-unspecific Tasks through a Features Matrix
Fangming Cui
Yonggang Zhang
Xuan Wang
Xinmei Tian
Jun Yu
AAML
50
0
0
06 May 2025
Rainbow Delay Compensation: A Multi-Agent Reinforcement Learning Framework for Mitigating Delayed Observation
Songchen Fu
Siang Chen
Shaojing Zhao
Letian Bai
Ta Li
Yonghong Yan
32
0
0
06 May 2025
CaRaFFusion: Improving 2D Semantic Segmentation with Camera-Radar Point Cloud Fusion and Zero-Shot Image Inpainting
Huawei Sun
Bora Kunter Sahin
Georg Stettinger
Maximilian Bernhard
Matthias Schubert
Robert Wille
49
0
0
06 May 2025
seq-JEPA: Autoregressive Predictive Learning of Invariant-Equivariant World Models
Hafez Ghaemi
Eilif Muller
Shahab Bakhtiari
54
0
0
06 May 2025
Transformers for Learning on Noisy and Task-Level Manifolds: Approximation and Generalization Insights
Zhaiming Shen
Alex Havrilla
Rongjie Lai
A. Cloninger
Wenjing Liao
39
0
0
06 May 2025
Geospatial Mechanistic Interpretability of Large Language Models
Stef De Sabbata
Stefano Mizzaro
Kevin Roitero
AI4CE
37
0
0
06 May 2025
Mitigating Image Captioning Hallucinations in Vision-Language Models
Fei Zhao
Chenyi Zhang
Runlin Zhang
Tianyang Wang
Xi Li
VLM
44
0
0
06 May 2025
Sentence Embeddings as an intermediate target in end-to-end summarisation
Maciej Zembrzuski
Saad Mahamood
47
0
0
06 May 2025
Robust Understanding of Human-Robot Social Interactions through Multimodal Distillation
Tongfei Bian
Mathieu Chollet
T. Guha
31
0
0
06 May 2025
DyGEnc: Encoding a Sequence of Textual Scene Graphs to Reason and Answer Questions in Dynamic Scenes
S. Linok
Vadim Semenov
Anastasia Trunova
Oleg Bulichev
Dmitry A. Yudin
52
0
0
06 May 2025
Action Spotting and Precise Event Detection in Sports: Datasets, Methods, and Challenges
Hao Xu
Arbind Agrahari Baniya
Sam Well
Mohamed Reda Bouadjenek
Richard Dazeley
S. Aryal
AI4TS
29
0
0
06 May 2025
Latent Adaptive Planner for Dynamic Manipulation
Donghun Noh
Deqian Kong
Minglu Zhao
Andrew Lizarraga
Jianwen Xie
Ying Nian Wu
Dennis W. Hong
190
0
0
06 May 2025
Faster MoE LLM Inference for Extremely Large Models
Haoqi Yang
Luohe Shi
Qiwei Li
Zuchao Li
Ping Wang
Bo Du
Mengjia Shen
Hai Zhao
MoE
68
0
0
06 May 2025
Improving Failure Prediction in Aircraft Fastener Assembly Using Synthetic Data in Imbalanced Datasets
G. J. G. Lahr
Ricardo V. Godoy
Thiago H. Segreto
Jose O. Savazzi
Arash Ajoudani
Thiago Boaventura
G. Caurin
AI4CE
24
0
0
06 May 2025
Assessing and Enhancing the Robustness of LLM-based Multi-Agent Systems Through Chaos Engineering
Joshua Owotogbe
LLMAG
62
0
0
06 May 2025
Mamba-Diffusion Model with Learnable Wavelet for Controllable Symbolic Music Generation
Jincheng Zhang
Gyorgy Fazekas
C. Saitis
53
0
0
06 May 2025
Null Counterfactual Factor Interactions for Goal-Conditioned Reinforcement Learning
Caleb Chuck
Fan Feng
Carl Qi
Chang Shi
Siddhant Agarwal
Amy Zhang
S. Niekum
47
0
0
06 May 2025
Recall with Reasoning: Chain-of-Thought Distillation for Mamba's Long-Context Memory and Extrapolation
Junyu Ma
Tianqing Fang
Zizhuo Zhang
Hongming Zhang
Haitao Mi
Dong Yu
ReLM
RALM
LRM
204
0
0
06 May 2025
Rethinking Boundary Detection in Deep Learning-Based Medical Image Segmentation
Yi-Mou Lin
Dong-Ming Zhang
X. B. Fang
Yufan Chen
K.-T. Cheng
Hao Chen
33
0
0
06 May 2025
Bielik 11B v2 Technical Report
Krzysztof Ociepa
Łukasz Flis
Krzysztof Wróbel
Adrian Gwoździej
Remigiusz Kinas
34
0
0
05 May 2025
A Theoretical Analysis of Compositional Generalization in Neural Networks: A Necessary and Sufficient Condition
Yuanpeng Li
CoGe
206
0
0
05 May 2025
RetroInfer: A Vector-Storage Approach for Scalable Long-Context LLM Inference
Yushen Chen
Jiawei Zhang
Baotong Lu
Qianxi Zhang
Chengruidong Zhang
...
Chen Chen
Mingxing Zhang
Yuqing Yang
Fan Yang
Mao Yang
38
0
0
05 May 2025
Database-Agnostic Gait Enrollment using SetTransformers
Nicoleta Basoc
Adrian Cosma
Andy Catruna
Emilian Radoi
SLR
36
0
0
05 May 2025
SCFormer: Structured Channel-wise Transformer with Cumulative Historical State for Multivariate Time Series Forecasting
Shiwei Guo
Z. Chen
Yupeng Ma
Yunfei Han
Yi Wang
AI4TS
205
0
0
05 May 2025
Advancing Constrained Monotonic Neural Networks: Achieving Universal Approximation Beyond Bounded Activations
Davide Sartor
Alberto Sinigaglia
Gian Antonio Susto
37
0
0
05 May 2025
A Survey on Progress in LLM Alignment from the Perspective of Reward Design
Miaomiao Ji
Yanqiu Wu
Zhibin Wu
Shoujin Wang
Jian Yang
Mark Dras
Usman Naseem
41
1
0
05 May 2025
Large Language Model Partitioning for Low-Latency Inference at the Edge
Dimitrios Kafetzis
Ramin Khalili
Iordanis Koutsopoulos
29
0
0
05 May 2025
DELTA: Dense Depth from Events and LiDAR using Transformer's Attention
Vincent Brebion
Julien Moreau
Franck Davoine
45
0
0
05 May 2025
DPNet: Dynamic Pooling Network for Tiny Object Detection
Luqi Gong
Haotian Chen
Yushen Chen
Tianliang Yao
Chao Li
Shuai Zhao
Guangjie Han
ObjD
203
0
0
05 May 2025
EMORL: Ensemble Multi-Objective Reinforcement Learning for Efficient and Flexible LLM Fine-Tuning
Lingxiao Kong
Cong Yang
Susanne Neufang
Oya Beyan
Zeyd Boukhers
OffRL
39
0
0
05 May 2025
Data Augmentation With Back translation for Low Resource languages: A case of English and Luganda
Richard Kimera
DongNyeong Heo
Daniela N. Rim
Heeyoul Choi
179
0
0
05 May 2025
Prediction-powered estimators for finite population statistics in highly imbalanced textual data: Public hate crime estimation
Hannes Waldetoft
Jakob Torgander
Måns Magnusson
34
0
0
05 May 2025
Bielik v3 Small: Technical Report
Krzysztof Ociepa
Łukasz Flis
Remigiusz Kinas
Krzysztof Wróbel
Adrian Gwoździej
29
0
0
05 May 2025
Sharpness-Aware Minimization with Z-Score Gradient Filtering for Neural Networks
Juyoung Yun
40
0
0
05 May 2025