1706.03762
Attention Is All You Need
12 June 2017
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
Papers citing "Attention Is All You Need"
50 / 27,180 papers shown
Title
TrainVerify: Equivalence-Based Verification for Distributed LLM Training
Yunchi Lu
Youshan Miao
Cheng Tan
Peng Huang
Yi Zhu
Xian Zhang
Fan Yang
LRM
26
0
0
19 Jun 2025
Do We Talk to Robots Like Therapists, and Do They Respond Accordingly? Language Alignment in AI Emotional Support
Sophie Chiang
Guy Laban
Hatice Gunes
14
0
0
19 Jun 2025
MoiréXNet: Adaptive Multi-Scale Demoiréing with Linear Attention Test-Time Training and Truncated Flow Matching Prior
Liangyan Li
Yimo Ning
Kevin Le
Wei Dong
Yunzhe Li
Jun Chen
Xiaohong Liu
15
0
0
19 Jun 2025
Streaming Non-Autoregressive Model for Accent Conversion and Pronunciation Improvement
Tuan-Nam Nguyen
Ngoc-Quan Pham
Seymanur Akti
Alexander Waibel
19
0
0
19 Jun 2025
AutoHFormer: Efficient Hierarchical Autoregressive Transformer for Time Series Prediction
Qianru Zhang
Honggang Wen
Ming Li
Dong Huang
Siu-Ming Yiu
Christian S. Jensen
Pietro Lio
AI4TS
16
0
0
19 Jun 2025
Noise Fusion-based Distillation Learning for Anomaly Detection in Complex Industrial Environments
Jiawen Yu
Jieji Ren
Yang Chang
Qiaojun Yu
Xuan Tong
Boyang Wang
Yan Song
You Li
Xinji Mai
Wenqiang Zhang
12
0
0
19 Jun 2025
Relational Deep Learning: Challenges, Foundations and Next-Generation Architectures
Vijay Prakash Dwivedi
Charilaos I. Kanatsoulis
Shenyang Huang
Jure Leskovec
GNN
3DV
32
0
0
19 Jun 2025
Goal-conditioned Hierarchical Reinforcement Learning for Sample-efficient and Safe Autonomous Driving at Intersections
Yiou Huang
12
0
0
19 Jun 2025
TD3Net: A Temporal Densely Connected Multi-Dilated Convolutional Network for Lipreading
B. Lee
Wooseok Shin
Sung Won Han
19
0
0
19 Jun 2025
A Free Probabilistic Framework for Analyzing the Transformer-based Language Models
Swagatam Das
10
0
0
19 Jun 2025
GeoGuess: Multimodal Reasoning based on Hierarchy of Visual Information in Street View
Fenghua Cheng
Jinxiang Wang
Sen Wang
Zi Huang
Xue Li
LRM
17
0
0
19 Jun 2025
Knee-Deep in C-RASP: A Transformer Depth Hierarchy
Andy Yang
Michaël Cadilhac
David Chiang
15
0
0
19 Jun 2025
OSWorld-Human: Benchmarking the Efficiency of Computer-Use Agents
Reyna Abhyankar
Qi Qi
Yiying Zhang
LLMAG
15
0
0
19 Jun 2025
ControlVLA: Few-shot Object-centric Adaptation for Pre-trained Vision-Language-Action Models
Puhao Li
Yingying Wu
Ziheng Xi
Wanlin Li
Yuzhe Huang
...
Yinghan Chen
Jianan Wang
Song-Chun Zhu
Tengyu Liu
Siyuan Huang
LM&Ro
10
0
0
19 Jun 2025
Dense 3D Displacement Estimation for Landslide Monitoring via Fusion of TLS Point Clouds and Embedded RGB Images
Zhaoyi Wang
Jemil Avers Butt
S. Huang
Tomislav Medic
A. Wieser
12
0
0
19 Jun 2025
Optimizing Multilingual Text-To-Speech with Accents & Emotions
Pranav Pawar
Akshansh Dwivedi
Jenish Boricha
Himanshu Gohil
Aditya Dubey
10
0
0
19 Jun 2025
Next-Token Prediction Should be Ambiguity-Sensitive: A Meta-Learning Perspective
Léo Gagnon
Eric Elmoznino
Sarthak Mittal
Tom Marty
Tejas Kasetty
Dhanya Sridhar
Guillaume Lajoie
10
0
0
19 Jun 2025
PAROAttention: Pattern-Aware ReOrdering for Efficient Sparse and Quantized Attention in Visual Generation Models
Tianchen Zhao
Ke Hong
Xinhao Yang
Xuefeng Xiao
Huixia Li
...
Ruiqi Xie
Siqi Chen
Hongyu Zhu
Y. Zhang
Yu Wang
MQ
VGen
11
0
0
19 Jun 2025
Unpacking Generative AI in Education: Computational Modeling of Teacher and Student Perspectives in Social Media Discourse
Paulina DeVito
Akhil Vallala
Sean Mcmahon
Yaroslav Hinda
Benjamin Thaw
Hanqi Zhuang
Hari Kalva
10
0
0
19 Jun 2025
Towards Classifying Histopathological Microscope Images as Time Series Data
Sungrae Hong
HyeongMin Park
Y. Ko
Sol Lee
Bryan Wong
Mun Yi
7
0
0
19 Jun 2025
AeroGPT: Leveraging Large-Scale Audio Model for Aero-Engine Bearing Fault Diagnosis
Jiale Liu
Dandan Peng
Huan Wang
Chenyu Liu
Yan-Fu Li
Min Xie
10
0
0
19 Jun 2025
Universal Laboratory Model: prognosis of abnormal clinical outcomes based on routine tests
Pavel Karpov
Ilya Petrenkov
Ruslan Raiman
5
0
0
18 Jun 2025
CipherMind: The Longest Codebook in the World
Ming Nie
Zhixiong Yang
Bingsheng Wei
24
0
0
18 Jun 2025
Fractional Reasoning via Latent Steering Vectors Improves Inference Time Compute
Sheng Liu
Tianlang Chen
Pan Lu
Haotian Ye
Yizheng Chen
Lei Xing
James Zou
ReLM
LRM
17
0
0
18 Jun 2025
Versatile Symbolic Music-for-Music Modeling via Function Alignment
Junyan Jiang
Daniel Y. Chin
Liwei Lin
Xuanjie Liu
Gus Xia
25
0
0
18 Jun 2025
Retrospective Memory for Camouflaged Object Detection
Chenxi Zhang
Jiayun Wu
Qing Zhang
Yazhe Zhai
Youwei Pang
12
0
0
18 Jun 2025
SecFwT: Efficient Privacy-Preserving Fine-Tuning of Large Language Models Using Forward-Only Passes
Jinglong Luo
Zhuo Zhang
Yehong Zhang
Shiyu Liu
Ye Dong
Xun Zhou
Hui Wang
Yue Yu
Zenglin Xu
12
0
0
18 Jun 2025
T-SHRED: Symbolic Regression for Regularization and Model Discovery with Transformer Shallow Recurrent Decoders
Alexey Yermakov
David Zoro
Mars Liyao Gao
J. Nathan Kutz
9
0
0
18 Jun 2025
A Comparative Study of Task Adaptation Techniques of Large Language Models for Identifying Sustainable Development Goals
Andrea Cadeddu
Alessandro Chessa
Vincenzo De Leo
Gianni Fenu
Enrico Motta
Francesco Osborne
Diego Reforgiato Recupero
Angelo Salatino
Luca Secchi
14
0
0
18 Jun 2025
HiPreNets: High-Precision Neural Networks through Progressive Training
Ethan Mulle
W. Kang
Q. Gong
20
0
0
18 Jun 2025
The Compositional Architecture of Regret in Large Language Models
Xiangxiang Cui
Shu Yang
Tianjin Huang
Wanyu Lin
Lijie Hu
Di Wang
24
0
0
18 Jun 2025
Efficient and Generalizable Environmental Understanding for Visual Navigation
Ruoyu Wang
Xinshu Li
Chen Wang
Lina Yao
CML
14
0
0
18 Jun 2025
Robust Instant Policy: Leveraging Student's t-Regression Model for Robust In-context Imitation Learning of Robot Manipulation
Hanbit Oh
Andrea M. Salcedo-Vázquez
I. Ramirez-Alpizar
Y. Domae
15
0
0
18 Jun 2025
Sampling 3D Molecular Conformers with Diffusion Transformers
J. Frank
Winfried Ripken
Gregor Lied
K. Müller
Oliver T. Unke
Stefan Chmiela
10
0
0
18 Jun 2025
InfiniPot-V: Memory-Constrained KV Cache Compression for Streaming Video Understanding
Minsoo Kim
Kyuhong Shim
Jungwook Choi
Simyung Chang
VLM
7
0
0
18 Jun 2025
Evolutionary Caching to Accelerate Your Off-the-Shelf Diffusion Model
Anirud Aggarwal
Abhinav Shrivastava
M. Gwilliam
48
0
0
18 Jun 2025
Diffusion-based Counterfactual Augmentation: Towards Robust and Interpretable Knee Osteoarthritis Grading
Zhe Wang
Yuhua Ru
Aladine Chetouani
Tina Shiang
Fang Chen
...
Didier Hans
Rachid Jennane
William Ewing Palmer
Mohamed Jarraya
Yung Hsin Chen
MedIm
14
0
0
18 Jun 2025
Exploring and Exploiting the Inherent Efficiency within Large Reasoning Models for Self-Guided Efficiency Enhancement
Weixiang Zhao
Jiahe Guo
Yang Deng
Xingyu Sui
Yulin Hu
Yanyan Zhao
Wanxiang Che
Bing Qin
Tat-Seng Chua
Ting Liu
LRM
38
0
0
18 Jun 2025
Multimodal Large Language Models for Medical Report Generation via Customized Prompt Tuning
Chunlei Li
Jingyang Hou
Yilei Shi
Jingliang Hu
Xiao Xiang Zhu
Lichao Mou
LM&MA
28
0
0
18 Jun 2025
TACT: Humanoid Whole-body Contact Manipulation through Deep Imitation Learning with Tactile Modality
Masaki Murooka
Takahiro Hoshi
Kensuke Fukumitsu
Shimpei Masuda
Marwan Hamze
Tomoya Sasaki
Mitsuharu Morisawa
Eiichi Yoshida
12
0
0
18 Jun 2025
Echo-DND: A dual noise diffusion model for robust and precise left ventricle segmentation in echocardiography
Abdur Rahman
Keerthiveena Balraj
Manojkumar Ramteke
Anurag Singh Rathore
DiffM
MedIm
15
0
0
18 Jun 2025
Intrinsic and Extrinsic Organized Attention: Softmax Invariance and Network Sparsity
Oluwadamilola Fasina
Ruben V.C. Pohle
Pei-Chun Su
Ronald R. Coifman
12
0
0
18 Jun 2025
SynPo: Boosting Training-Free Few-Shot Medical Segmentation via High-Quality Negative Prompts
Yufei Liu
Haoke Xiao
Jiaxing Chai
Yongcun Zhang
Rong Wang
Zijie Meng
Zhiming Luo
MedIm
VLM
13
0
0
18 Jun 2025
MEM1: Learning to Synergize Memory and Reasoning for Efficient Long-Horizon Agents
Zijian Zhou
Ao Qu
Zhaoxuan Wu
Sunghwan Kim
Alok Prakash
Daniela Rus
Jinhua Zhao
Bryan Kian Hsiang Low
Paul Liang
LLMAG
OffRL
LRM
10
0
0
18 Jun 2025
From Model to Classroom: Evaluating Generated MCQs for Portuguese with Narrative and Difficulty Concerns
Bernardo Leite
Henrique Lopes Cardoso
Pedro Pinto
Abel Ferreira
Luís Abreu
Isabel Rangel
Sandra Monteiro
26
0
0
18 Jun 2025
From LLMs to MLLMs to Agents: A Survey of Emerging Paradigms in Jailbreak Attacks and Defenses within LLM Ecosystem
Yanxu Mao
Tiehan Cui
Peipei Liu
Datao You
Hongsong Zhu
AAML
12
0
0
18 Jun 2025
Managing Complex Failure Analysis Workflows with LLM-based Reasoning and Acting Agents
Aline Dobrovsky
Konstantin Schekotihin
Christian Burmer
LLMAG
20
0
0
18 Jun 2025
Multi-Interest Recommendation: A Survey
Zihao Li
Qiang Chen
Lixin Zou
Aixin Sun
Chenliang Li
AI4TS
14
0
0
18 Jun 2025
Zero-Shot Reinforcement Learning Under Partial Observability
Scott Jeen
Tom Bewley
Jonathan M. Cullen
OffRL
20
0
0
18 Jun 2025
Early Attentive Sparsification Accelerates Neural Speech Transcription
Zifei Xu
Sayeh Sharify
Hesham Mostafa
T. Webb
W. Yazar
Xin Wang
10
0
0
18 Jun 2025