1706.03762
Cited By
v7 (latest)
Attention Is All You Need
12 June 2017
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
Papers citing
"Attention Is All You Need"
50 / 2,193 papers shown
Title
Less is More? Revisiting the Importance of Frame Rate in Real-Time Zero-Shot Surgical Video Segmentation
Utku Ozbulak
Seyed Amir Mousavi
Francesca Tozzi
Nikdokht Rashidian
W. Willaert
W. D. Neve
J. Vankerschaver
81
0
0
28 Feb 2025
PaCA: Partial Connection Adaptation for Efficient Fine-Tuning
Sunghyeon Woo
Sol Namkung
Sunwoo Lee
Inho Jeong
Beomseok Kim
Dongsuk Jeon
105
1
0
28 Feb 2025
Vector-Quantized Vision Foundation Models for Object-Centric Learning
Rongzhen Zhao
V. Wang
Arno Solin
Joni Pajarinen
OCL
VLM
532
1
0
27 Feb 2025
Training Large Neural Networks With Low-Dimensional Error Feedback
Maher Hanut
Jonathan Kadmon
104
1
0
27 Feb 2025
Mixtera: A Data Plane for Foundation Model Training
Maximilian Böther
Xiaozhe Yao
Tolga Kerimoglu
Viktor Gsteiger
Ana Klimovic
MoE
176
0
0
27 Feb 2025
Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation
Sucheng Ren
Qihang Yu
Ju He
Xiaohui Shen
Alan Yuille
Liang-Chieh Chen
VGen
182
11
0
27 Feb 2025
Similarity-Distance-Magnitude Universal Verification
Allen Schmaltz
UQCV
AAML
532
0
0
27 Feb 2025
Large Language Models as Attribution Regularizers for Efficient Model Training
Davor Vukadin
Marin Šilić
Goran Delač
184
0
0
27 Feb 2025
Advancements in Natural Language Processing for Automatic Text Summarization
Nevidu Jayatilleke
Ruvan Weerasinghe
Nipuna Senanayake
360
1
0
27 Feb 2025
UQABench: Evaluating User Embedding for Prompting LLMs in Personalized Question Answering
Liu Liu
Shilei Liu
Yujin Yuan
Yanzhe Zhang
Bencheng Yan
...
Di Wang
Wenbo Su
Pengjie Wang
Jian Xu
Bo Zheng
101
1
0
26 Feb 2025
CryptoPulse: Short-Term Cryptocurrency Forecasting with Dual-Prediction and Cross-Correlated Market Indicators
Amit Kumar
Taoran Ji
116
0
0
26 Feb 2025
CS-Dialogue: A 104-Hour Dataset of Spontaneous Mandarin-English Code-Switching Dialogues for Speech Recognition
Jiaming Zhou
Yujie Guo
Songtao Zhao
Haoqin Sun
Hui Wang
...
Shiyao Wang
Xi Yang
Yansen Wang
Yonghua Lin
Yong Qin
78
0
0
26 Feb 2025
A Sliding Layer Merging Method for Efficient Depth-Wise Pruning in LLMs
Xuan Ding
Rui Sun
Yunjian Zhang
Xiu Yan
Yueqi Zhou
Kaihao Huang
Suzhong Fu
Angelica I Aviles-Rivero
Chuanlong Xie
Yao Zhu
242
1
0
26 Feb 2025
BatteryLife: A Comprehensive Dataset and Benchmark for Battery Life Prediction
Ruifeng Tan
Weixiang Hong
Jiayue Tang
Xibin Lu
Ruijun Ma
Xiang Zheng
Jia Li
Jiaqiang Huang
Tianze Zhang
AI4TS
120
1
0
26 Feb 2025
Nexus: An Omni-Perceptive And -Interactive Model for Language, Audio, And Vision
Che Liu
Yingji Zhang
D. Zhang
Weijie Zhang
Chenggong Gong
...
André Freitas
Qifan Wang
Z. Xu
Rongjuncheng Zhang
Yong Dai
AuLLM
219
2
0
26 Feb 2025
Can Large Language Models Extract Customer Needs as well as Professional Analysts?
Artem Timoshenko
Chengfeng Mao
J. Hauser
ELM
111
0
0
25 Feb 2025
From Small to Large Language Models: Revisiting the Federalist Papers
So Won Jeong
Veronika Rockova
188
0
0
25 Feb 2025
NotaGen: Advancing Musicality in Symbolic Music Generation with Large Language Model Training Paradigms
Yashan Wang
Shangda Wu
Jianhuai Hu
Xingjian Du
Yueqi Peng
Yongxin Huang
Shuai Fan
Xiaobing Li
Feng Yu
Maosong Sun
196
2
0
25 Feb 2025
Generative Artificial Intelligence: Evolving Technology, Growing Societal Impact, and Opportunities for Information Systems Research
Veda C. Storey
Wei Thoo Yue
J. Leon Zhao
Roman Lukyanenko
70
1
0
25 Feb 2025
Predicting Through Generation: Why Generation Is Better for Prediction
Md. Kowsher
Nusrat Jahan Prottasha
Prakash Bhat
Chun-Nam Yu
Mojtaba Soltanalian
Ivan Garibay
O. Garibay
Chen Chen
Niloofar Yousefi
AI4TS
216
1
0
25 Feb 2025
Function-Space Learning Rates
Edward Milsom
Ben Anson
Laurence Aitchison
136
1
0
24 Feb 2025
Vague Preference Policy Learning for Conversational Recommendation
Gangyi Zhang
Chongming Gao
Wenqiang Lei
Xiaojie Guo
Shijun Li
Hongshen Chen
Zhuozhi Ding
Sulong Xu
Lingfei Wu
106
1
0
24 Feb 2025
Constructing a Norm for Children's Scientific Drawing: Distribution Features Based on Semantic Similarity of Large Language Models
Yi Zhang
Fan Wei
Jiajian Li
Y. X. R. Wang
Yanyan Yu
...
Z. Cai
Xinyu Liu
Wei Wang
Peng Wang
Ziyi Wang
97
0
0
24 Feb 2025
IGN: Implicit Generative Networks
Haozheng Luo
Tianyi Wu
Feiyu Han
Zhijun Yan
OffRL
98
1
0
24 Feb 2025
Encryption-Friendly LLM Architecture
Donghwan Rho
Taeseong Kim
Minje Park
Jung Woo Kim
Hyunsik Chae
Jung Hee Cheon
Ernest K. Ryu
220
6
0
24 Feb 2025
MIM-Refiner: A Contrastive Learning Boost from Intermediate Pre-Trained Representations
Benedikt Alkin
Lukas Miklautz
Sepp Hochreiter
Johannes Brandstetter
VLM
235
8
0
24 Feb 2025
Dataset Featurization: Uncovering Natural Language Features through Unsupervised Data Reconstruction
Michal Bravansky
Vaclav Kubon
Suhas Hariharan
Robert Kirk
121
1
0
24 Feb 2025
ESIQA: Perceptual Quality Assessment of Vision-Pro-based Egocentric Spatial Images
Zhirui Kuai
Liu Yang
Huiyu Duan
Yuxing Han
Guoyu Tang
P. Callet
136
2
0
24 Feb 2025
Model Lakes
Koyena Pal
David Bau
Renée J. Miller
155
2
0
24 Feb 2025
PICASO: Permutation-Invariant Context Composition with State Space Models
Tian Yu Liu
Alessandro Achille
Matthew Trager
Aditya Golatkar
Luca Zancato
Stefano Soatto
LRM
132
0
0
24 Feb 2025
Neural Attention: A Novel Mechanism for Enhanced Expressive Power in Transformer Models
Andrew DiGiugno
Ausif Mahmood
100
0
0
24 Feb 2025
Yes, Q-learning Helps Offline In-Context RL
Denis Tarasov
Alexander Nikulin
Ilya Zisman
Albina Klepach
Andrei Polubarov
Nikita Lyubaykin
Alexander Derevyagin
Igor Kiselev
Vladislav Kurenkov
OffRL
OnRL
477
3
0
24 Feb 2025
NEAT: Nonlinear Parameter-efficient Adaptation of Pre-trained Models
Yibo Zhong
Haoxiang Jiang
Lincan Li
Ryumei Nakada
Tianci Liu
Linjun Zhang
Huaxiu Yao
Haoyu Wang
234
3
0
24 Feb 2025
GraphFM: Graph Factorization Machines for Feature Interaction Modeling
Shu Wu
Zekun Li
Yunyue Su
Zeyu Cui
Xiaoyu Zhang
Liang Wang
231
23
0
24 Feb 2025
ZeroPS: High-quality Cross-modal Knowledge Transfer for Zero-Shot 3D Part Segmentation
Yuheng Xue
Nenglun Chen
Jun Liu
Wenyun Sun
3DPC
220
7
0
24 Feb 2025
DeepInteraction++: Multi-Modality Interaction for Autonomous Driving
Zeyu Yang
Nan Song
Wei Li
Xiatian Zhu
Lefei Zhang
Philip H. S. Torr
134
4
0
24 Feb 2025
Evolving Form and Function: Dual-Objective Optimization in Neural Symbolic Regression Networks
Amanda Bertschinger
James P. Bagrow
Joshua Bongard
142
1
0
24 Feb 2025
The Call for Socially Aware Language Technologies
Diyi Yang
Dirk Hovy
David Jurgens
Barbara Plank
VLM
132
12
0
24 Feb 2025
CLIMB-3D: Continual Learning for Imbalanced 3D Instance Segmentation
Vishal Thengane
Jean Lahoud
Hisham Cholakkal
Rao Muhammad Anwer
L. Yin
Xiatian Zhu
Salman Khan
CLL
454
0
0
24 Feb 2025
Measuring and Benchmarking Large Language Models' Capabilities to Generate Persuasive Language
Amalie Brogaard Pauli
Isabelle Augenstein
Ira Assent
100
9
0
24 Feb 2025
Distributional Vision-Language Alignment by Cauchy-Schwarz Divergence
Wenzhe Yin
Zehao Xiao
Pan Zhou
Shujian Yu
Jiayi Shen
Jan-Jakob Sonke
E. Gavves
147
1
0
24 Feb 2025
Towards Automated Penetration Testing: Introducing LLM Benchmark, Analysis, and Improvements
I. Isozaki
Manil Shrestha
Rick Console
Edward Kim
ELM
111
7
0
24 Feb 2025
Vision-LSTM: xLSTM as Generic Vision Backbone
Benedikt Alkin
M. Beck
Korbinian Poppel
Sepp Hochreiter
Johannes Brandstetter
VLM
205
49
0
24 Feb 2025
Disentangling Visual Transformers: Patch-level Interpretability for Image Classification
Guillaume Jeanneret
Loïc Simon
F. Jurie
ViT
139
0
0
24 Feb 2025
Graph Perceiver IO: A General Architecture for Graph Structured Data
Seyun Bae
Hoyoon Byun
Changdae Oh
Yoon-Sik Cho
Kyungwoo Song
GNN
244
3
0
24 Feb 2025
MUSE: Mamba is Efficient Multi-scale Learner for Text-video Retrieval
Haoran Tang
Meng Cao
Jinfa Huang
Ruyang Liu
Peng Jin
Ge Li
Xiaodan Liang
Mamba
161
4
0
24 Feb 2025
Towards Foundation Models for Mixed Integer Linear Programming
Sirui Li
Janardhan Kulkarni
Ishai Menache
Cathy Wu
Beibin Li
106
10
0
24 Feb 2025
From Text to Trajectory: Exploring Complex Constraint Representation and Decomposition in Safe Reinforcement Learning
Pusen Dong
Tianchen Zhu
Yue Qiu
Haoyi Zhou
Jianxin Li
144
1
0
24 Feb 2025
Energy-Efficient Transformer Inference: Optimization Strategies for Time Series Classification
Arshia Kermani
Ehsan Zeraatkar
Habib Irani
119
3
0
23 Feb 2025
Guardians of the Agentic System: Preventing Many Shots Jailbreak with Agentic System
Saikat Barua
Mostafizur Rahman
Md Jafor Sadek
Rafiul Islam
Shehnaz Khaled
Ahmedul Kabir
LLMAG
141
1
0
23 Feb 2025