2104.09864
Cited By
RoFormer: Enhanced Transformer with Rotary Position Embedding
20 April 2021
Jianlin Su
Yu Lu
Shengfeng Pan
Ahmed Murtadha
Bo Wen
Yunfeng Liu
Papers citing
"RoFormer: Enhanced Transformer with Rotary Position Embedding"
50 / 250 papers shown
Title
TAR3D: Creating High-Quality 3D Assets via Next-Part Prediction
Xuying Zhang
Yutong Liu
Yangguang Li
Renrui Zhang
Yong Liu
...
Wanli Ouyang
Zhiwei Xiong
Peng Gao
Qibin Hou
Ming-Ming Cheng
215
3
0
13 Mar 2025
FlowTok: Flowing Seamlessly Across Text and Image Tokens
Ju He
Qihang Yu
Qihao Liu
Liang-Chieh Chen
131
1
0
13 Mar 2025
DiT-Air: Revisiting the Efficiency of Diffusion Model Architecture Design in Text to Image Generation
Chen Chen
Rui Qian
Wenze Hu
Tsu-Jui Fu
Jialing Tong
...
Lezhi Li
Bowen Zhang
Alex Schwing
Wei Liu
Yue Yang
136
0
0
13 Mar 2025
ASIDE: Architectural Separation of Instructions and Data in Language Models
Egor Zverev
Evgenii Kortukov
Alexander Panfilov
Soroush Tabesh
Alexandra Volkova
Sebastian Lapuschkin
Wojciech Samek
Christoph H. Lampert
AAML
117
2
0
13 Mar 2025
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
Marianne Arriola
Aaron Gokaslan
Justin T Chiu
Zhihan Yang
Zhixuan Qi
Jiaqi Han
Subham Sekhar Sahoo
Volodymyr Kuleshov
DiffM
257
25
0
12 Mar 2025
Efficient Alignment of Unconditioned Action Prior for Language-conditioned Pick and Place in Clutter
Kechun Xu
Xunlong Xia
Kaixuan Wang
Yifei Yang
Yunxuan Mao
Bing Deng
R. Xiong
Yansen Wang
OffRL
169
0
0
12 Mar 2025
InftyThink: Breaking the Length Limits of Long-Context Reasoning in Large Language Models
Yuchen Yan
Yongliang Shen
Yuhang Liu
Jin Jiang
Hao Fei
Jian Shao
Yueting Zhuang
LRM
ReLM
129
9
0
09 Mar 2025
EuroBERT: Scaling Multilingual Encoders for European Languages
Nicolas Boizard
Hippolyte Gisserot-Boukhlef
Duarte M. Alves
André F. T. Martins
Ayoub Hammal
...
Maxime Peyrard
Nuno M. Guerreiro
Patrick Fernandes
Ricardo Rei
Pierre Colombo
507
3
0
07 Mar 2025
EDM: Efficient Deep Feature Matching
Xi Li
Tong Rao
Cihui Pan
81
0
0
07 Mar 2025
Not-Just-Scaling Laws: Towards a Better Understanding of the Downstream Impact of Language Model Design Decisions
Emmy Liu
Amanda Bertsch
Lintang Sutawika
Lindia Tjuatja
Patrick Fernandes
...
Siyang Song
Carolin (Haas) Lawrence
Aditi Raghunathan
Kiril Gashteovski
Graham Neubig
251
2
0
05 Mar 2025
ArticuBot: Learning Universal Articulated Object Manipulation Policy via Large Scale Simulation
Yufei Wang
Ziyu Wang
Mino Nakura
Pratik Bhowal
Chia-Liang Kuo
Yi-Ting Chen
Zackory M. Erickson
David Held
137
0
0
04 Mar 2025
Remasking Discrete Diffusion Models with Inference-Time Scaling
Guanghan Wang
Yair Schiff
Subham Sekhar Sahoo
Volodymyr Kuleshov
DiffM
143
16
0
01 Mar 2025
UQABench: Evaluating User Embedding for Prompting LLMs in Personalized Question Answering
Liu Liu
Shilei Liu
Yujin Yuan
Yanzhe Zhang
Bencheng Yan
...
Di Wang
Wenbo Su
Pengjie Wang
Jian Xu
Bo Zheng
103
1
0
26 Feb 2025
Neural Attention: A Novel Mechanism for Enhanced Expressive Power in Transformer Models
Andrew DiGiugno
Ausif Mahmood
103
0
0
24 Feb 2025
Navigation-GPT: A Robust and Adaptive Framework Utilizing Large Language Models for Navigation Applications
Feng Ma
Xiang Wang
Chen Chen
Xiao-bin Xu
Xin-ping Yan
460
0
0
23 Feb 2025
KVLink: Accelerating Large Language Models via Efficient KV Cache Reuse
Jingbo Yang
Bairu Hou
Wei Wei
Yujia Bao
Shiyu Chang
VLM
164
3
0
21 Feb 2025
SpinQuant: LLM quantization with learned rotations
Zechun Liu
Changsheng Zhao
Igor Fedorov
Bilge Soran
Dhruv Choudhary
Raghuraman Krishnamoorthi
Vikas Chandra
Yuandong Tian
Tijmen Blankevoort
MQ
236
124
0
21 Feb 2025
Smaller But Better: Unifying Layout Generation with Smaller Large Language Models
Peirong Zhang
Jiaxin Zhang
Jiahuan Cao
Hongliang Li
Lianwen Jin
65
0
0
21 Feb 2025
Neural Attention Search
Difan Deng
Marius Lindauer
137
0
0
21 Feb 2025
Robin3D: Improving 3D Large Language Model via Robust Instruction Tuning
Weitai Kang
Haifeng Huang
Yuzhang Shang
Mubarak Shah
Yan Yan
100
9
0
21 Feb 2025
Slamming: Training a Speech Language Model on One GPU in a Day
Gallil Maimon
Avishai Elmakies
Yossi Adi
77
3
0
19 Feb 2025
MoM: Linear Sequence Modeling with Mixture-of-Memories
Jusen Du
Weigao Sun
Disen Lan
Jiaxi Hu
Yu Cheng
KELM
147
5
0
19 Feb 2025
SafeRoute: Adaptive Model Selection for Efficient and Accurate Safety Guardrails in Large Language Models
Seanie Lee
Dong Bok Lee
Dominik Wagner
Minki Kang
Haebin Seong
Tobias Bocklet
Juho Lee
Sung Ju Hwang
82
2
0
18 Feb 2025
VRoPE: Rotary Position Embedding for Video Large Language Models
Zikang Liu
Longteng Guo
Yepeng Tang
Tongtian Yue
Junxian Cai
Kai Ma
Qingbin Liu
Xi Chen
Jing Liu
96
1
0
17 Feb 2025
Strada-LLM: Graph LLM for traffic prediction
Seyed Mohamad Moghadas
Yangxintong Lyu
Bruno Cornelis
Alexandre Alahi
Adrian Munteanu
AI4TS
91
1
0
17 Feb 2025
AAKT: Enhancing Knowledge Tracing with Alternate Autoregressive Modeling
Hao Zhou
Wenge Rong
Jianfei Zhang
Qing Sun
Y. Ouyang
Zhang Xiong
AI4Ed
KELM
111
0
0
17 Feb 2025
FeaKM: Robust Collaborative Perception under Noisy Pose Conditions
Jiuwu Hao
Liguo Sun
Ti Xiang
Yuting Wan
Haolin Song
Pin Lv
125
0
0
16 Feb 2025
Phantom: Subject-consistent video generation via cross-modal alignment
Lijie Liu
Tianxiang Ma
Bingchuan Li
Zhuowei Chen
Jiawei Liu
Qian He
Xinglong Wu
DiffM
VGen
162
14
0
16 Feb 2025
Large Language Diffusion Models
Shen Nie
Fengqi Zhu
Zebin You
Xiaolu Zhang
Jingyang Ou
Jun Hu
Jun Zhou
Yankai Lin
Ji-Rong Wen
Chongxuan Li
235
54
0
14 Feb 2025
AttentionSmithy: A Modular Framework for Rapid Transformer Development and Customization
Caleb Cranney
Jesse G. Meyer
146
0
0
13 Feb 2025
LongReD: Mitigating Short-Text Degradation of Long-Context Large Language Models via Restoration Distillation
Zican Dong
Junyi Li
Jinhao Jiang
Mingyu Xu
Wayne Xin Zhao
Bin Wang
Xin Wu
VLM
339
5
0
11 Feb 2025
BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language Models
Xu Huang
Wenhao Zhu
Hanxu Hu
Zeang Sheng
Lei Li
Shujian Huang
Fei Yuan
ELM
128
4
0
11 Feb 2025
Efficient-vDiT: Efficient Video Diffusion Transformers With Attention Tile
Hangliang Ding
Dacheng Li
Runlong Su
Peiyuan Zhang
Zhijie Deng
Ion Stoica
Hao Zhang
VGen
123
9
0
10 Feb 2025
Steel-LLM: From Scratch to Open Source -- A Personal Journey in Building a Chinese-Centric LLM
Qingshui Gu
Shu Li
Tianyu Zheng
Zhaoxiang Zhang
483
0
0
10 Feb 2025
LCIRC: A Recurrent Compression Approach for Efficient Long-form Context and Query Dependent Modeling in LLMs
Sumin An
Junyoung Sung
Wonpyo Park
Chanjun Park
Paul Hongsuck Seo
214
0
0
10 Feb 2025
Provably Overwhelming Transformer Models with Designed Inputs
Lev Stambler
Seyed Sajjad Nezhadi
Matthew Coudron
124
1
0
09 Feb 2025
Aligner-Encoders: Self-Attention Transformers Can Be Self-Transducers
Adam Stooke
Rohit Prabhavalkar
K. Sim
P. M. Mengibar
181
0
0
06 Feb 2025
Conversation AI Dialog for Medicare powered by Finetuning and Retrieval Augmented Generation
Atharva Mangeshkumar Agrawal
Rutika Pandurang Shinde
Vasanth Kumar Bhukya
Ashmita Chakraborty
Sagar Bharat Shah
Tanmay Shukla
Sree Pradeep Kumar Relangi
Nilesh Mutyam
LM&MA
AI4MH
123
0
0
04 Feb 2025
Explaining Context Length Scaling and Bounds for Language Models
Jingzhe Shi
Qinwei Ma
Hongyi Liu
Hang Zhao
Jeng-Neng Hwang
Lei Li
LRM
227
3
0
03 Feb 2025
OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models
Gaojie Lin
Jianwen Jiang
Jiaqi Yang
Zerong Zheng
Chao Liang
DiffM
VGen
331
29
0
03 Feb 2025
Comply: Learning Sentences with Complex Weights inspired by Fruit Fly Olfaction
Alexei Figueroa
Justus Westerhoff
Golzar Atefi
Dennis Fast
B. Winter
Felix Alexader Gers
Alexander Loser
Wolfang Nejdl
171
0
0
03 Feb 2025
Weight-based Analysis of Detokenization in Language Models: Understanding the First Stage of Inference Without Inference
Go Kamoda
Benjamin Heinzerling
Tatsuro Inaba
Keito Kudo
Keisuke Sakaguchi
Kentaro Inui
MILM
107
3
0
27 Jan 2025
AnyEnhance: A Unified Generative Model with Prompt-Guidance and Self-Critic for Voice Enhancement
Junan Zhang
Jing Yang
Zihao Fang
Yansen Wang
Zehua Zhang
Zhuo Wang
Fan Fan
Zhikai Wu
DiffM
125
4
0
26 Jan 2025
LongReason: A Synthetic Long-Context Reasoning Benchmark via Context Expansion
Zhan Ling
Kang Liu
Kai Yan
Yue Yang
Weijian Lin
Ting-Han Fan
Lingfeng Shen
Zhengyin Du
Jiecao Chen
ReLM
ELM
LRM
91
8
0
25 Jan 2025
NExtLong: Toward Effective Long-Context Training without Long Documents
Chaochen Gao
Xing Wu
Zijia Lin
Debing Zhang
Songlin Hu
SyDa
152
2
0
22 Jan 2025
ELITR-Bench: A Meeting Assistant Benchmark for Long-Context Language Models
Thibaut Thonet
Jos Rozen
Laurent Besacier
RALM
203
3
0
20 Jan 2025
Generative Retrieval for Book search
Yubao Tang
Ruqing Zhang
Jiafeng Guo
Maarten de Rijke
Shihao Liu
Shuaiqiang Wang
Dawei Yin
Xueqi Cheng
RALM
141
0
0
19 Jan 2025
Advancing General Multimodal Capability of Vision-language Models with Pyramid-descent Visual Position Encoding
Ziyang Chen
Mingxiao Li
Zhongfu Chen
Nan Du
Xiaolong Li
Yuexian Zou
121
1
0
19 Jan 2025
A Comprehensive Survey of Foundation Models in Medicine
Wasif Khan
Seowung Leem
Kyle B. See
Joshua K. Wong
Shaoting Zhang
R. Fang
AI4CE
LM&MA
VLM
280
26
0
17 Jan 2025
Tensor Product Attention Is All You Need
Yifan Zhang
Yifeng Liu
Huizhuo Yuan
Zhen Qin
Yang Yuan
Q. Gu
Andrew Chi-Chih Yao
159
14
0
11 Jan 2025