
The Rotary Position Embedding May Cause Dimension Inefficiency in Attention Heads for Long-Distance Retrieval
Papers citing "The Rotary Position Embedding May Cause Dimension Inefficiency in Attention Heads for Long-Distance Retrieval"
3 / 3 papers shown
Title |
---|
![]() Qwen Technical Report Jinze Bai Shuai Bai Yunfei Chu Zeyu Cui Kai Dang ...Zhenru Zhang Chang Zhou Jingren Zhou Xiaohuan Zhou Tianhang Zhu |