CoCA: Fusing Position Embedding with Collinear Constrained Attention in Transformers for Long Context Window Extending

Papers citing "CoCA: Fusing Position Embedding with Collinear Constrained Attention in Transformers for Long Context Window Extending"