
Unveiling Multilinguality in Transformer Models: Exploring Language Specificity in Feed-Forward Networks
Papers citing "Unveiling Multilinguality in Transformer Models: Exploring Language Specificity in Feed-Forward Networks"
Title |
---|
MoEfication: Transformer Feed-forward Layers are Mixtures of Experts. Zhengyan Zhang, Yankai Lin, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou |