Title |
---|
![]() Mixtral of Experts Albert Q. Jiang Alexandre Sablayrolles Antoine Roux A. Mensch Blanche Savary ...Théophile Gervet Thibaut Lavril Thomas Wang Timothée Lacroix William El Sayed |
![]() Context Autoencoder for Self-Supervised Representation Learning Xiaokang Chen Mingyu Ding Xiaodi Wang Ying Xin Shentong Mo Yunhao Wang Shumin Han Ping Luo Gang Zeng Jingdong Wang |
![]() Large Batch Optimization for Deep Learning: Training BERT in 76 minutes Yang You Jing Li Sashank J. Reddi Jonathan Hseu Sanjiv Kumar Srinadh Bhojanapalli Xiaodan Song J. Demmel Kurt Keutzer Cho-Jui Hsieh |