Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2401.04088
Cited By
Mixtral of Experts
8 January 2024
Albert Q. Jiang
Alexandre Sablayrolles
Antoine Roux
A. Mensch
Blanche Savary
Chris Bamford
Devendra Singh Chaplot
Diego de Las Casas
Emma Bou Hanna
Florian Bressand
Gianna Lengyel
Guillaume Bour
Guillaume Lample
Lélio Renard Lavaud
Lucile Saulnier
Marie-Anne Lachaux
Pierre Stock
Sandeep Subramanian
Sophia Yang
Szymon Antoniak
Teven Le Scao
Théophile Gervet
Thibaut Lavril
Thomas Wang
Timothée Lacroix
William El Sayed
MoE
LLMAG
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Mixtral of Experts"
50 / 208 papers shown
Title
RLTHF: Targeted Human Feedback for LLM Alignment
Yifei Xu
Tusher Chakraborty
Emre Kıcıman
Bibek Aryal
Eduardo Rodrigues
...
Rafael Padilha
Leonardo Nunes
Shobana Balakrishnan
Songwu Lu
Ranveer Chandra
118
1
0
24 Feb 2025
T2ISafety: Benchmark for Assessing Fairness, Toxicity, and Privacy in Image Generation
Lijun Li
Zhelun Shi
Xuhao Hu
Bowen Dong
Yiran Qin
Xihui Liu
Lu Sheng
Jing Shao
114
1
0
21 Feb 2025
Stacking as Accelerated Gradient Descent
Naman Agarwal
Pranjal Awasthi
Satyen Kale
Eric Zhao
ODL
73
2
0
20 Feb 2025
Mixture of insighTful Experts (MoTE): The Synergy of Thought Chains and Expert Mixtures in Self-Alignment
Zhili Liu
Yunhao Gou
Kai Chen
Lanqing Hong
Jiahui Gao
...
Yu Zhang
Zhenguo Li
Xin Jiang
Qiang Liu
James T. Kwok
MoE
105
9
0
20 Feb 2025
How Efficient is LLM-Generated Code? A Rigorous & High-Standard Benchmark
Ruizhong Qiu
Weiliang Will Zeng
Hanghang Tong
James Ezick
Christopher Lott
88
16
0
20 Feb 2025
Mixup Regularization: A Probabilistic Perspective
Yousef El-Laham
Niccolò Dalmasso
Svitlana Vyetrenko
Vamsi K. Potluru
Manuela Veloso
56
0
0
20 Feb 2025
Can Your Uncertainty Scores Detect Hallucinated Entity?
Min-Hsuan Yeh
Max Kamachee
Seongheon Park
Yixuan Li
HILM
55
1
0
17 Feb 2025
Has My System Prompt Been Used? Large Language Model Prompt Membership Inference
Roman Levin
Valeriia Cherepanova
Abhimanyu Hans
Avi Schwarzschild
Tom Goldstein
185
1
0
14 Feb 2025
Assessing the Impact of the Quality of Textual Data on Feature Representation and Machine Learning Models
Tabinda Sarwar
Antonio Jose Jimeno Yepes
Lawrence Cavedon
69
0
0
12 Feb 2025
Importance Sampling via Score-based Generative Models
Heasung Kim
Taekyun Lee
Hyeji Kim
Gustavo de Veciana
MedIm
DiffM
141
0
0
07 Feb 2025
Brief analysis of DeepSeek R1 and its implications for Generative AI
Sarah Mercer
Samuel Spillard
Daniel P. Martin
76
13
0
04 Feb 2025
Soup-of-Experts: Pretraining Specialist Models via Parameters Averaging
Pierre Ablin
Angelos Katharopoulos
Skyler Seto
David Grangier
MoMe
50
0
0
03 Feb 2025
What is a Number, That a Large Language Model May Know It?
Raja Marjieh
Veniamin Veselovsky
Thomas L. Griffiths
Ilia Sucholutsky
200
2
0
03 Feb 2025
Position: AI Scaling: From Up to Down and Out
Yunke Wang
Yanxi Li
Chang Xu
HAI
88
2
0
02 Feb 2025
Rethinking Mixture-of-Agents: Is Mixing Different Large Language Models Beneficial?
Wenzhe Li
Yong Lin
Mengzhou Xia
Chi Jin
MoE
99
2
0
02 Feb 2025
CoDocBench: A Dataset for Code-Documentation Alignment in Software Maintenance
Kunal Suresh Pai
Premkumar Devanbu
Toufique Ahmed
71
1
0
01 Feb 2025
Survey and Improvement Strategies for Gene Prioritization with Large Language Models
Matthew Neeley
Guantong Qi
Guanchu Wang
Ruixiang Tang
Dongxue Mao
...
Bo Yuan
Fan Xia
Pengfei Liu
Zhandong Liu
Xia Hu
LM&MA
104
0
0
30 Jan 2025
GuardReasoner: Towards Reasoning-based LLM Safeguards
Yue Liu
Hongcheng Gao
Shengfang Zhai
Jun Xia
Tianyi Wu
Zhiwei Xue
Yuxiao Chen
Kenji Kawaguchi
Jiaheng Zhang
Bryan Hooi
AI4TS
LRM
131
14
0
30 Jan 2025
Diverse Preference Optimization
Jack Lanchantin
Angelica Chen
S. Dhuliawala
Ping Yu
Jason Weston
Sainbayar Sukhbaatar
Ilia Kulikov
93
4
0
30 Jan 2025
When LLM Meets DRL: Advancing Jailbreaking Efficiency via DRL-guided Search
Xuan Chen
Yuzhou Nie
Wenbo Guo
Xiangyu Zhang
112
10
0
28 Jan 2025
SEAL: Scaling to Emphasize Attention for Long-Context Retrieval
Changhun Lee
Jun-gyu Jin
Younghyun Cho
Eunhyeok Park
LRM
56
0
0
28 Jan 2025
360Brew: A Decoder-only Foundation Model for Personalized Ranking and Recommendation
Hamed Firooz
Maziar Sanjabi
Adrian Englhardt
Aman Gupta
Ben Levine
...
Xiaoling Zhai
Ya Xu
Yu Wang
Yun Dai
Yun Dai
ALM
47
3
0
27 Jan 2025
ZETA: Leveraging Z-order Curves for Efficient Top-k Attention
Qiuhao Zeng
Jerry Huang
Peng Lu
Gezheng Xu
Boxing Chen
Charles Ling
Boyu Wang
54
1
0
24 Jan 2025
LLMs as Repositories of Factual Knowledge: Limitations and Solutions
Seyed Mahed Mousavi
Simone Alghisi
Giuseppe Riccardi
KELM
49
0
0
22 Jan 2025
Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models
Zihan Qiu
Zeyu Huang
Jian Xu
Kaiyue Wen
Zhaoxiang Wang
Rui Men
Ivan Titov
Dayiheng Liu
Jingren Zhou
Junyang Lin
MoE
57
6
0
21 Jan 2025
Aligning Instruction Tuning with Pre-training
Yiming Liang
Tianyu Zheng
Xinrun Du
Ge Zhang
Jiaheng Liu
...
Zhaoxiang Zhang
Wenhao Huang
Jiajun Zhang
Xiang Yue
Jiajun Zhang
93
1
0
16 Jan 2025
iServe: An Intent-based Serving System for LLMs
Dimitrios Liakopoulos
Tianrui Hu
Prasoon Sinha
N. Yadwadkar
VLM
211
0
0
08 Jan 2025
CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings
Shanghaoran Quan
Jiaxi Yang
Bowen Yu
Jian Xu
Dayiheng Liu
...
Zeyu Cui
Yang Fan
Wenjie Qu
Binyuan Hui
Junyang Lin
ALM
ELM
LRM
74
16
0
02 Jan 2025
DAOP: Data-Aware Offloading and Predictive Pre-Calculation for Efficient MoE Inference
Yujie Zhang
Shivam Aggarwal
T. Mitra
MoE
79
0
0
16 Dec 2024
Unifying KV Cache Compression for Large Language Models with LeanKV
Yanqi Zhang
Yuwei Hu
Runyuan Zhao
John C. S. Lui
Haibo Chen
MQ
151
5
0
04 Dec 2024
Time-Reversal Provides Unsupervised Feedback to LLMs
Yerram Varun
Rahul Madhavan
Sravanti Addepalli
A. Suggala
Karthikeyan Shanmugam
Prateek Jain
LRM
SyDa
64
0
0
03 Dec 2024
Yi-Lightning Technical Report
01. AI
:
Alan Wake
Albert Wang
Bei Chen
...
Yuxuan Sha
Zhaodong Yan
Zhiyuan Liu
Zirui Zhang
Zonghong Dai
OSLM
102
3
0
02 Dec 2024
Tra-MoE: Learning Trajectory Prediction Model from Multiple Domains for Adaptive Policy Conditioning
Jiange Yang
Haoyi Zhu
Yanjie Wang
Gangshan Wu
Tong He
Limin Wang
103
2
0
21 Nov 2024
Collective Model Intelligence Requires Compatible Specialization
Jyothish Pari
Samy Jelassi
Pulkit Agrawal
MoMe
51
1
0
04 Nov 2024
What is Wrong with Perplexity for Long-context Language Modeling?
Lizhe Fang
Yifei Wang
Zhaoyang Liu
Chenheng Zhang
Stefanie Jegelka
Jinyang Gao
Bolin Ding
Yisen Wang
69
6
0
31 Oct 2024
AAAR-1.0: Assessing AI's Potential to Assist Research
Renze Lou
Hanzi Xu
Sijia Wang
Jiangshu Du
Ryo Kamoi
...
Xi Li
Kaipeng Zhang
Congying Xia
Lifu Huang
Wenpeng Yin
43
5
0
29 Oct 2024
R-LLaVA: Improving Med-VQA Understanding through Visual Region of Interest
Xupeng Chen
Zhixin Lai
Kangrui Ruan
Shichu Chen
Jiaxiang Liu
Zuozhu Liu
41
1
0
27 Oct 2024
Mixture of Parrots: Experts improve memorization more than reasoning
Samy Jelassi
Clara Mohri
David Brandfonbrener
Alex Gu
Nikhil Vyas
Nikhil Anand
David Alvarez-Melis
Yuanzhi Li
Sham Kakade
Eran Malach
MoE
36
4
0
24 Oct 2024
ViMoE: An Empirical Study of Designing Vision Mixture-of-Experts
Xumeng Han
Longhui Wei
Zhiyang Dou
Zipeng Wang
Chenhui Qiang
Xin He
Yingfei Sun
Zhenjun Han
Qi Tian
MoE
45
3
0
21 Oct 2024
MoH: Multi-Head Attention as Mixture-of-Head Attention
Peng Jin
Bo Zhu
Li Yuan
Shuicheng Yan
MoE
34
13
0
15 Oct 2024
Quadratic Gating Functions in Mixture of Experts: A Statistical Insight
Pedram Akbarian
Huy Le Nguyen
Xing Han
Nhat Ho
MoE
42
0
0
15 Oct 2024
One Language, Many Gaps: Evaluating Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks
Fangru Lin
Shaoguang Mao
Emanuele La Malfa
Valentin Hofmann
Adrian de Wynter
Jing Yao
Si-Qing Chen
Michael Wooldridge
Furu Wei
Furu Wei
51
2
0
14 Oct 2024
Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts
Guorui Zheng
Xidong Wang
Juhao Liang
Nuo Chen
Yuping Zheng
Benyou Wang
MoE
35
5
0
14 Oct 2024
Mixture of Experts Made Personalized: Federated Prompt Learning for Vision-Language Models
Jun Luo
Cheng Chen
Shandong Wu
FedML
VLM
MoE
52
3
0
14 Oct 2024
TULIP: Token-length Upgraded CLIP
Ivona Najdenkoska
Mohammad Mahdi Derakhshani
Yuki M. Asano
Nanne van Noord
Marcel Worring
Cees G. M. Snoek
VLM
48
3
0
13 Oct 2024
Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization
Noam Razin
Sadhika Malladi
Adithya Bhaskar
Danqi Chen
Sanjeev Arora
Boris Hanin
99
16
0
11 Oct 2024
COMPL-AI Framework: A Technical Interpretation and LLM Benchmarking Suite for the EU Artificial Intelligence Act
Philipp Guldimann
Alexander Spiridonov
Robin Staab
Nikola Jovanović
Mark Vero
...
Mislav Balunović
Nikola Konstantinov
Pavol Bielik
Petar Tsankov
Martin Vechev
ELM
53
4
0
10 Oct 2024
Subtle Errors Matter: Preference Learning via Error-injected Self-editing
Kaishuai Xu
Tiezheng YU
Wenjun Hou
Yi Cheng
Chak Tou Leong
Liangyou Li
Xin Jiang
Lifeng Shang
Qun Liu
Wenjie Li
LRM
181
0
0
09 Oct 2024
RL, but don't do anything I wouldn't do
Michael K. Cohen
Marcus Hutter
Yoshua Bengio
Stuart J. Russell
OffRL
35
2
0
08 Oct 2024
Mixture Compressor for Mixture-of-Experts LLMs Gains More
Wei Huang
Yue Liao
Jianhui Liu
Ruifei He
Haoru Tan
Shiming Zhang
Hongsheng Li
Si Liu
Xiaojuan Qi
MoE
39
3
0
08 Oct 2024
Previous
1
2
3
4
5
Next