2401.04088
Cited By
Mixtral of Experts
8 January 2024
Albert Q. Jiang
Alexandre Sablayrolles
Antoine Roux
A. Mensch
Blanche Savary
Chris Bamford
Devendra Singh Chaplot
Diego de Las Casas
Emma Bou Hanna
Florian Bressand
Gianna Lengyel
Guillaume Bour
Guillaume Lample
Lélio Renard Lavaud
Lucile Saulnier
Marie-Anne Lachaux
Pierre Stock
Sandeep Subramanian
Sophia Yang
Szymon Antoniak
Teven Le Scao
Théophile Gervet
Thibaut Lavril
Thomas Wang
Timothée Lacroix
William El Sayed
MoE
LLMAG
Papers citing "Mixtral of Experts"
50 / 208 papers shown
Title
Spotlight Your Instructions: Instruction-following with Dynamic Attention Steering
Praveen Venkateswaran
Danish Contractor
LLMSV
LRM
21
0
0
17 May 2025
MedGUIDE: Benchmarking Clinical Decision-Making in Large Language Models
Xiaomin Li
Mingye Gao
Yuexing Hao
Taoran Li
Guangya Wan
Zihan Wang
Yijun Wang
LM&MA
ELM
AI4MH
12
0
0
16 May 2025
MegaScale-MoE: Large-Scale Communication-Efficient Training of Mixture-of-Experts Models in Production
C. Jin
Ziheng Jiang
Zhihao Bai
Zheng Zhong
Jing Liu
...
Yanghua Peng
Xuanzhe Liu
Xin Jin
Xin Liu
MoE
7
0
0
16 May 2025
CARES: Comprehensive Evaluation of Safety and Adversarial Robustness in Medical LLMs
Sijia Chen
Xiaomin Li
Mengxue Zhang
Eric Hanchen Jiang
Qingcheng Zeng
Chen-Hsiang Yu
AAML
MU
ELM
27
0
0
16 May 2025
SAS-Bench: A Fine-Grained Benchmark for Evaluating Short Answer Scoring with Large Language Models
Peichao Lai
Kaipeng Zhang
Yi Lin
L. Zhang
Feiyang Ye
...
Zifei Shan
Zeang Sheng
Yansen Wang
Wentao Zhang
Bin Cui
ELM
LRM
47
0
0
12 May 2025
Assessing and Mitigating Medical Knowledge Drift and Conflicts in Large Language Models
Weiyi Wu
Xinwen Xu
Chongyang Gao
Xingjian Diao
Siting Li
Lucas A. Salas
Jiang Gui
26
0
0
12 May 2025
FalseReject: A Resource for Improving Contextual Safety and Mitigating Over-Refusals in LLMs via Structured Reasoning
Zhehao Zhang
Weijie Xu
Fanyou Wu
Chandan K. Reddy
29
0
0
12 May 2025
POISONCRAFT: Practical Poisoning of Retrieval-Augmented Generation for Large Language Models
Yangguang Shao
Xinjie Lin
Haozheng Luo
Chengshang Hou
G. Xiong
Jiahao Yu
Junzheng Shi
SILM
52
0
0
10 May 2025
QoS-Efficient Serving of Multiple Mixture-of-Expert LLMs Using Partial Runtime Reconfiguration
HamidReza Imani
Jiaxin Peng
Peiman Mohseni
Abdolah Amirany
Tarek A. El-Ghazawi
MoE
31
0
0
10 May 2025
The ML.ENERGY Benchmark: Toward Automated Inference Energy Measurement and Optimization
Jae-Won Chung
Jiachen Liu
Jeff J. Ma
Ruofan Wu
Oh Jun Kweon
Yuxuan Xia
Zhiyu Wu
Mosharaf Chowdhury
31
0
0
09 May 2025
MxMoE: Mixed-precision Quantization for MoE with Accuracy and Performance Co-Design
Haojie Duanmu
Xiuhong Li
Zhihang Yuan
Size Zheng
Jiangfei Duan
Xingcheng Zhang
Dahua Lin
MQ
MoE
203
0
0
09 May 2025
Camera Control at the Edge with Language Models for Scene Understanding
Alexiy Buynitsky
Sina Ehsani
Bhanu Pallakonda
Pragyana Mishra
VLM
40
0
0
09 May 2025
Performance Evaluation of Large Language Models in Bangla Consumer Health Query Summarization
Ajwad Abrar
Farzana Tabassum
Sabbir Ahmed
LM&MA
ELM
AI4MH
48
0
0
08 May 2025
GroverGPT-2: Simulating Grover's Algorithm via Chain-of-Thought Reasoning and Quantum-Native Tokenization
Min Chen
Jinglei Cheng
Pingzhi Li
Haoran Wang
Tianlong Chen
Junyu Liu
LRM
48
0
0
08 May 2025
FLAM: Frame-Wise Language-Audio Modeling
Yusong Wu
Christos Tsirigotis
Ke Chen
Cheng-Zhi Anna Huang
Rameswar Panda
Oriol Nieto
Prem Seetharaman
Justin Salamon
50
0
0
08 May 2025
Faster MoE LLM Inference for Extremely Large Models
Haoqi Yang
Luohe Shi
Qiwei Li
Zuchao Li
Ping Wang
Bo Du
Mengjia Shen
Hai Zhao
MoE
68
0
0
06 May 2025
Improving Model Alignment Through Collective Intelligence of Open-Source LLMS
Junlin Wang
Roy Xie
Shang Zhu
Jue Wang
Ben Athiwaratkun
Bhuwan Dhingra
Shuaiwen Leon Song
Ce Zhang
James Zou
ALM
38
0
0
05 May 2025
34 Examples of LLM Applications in Materials Science and Chemistry: Towards Automation, Assistants, Agents, and Accelerated Scientific Discovery
Yoel Zimmermann
Adib Bazgir
Alexander H Al-Feghali
Mehrad Ansari
L. C. Brinson
...
Shang Zhu
Jan Janssen
Calvin Li
Ian Foster
Ben Blaiszik
64
0
0
05 May 2025
AKD : Adversarial Knowledge Distillation For Large Language Models Alignment on Coding tasks
Ilyas Oulkadda
Julien Perez
ALM
47
0
0
05 May 2025
An overview of artificial intelligence in computer-assisted language learning
Anisia Katinskaia
35
0
0
04 May 2025
Backdoor Attacks Against Patch-based Mixture of Experts
Cedric Chan
Jona te Lintelo
S. Picek
AAML
MoE
187
0
0
03 May 2025
MoEQuant: Enhancing Quantization for Mixture-of-Experts Large Language Models via Expert-Balanced Sampling and Affinity Guidance
Xing Hu
Zhixuan Chen
Dawei Yang
Zukang Xu
Chen Xu
Zhihang Yuan
Sifan Zhou
Jiangyong Yu
MoE
MQ
44
0
0
02 May 2025
Mixture of Sparse Attention: Content-Based Learnable Sparse Attention via Expert-Choice Routing
Piotr Piekos
Róbert Csordás
Jürgen Schmidhuber
MoE
VLM
106
1
0
01 May 2025
Memorization and Knowledge Injection in Gated LLMs
Xu Pan
Ely Hahami
Zechen Zhang
H. Sompolinsky
KELM
CLL
RALM
106
1
0
30 Apr 2025
In-Context Edit: Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer
Zechuan Zhang
Ji Xie
Yu Lu
Zongxin Yang
Yuqing Yang
DiffM
97
1
0
29 Apr 2025
Accelerating Mixture-of-Experts Training with Adaptive Expert Replication
Athinagoras Skiadopoulos
Mark Zhao
Swapnil Gandhi
Thomas Norrie
Shrijeet Mukherjee
Christos Kozyrakis
MoE
91
0
0
28 Apr 2025
Mapping the Italian Telegram Ecosystem: Communities, Toxicity, and Hate Speech
Lorenzo Alvisi
S. Tardelli
Maurizio Tesconi
200
0
0
28 Apr 2025
Towards Robust Dialogue Breakdown Detection: Addressing Disruptors in Large Language Models with Self-Guided Reasoning
Abdellah Ghassel
Xianzhi Li
Xiaodan Zhu
51
0
0
26 Apr 2025
The Ultimate Cookbook for Invisible Poison: Crafting Subtle Clean-Label Text Backdoors with Style Attributes
Wencong You
Daniel Lowd
39
0
0
24 Apr 2025
Pre-DPO: Improving Data Utilization in Direct Preference Optimization Using a Guiding Reference Model
Junshu Pan
Wei Shen
Shulin Huang
Qiji Zhou
Yue Zhang
74
0
0
22 Apr 2025
LLM-as-a-Judge: Reassessing the Performance of LLMs in Extractive QA
Xanh Ho
Jiahao Huang
Florian Boudin
Akiko Aizawa
ELM
36
0
0
16 Apr 2025
2D-Curri-DPO: Two-Dimensional Curriculum Learning for Direct Preference Optimization
Mengyang Li
Zhong Zhang
27
0
0
10 Apr 2025
Generative Large Language Model usage in Smart Contract Vulnerability Detection
Peter Ince
Jiangshan Yu
Joseph K. Liu
Xiaoning Du
37
0
0
07 Apr 2025
Are you really listening? Boosting Perceptual Awareness in Music-QA Benchmarks
Yongyi Zang
Sean O'Brien
Taylor Berg-Kirkpatrick
Julian McAuley
Zachary Novack
AuLLM
94
1
0
01 Apr 2025
Zero-shot Benchmarking: A Framework for Flexible and Scalable Automatic Evaluation of Language Models
José P. Pombal
Nuno M. Guerreiro
Ricardo Rei
André F. T. Martins
ALM
75
0
0
01 Apr 2025
Assessing Code Understanding in LLMs
Cosimo Laneve
Alvise Spanò
Dalila Ressi
S. Rossi
M. Bugliesi
50
0
0
31 Mar 2025
Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models
Zhanke Zhou
Zhaocheng Zhu
Xuan Li
Mikhail Galkin
Xiao Feng
Sanmi Koyejo
Jian Tang
Bo Han
LRM
58
0
0
28 Mar 2025
Harmonia: A Multi-Agent Reinforcement Learning Approach to Data Placement and Migration in Hybrid Storage Systems
Rakesh Nadig
Vamanan Arulchelvan
Rahul Bera
Taha Shahroodi
Gagandeep Singh
Mohammad Sadrosadati
Jisung Park
Onur Mutlu
68
0
0
26 Mar 2025
CURIE: Evaluating LLMs On Multitask Scientific Long Context Understanding and Reasoning
Hao Cui
Zahra Shamsi
Gowoon Cheon
Xuejian Ma
Shutong Li
...
Eun-Ah Kim
M. Brenner
Viren Jain
Sameera Ponda
Subhashini Venugopalan
ELM
LRM
57
0
0
14 Mar 2025
Large Language Models for Outpatient Referral: Problem Definition, Benchmarking and Challenges
Xiaoxiao Liu
Qingying Xiao
Junying Chen
Xiangyi Feng
Xiangbo Wu
...
Xiang Wan
Jian Chang
Guangjun Yu
Yan Hu
Benyou Wang
LM&MA
LRM
206
0
0
11 Mar 2025
A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications
Siyuan Mu
Sen Lin
MoE
153
2
0
10 Mar 2025
From Idea to Implementation: Evaluating the Influence of Large Language Models in Software Development -- An Opinion Paper
Sargam Yadav
Asifa Mehmood Qureshi
Abhishek Kaushik
Shubham Sharma
Roisin Loughran
...
Nikhil Singh
Padraic O'Hara
Pranay Jaiswal
Roshan Chandru
David Lillis
56
1
0
10 Mar 2025
Every FLOP Counts: Scaling a 300B Mixture-of-Experts LING LLM without Premium GPUs
Ling Team
B. Zeng
Chenyu Huang
Chao Zhang
Changxin Tian
...
Zhaoxin Huan
Zujie Wen
Zhenhang Sun
Zhuoxuan Du
Z. He
MoE
ALM
111
2
0
07 Mar 2025
CoServe: Efficient Collaboration-of-Experts (CoE) Model Inference with Limited Memory
Jiashun Suo
Xiaojian Liao
Limin Xiao
Li Ruan
Jinquan Wang
Xiao Su
Zhisheng Huo
72
0
0
04 Mar 2025
A Pilot Empirical Study on When and How to Use Knowledge Graphs as Retrieval Augmented Generation
Xujie Yuan
Y. Liu
Shimin Di
Shiwen Wu
Libin Zheng
Rui Meng
Lei Chen
Xiaofang Zhou
Jian Yin
36
0
0
28 Feb 2025
Enhancing the Scalability and Applicability of Kohn-Sham Hamiltonians for Molecular Systems
Yunyang Li
Zaishuo Xia
Lin Huang
Xinran Wei
Han Yang
...
Zun Wang
Chang-Shu Liu
Jia Zhang
Bin Shao
Mark B. Gerstein
77
0
0
26 Feb 2025
Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization
Taishi Nakamura
Takuya Akiba
Kazuki Fujii
Yusuke Oda
Rio Yokota
Jun Suzuki
MoMe
MoE
94
1
0
26 Feb 2025
Filtered not Mixed: Stochastic Filtering-Based Online Gating for Mixture of Large Language Models
Raeid Saqur
Anastasis Kratsios
Florian Krach
Yannick Limmer
Jacob-Junqi Tian
John Willes
Blanka Horvath
Frank Rudzicz
MoE
53
0
0
24 Feb 2025
Unveiling Scoring Processes: Dissecting the Differences between LLMs and Human Graders in Automatic Scoring
Xuansheng Wu
Padmaja Pravin Saraf
Gyeong-Geon Lee
Ehsan Latif
Ninghao Liu
Xiaoming Zhai
60
4
0
24 Feb 2025
LightMamba: Efficient Mamba Acceleration on FPGA with Quantization and Hardware Co-design
Renjie Wei
Songqiang Xu
Linfeng Zhong
Zebin Yang
Qingyu Guo
Yidan Wang
Runsheng Wang
Meng Li
84
0
0
24 Feb 2025