2401.04088
Mixtral of Experts
8 January 2024
Albert Q. Jiang
Alexandre Sablayrolles
Antoine Roux
A. Mensch
Blanche Savary
Chris Bamford
Devendra Singh Chaplot
Diego de Las Casas
Emma Bou Hanna
Florian Bressand
Gianna Lengyel
Guillaume Bour
Guillaume Lample
Lélio Renard Lavaud
Lucile Saulnier
Marie-Anne Lachaux
Pierre Stock
Sandeep Subramanian
Sophia Yang
Szymon Antoniak
Teven Le Scao
Théophile Gervet
Thibaut Lavril
Thomas Wang
Timothée Lacroix
William El Sayed
MoE
LLMAG
Papers citing "Mixtral of Experts"
50 / 208 papers shown
Title
Mixture-of-Agents Enhances Large Language Model Capabilities
Junlin Wang
Jue Wang
Ben Athiwaratkun
Ce Zhang
James Zou
LLMAG
AIFin
41
101
0
07 Jun 2024
LLM-based speaker diarization correction: A generalizable approach
Georgios Efstathiadis
Vijay Yadav
Anzar Abbas
45
3
0
07 Jun 2024
CORU: Comprehensive Post-OCR Parsing and Receipt Understanding Dataset
Abdelrahman Abdallah
Mahmoud Abdalla
M. Kasem
Mohamed Mahmoud
Ibrahim Abdelhalim
Mohamed Elkasaby
Yasser Elbendary
Adam Jatowt
31
0
0
06 Jun 2024
Seq1F1B: Efficient Sequence-Level Pipeline Parallelism for Large Language Model Training
Ao Sun
Weilin Zhao
Xu Han
Cheng Yang
Zhiyuan Liu
Chuan Shi
Maosong Sun
31
7
0
05 Jun 2024
Brainstorming Brings Power to Large Language Models of Knowledge Reasoning
Zining Qin
Chenhao Wang
Huiling Qin
Weijia Jia
LRM
45
1
0
02 Jun 2024
OR-Bench: An Over-Refusal Benchmark for Large Language Models
Justin Cui
Wei-Lin Chiang
Ion Stoica
Cho-Jui Hsieh
ALM
38
35
0
31 May 2024
Is In-Context Learning Sufficient for Instruction Following in LLMs?
Hao Zhao
Maksym Andriushchenko
Francesco Croce
Nicolas Flammarion
67
11
0
30 May 2024
Towards Hierarchical Multi-Agent Workflows for Zero-Shot Prompt Optimization
Yuchi Liu
Jaskirat Singh
Gaowen Liu
Ali Payani
Liang Zheng
LLMAG
82
4
0
30 May 2024
Aligning to Thousands of Preferences via System Message Generalization
Seongyun Lee
Sue Hyun Park
Seungone Kim
Minjoon Seo
ALM
44
38
0
28 May 2024
Assessing LLMs Suitability for Knowledge Graph Completion
Vasile Ionut Iga
G. Silaghi
49
2
0
27 May 2024
Learning Beyond Pattern Matching? Assaying Mathematical Understanding in LLMs
Siyuan Guo
Aniket Didolkar
Nan Rosemary Ke
Anirudh Goyal
Ferenc Huszár
Bernhard Schölkopf
52
4
0
24 May 2024
The Mosaic Memory of Large Language Models
Igor Shilov
Matthieu Meeus
Yves-Alexandre de Montjoye
47
3
0
24 May 2024
Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models
Yongxin Guo
Zhenglin Cheng
Xiaoying Tang
Tao R. Lin
Tao Lin
MoE
66
7
0
23 May 2024
LookHere: Vision Transformers with Directed Attention Generalize and Extrapolate
A. Fuller
Daniel G. Kyrollos
Yousef Yassin
James R. Green
52
2
0
22 May 2024
DirectMultiStep: Direct Route Generation for Multistep Retrosynthesis
Yu Shee
Haote Li
Anton Morgunov
Victor S. Batista
54
1
0
22 May 2024
SPO: Multi-Dimensional Preference Sequential Alignment With Implicit Reward Modeling
Xingzhou Lou
Junge Zhang
Jian Xie
Lifeng Liu
Dong Yan
Kaiqi Huang
45
11
0
21 May 2024
Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts
Yunxin Li
Shenyuan Jiang
Baotian Hu
Longyue Wang
Wanqi Zhong
Wenhan Luo
Lin Ma
Min-Ling Zhang
MoE
46
30
0
18 May 2024
Chameleon: Mixed-Modal Early-Fusion Foundation Models
Chameleon Team
MLLM
62
261
0
16 May 2024
Challenges in Deploying Long-Context Transformers: A Theoretical Peak Performance Analysis
Yao Fu
35
19
0
14 May 2024
DEPTH: Discourse Education through Pre-Training Hierarchically
Zachary Bamberger
Ofek Glick
Chaim Baskin
Yonatan Belinkov
67
0
0
13 May 2024
SuFIA: Language-Guided Augmented Dexterity for Robotic Surgical Assistants
Masoud Moghani
Lars Doorenbos
Will Panitch
Sean Huver
Mahdi Azizian
Ken Goldberg
Animesh Garg
43
9
0
08 May 2024
QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving
Yujun Lin
Haotian Tang
Shang Yang
Zhekai Zhang
Guangxuan Xiao
Chuang Gan
Song Han
90
76
0
07 May 2024
When LLMs Meet Cybersecurity: A Systematic Literature Review
Jie Zhang
Haoyu Bu
Hui Wen
Yu Chen
Lun Li
Hongsong Zhu
45
36
0
06 May 2024
Optimising Calls to Large Language Models with Uncertainty-Based Two-Tier Selection
Guillem Ramírez
Alexandra Birch
Ivan Titov
40
8
0
03 May 2024
Assessing LLMs in Malicious Code Deobfuscation of Real-world Malware Campaigns
Constantinos Patsakis
Fran Casino
Nikolaos Lykousas
47
13
0
30 Apr 2024
When to Retrieve: Teaching LLMs to Utilize Information Retrieval Effectively
Tiziano Labruna
Jon Ander Campos
Gorka Azkune
26
10
0
30 Apr 2024
LLM-SR: Scientific Equation Discovery via Programming with Large Language Models
Parshin Shojaee
Kazem Meidani
Shashank Gupta
A. Farimani
Chandan K. Reddy
42
15
0
29 Apr 2024
Automating Customer Needs Analysis: A Comparative Study of Large Language Models in the Travel Industry
Simone Barandoni
F. Chiarello
Lorenzo Cascone
Emiliano Marrale
Salvatore Puccio
51
5
0
27 Apr 2024
Towards Adapting Open-Source Large Language Models for Expert-Level Clinical Note Generation
Hanyin Wang
Chufan Gao
Bolun Liu
Qiping Xu
Guleid Hussein
Mohamad El Labban
Kingsley Iheasirim
H. Korsapati
Chuck Outcalt
Jiashuo Sun
LM&MA
AI4MH
40
2
0
25 Apr 2024
Multi-Head Mixture-of-Experts
Xun Wu
Shaohan Huang
Wenhui Wang
Furu Wei
MoE
39
12
0
23 Apr 2024
From Matching to Generation: A Survey on Generative Information Retrieval
Xiaoxi Li
Jiajie Jin
Yujia Zhou
Yuyao Zhang
Peitian Zhang
Yutao Zhu
Zhicheng Dou
3DV
84
46
0
23 Apr 2024
CantTalkAboutThis: Aligning Language Models to Stay on Topic in Dialogues
Makesh Narsimhan Sreedhar
Traian Rebedea
Shaona Ghosh
Jiaqi Zeng
Christopher Parisien
ALM
35
4
0
04 Apr 2024
PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models
Fanxu Meng
Zhaohui Wang
Muhan Zhang
VLM
64
73
0
03 Apr 2024
Accurate Block Quantization in LLMs with Outliers
Nikita Trukhanov
I. Soloveychik
MQ
28
4
0
29 Mar 2024
FoC: Figure out the Cryptographic Functions in Stripped Binaries with LLMs
Guoqiang Chen
Xiuwei Shang
Shaoyin Cheng
Yanming Zhang
Weiming Zhang
Neng H. Yu
N. Yu
94
2
0
27 Mar 2024
How Far Are We on the Decision-Making of LLMs? Evaluating LLMs' Gaming Ability in Multi-Agent Environments
Jen-tse Huang
E. Li
Man Ho Lam
Tian Liang
Wenxuan Wang
Youliang Yuan
Wenxiang Jiao
Xing Wang
Zhaopeng Tu
Michael R. Lyu
ELM
LLMAG
88
33
0
18 Mar 2024
DAM: Dynamic Adapter Merging for Continual Video QA Learning
Feng Cheng
Ziyang Wang
Yi-Lin Sung
Yan-Bo Lin
Mohit Bansal
Gedas Bertasius
CLL
MoMe
39
10
0
13 Mar 2024
Legally Binding but Unfair? Towards Assessing Fairness of Privacy Policies
Vincent Freiberger
Erik Buchmann
AILaw
38
5
0
12 Mar 2024
Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding
Zhenyu (Allen) Zhang
Runjin Chen
Shiwei Liu
Zhewei Yao
Olatunji Ruwase
Beidi Chen
Xiaoxia Wu
Zhangyang Wang
34
26
0
05 Mar 2024
Alpaca against Vicuna: Using LLMs to Uncover Memorization of LLMs
Aly M. Kassem
Omar Mahmoud
Niloofar Mireshghallah
Hyunwoo J. Kim
Yulia Tsvetkov
Yejin Choi
Sherif Saad
Santu Rana
50
19
0
05 Mar 2024
Language Models Represent Beliefs of Self and Others
Wentao Zhu
Zhining Zhang
Yizhou Wang
MILM
LRM
50
7
0
28 Feb 2024
From Large Language Models and Optimization to Decision Optimization CoPilot: A Research Manifesto
Segev Wasserkrug
Léonard Boussioux
D. Hertog
F. Mirzazadeh
Ilker Birbil
Jannis Kurtz
Donato Maragno
LLMAG
46
3
0
26 Feb 2024
GPTVQ: The Blessing of Dimensionality for LLM Quantization
M. V. Baalen
Andrey Kuzmin
Markus Nagel
Peter Couperus
Cédric Bastoul
E. Mahurin
Tijmen Blankevoort
Paul N. Whatmough
MQ
36
28
0
23 Feb 2024
A Critical Evaluation of AI Feedback for Aligning Large Language Models
Archit Sharma
Sedrick Scott Keh
Eric Mitchell
Chelsea Finn
Kushal Arora
Thomas Kollar
ALM
LLMAG
29
23
0
19 Feb 2024
Towards Unified Alignment Between Agents, Humans, and Environment
Zonghan Yang
An Liu
Zijun Liu
Kai Liu
Fangzhou Xiong
...
Zhenhe Zhang
Fuwen Luo
Zhicheng Guo
Peng Li
Yang Liu
32
4
0
12 Feb 2024
Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models
Keisuke Kamahori
Tian Tang
Yile Gu
Kan Zhu
Baris Kasikci
71
20
0
10 Feb 2024
Generalized Preference Optimization: A Unified Approach to Offline Alignment
Yunhao Tang
Z. Guo
Zeyu Zheng
Daniele Calandriello
Rémi Munos
Mark Rowland
Pierre Harvey Richemond
Michal Valko
Bernardo Avila-Pires
Bilal Piot
32
92
0
08 Feb 2024
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
Chris Liu
Renrui Zhang
Longtian Qiu
Siyuan Huang
Weifeng Lin
...
Hao Shao
Pan Lu
Hongsheng Li
Yu Qiao
Peng Gao
MLLM
130
109
0
08 Feb 2024
Approximation Rates and VC-Dimension Bounds for (P)ReLU MLP Mixture of Experts
Anastasis Kratsios
Haitz Sáez de Ocáriz Borde
Takashi Furuya
Marc T. Law
MoE
41
1
0
05 Feb 2024
Multi-step Problem Solving Through a Verifier: An Empirical Analysis on Model-induced Process Supervision
Zihan Wang
Yunxuan Li
Yuexin Wu
Liangchen Luo
Le Hou
Hongkun Yu
Jingbo Shang
LRM
42
20
0
05 Feb 2024