ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2401.04088
  4. Cited By
Mixtral of Experts

Mixtral of Experts

8 January 2024
Albert Q. Jiang
Alexandre Sablayrolles
Antoine Roux
A. Mensch
Blanche Savary
Chris Bamford
Devendra Singh Chaplot
Diego de Las Casas
Emma Bou Hanna
Florian Bressand
Gianna Lengyel
Guillaume Bour
Guillaume Lample
Lélio Renard Lavaud
Lucile Saulnier
Marie-Anne Lachaux
Pierre Stock
Sandeep Subramanian
Sophia Yang
Szymon Antoniak
Teven Le Scao
Théophile Gervet
Thibaut Lavril
Thomas Wang
Timothée Lacroix
William El Sayed
    MoE
    LLMAG
ArXivPDFHTML

Papers citing "Mixtral of Experts"

50 / 208 papers shown
Title
Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective
Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective
Jinhao Li
Jiaming Xu
Shan Huang
Yonghua Chen
Wen Li
...
Jiayi Pan
Li Ding
Hao Zhou
Yu Wang
Guohao Dai
62
16
0
06 Oct 2024
FAMMA: A Benchmark for Financial Domain Multilingual Multimodal Question Answering
FAMMA: A Benchmark for Financial Domain Multilingual Multimodal Question Answering
Siqiao Xue
Tingting Chen
Fan Zhou
Qingyang Dai
Zhixuan Chu
Hongyuan Mei
41
4
0
06 Oct 2024
Audio-Agent: Leveraging LLMs For Audio Generation, Editing and Composition
Audio-Agent: Leveraging LLMs For Audio Generation, Editing and Composition
Zixuan Wang
Chi-Keung Tang
Chi-Keung Tang
DiffM
VGen
LLMAG
49
4
0
04 Oct 2024
Enriching Ontologies with Disjointness Axioms using Large Language
  Models
Enriching Ontologies with Disjointness Axioms using Large Language Models
Elias Crum
Antonio De Santis
Manon Ovide
Jiaxin Pan
Alessia Pisu
Nicolas Lazzari
Sebastian Rudolph
28
0
0
04 Oct 2024
Can Watermarked LLMs be Identified by Users via Crafted Prompts?
Can Watermarked LLMs be Identified by Users via Crafted Prompts?
Aiwei Liu
Sheng Guan
Yong-Jin Liu
L. Pan
Yifei Zhang
Liancheng Fang
Lijie Wen
Philip S. Yu
Xuming Hu
WaLM
158
2
0
04 Oct 2024
No Need to Talk: Asynchronous Mixture of Language Models
No Need to Talk: Asynchronous Mixture of Language Models
Anastasiia Filippova
Angelos Katharopoulos
David Grangier
Ronan Collobert
MoE
44
0
0
04 Oct 2024
GraphRouter: A Graph-based Router for LLM Selections
GraphRouter: A Graph-based Router for LLM Selections
Tao Feng
Yanzhen Shen
Jiaxuan You
97
10
0
04 Oct 2024
Output Scouting: Auditing Large Language Models for Catastrophic Responses
Output Scouting: Auditing Large Language Models for Catastrophic Responses
Andrew Bell
Joao Fonseca
KELM
53
1
0
04 Oct 2024
Reward-RAG: Enhancing RAG with Reward Driven Supervision
Reward-RAG: Enhancing RAG with Reward Driven Supervision
Thang Nguyen
Peter Chin
Yu-Wing Tai
RALM
42
4
0
03 Oct 2024
Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents
Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents
Hanrong Zhang
Jingyuan Huang
Kai Mei
Yifei Yao
Zhenting Wang
Chenlu Zhan
Hongwei Wang
Yongfeng Zhang
AAML
LLMAG
ELM
51
22
0
03 Oct 2024
Determine-Then-Ensemble: Necessity of Top-k Union for Large Language Model Ensembling
Determine-Then-Ensemble: Necessity of Top-k Union for Large Language Model Ensembling
Yuxuan Yao
Han Wu
Mingyang Liu
Sichun Luo
Xiongwei Han
Jie Liu
Zhijiang Guo
Linqi Song
58
4
0
03 Oct 2024
House of Cards: Massive Weights in LLMs
House of Cards: Massive Weights in LLMs
Jaehoon Oh
Seungjun Shin
Dokwan Oh
43
1
0
02 Oct 2024
ET-Plan-Bench: Embodied Task-level Planning Benchmark Towards Spatial-Temporal Cognition with Foundation Models
ET-Plan-Bench: Embodied Task-level Planning Benchmark Towards Spatial-Temporal Cognition with Foundation Models
Lingfeng Zhang
Yuening Wang
Hongjian Gu
Atia Hamidizadeh
Zhanguang Zhang
...
Tongtong Cao
Yuzheng Zhuang
Yingxue Zhang
Jianye Hao
Jianye Hao
LM&Ro
46
1
0
02 Oct 2024
U-shaped and Inverted-U Scaling behind Emergent Abilities of Large Language Models
U-shaped and Inverted-U Scaling behind Emergent Abilities of Large Language Models
Tung-Yu Wu
Pei-Yu Lo
ReLM
LRM
46
2
0
02 Oct 2024
Recent Advances in Speech Language Models: A Survey
Recent Advances in Speech Language Models: A Survey
Wenqian Cui
Dianzhi Yu
Xiaoqi Jiao
Ziqiao Meng
Guangyan Zhang
Qichao Wang
Yiwen Guo
Irwin King
AuLLM
61
14
0
01 Oct 2024
ForecastBench: A Dynamic Benchmark of AI Forecasting Capabilities
ForecastBench: A Dynamic Benchmark of AI Forecasting Capabilities
Ezra Karger
Houtan Bastani
Chen Yueh-Han
Zachary Jacobs
Danny Halawi
Fred Zhang
P. Tetlock
49
7
0
30 Sep 2024
An Adversarial Perspective on Machine Unlearning for AI Safety
An Adversarial Perspective on Machine Unlearning for AI Safety
Jakub Łucki
Boyi Wei
Yangsibo Huang
Peter Henderson
F. Tramèr
Javier Rando
MU
AAML
73
32
0
26 Sep 2024
CSCE: Boosting LLM Reasoning by Simultaneous Enhancing of Causal Significance and Consistency
CSCE: Boosting LLM Reasoning by Simultaneous Enhancing of Causal Significance and Consistency
Kangsheng Wang
Xiao Zhang
Zizheng Guo
Tianyu Hu
Huimin Ma
LRM
48
7
0
20 Sep 2024
Seek and Solve Reasoning for Table Question Answering
Seek and Solve Reasoning for Table Question Answering
Ruya Jiang
Chun Wang
Weihong Deng
LMTD
ReLM
LRM
43
2
0
09 Sep 2024
Combining LLMs and Knowledge Graphs to Reduce Hallucinations in Question
  Answering
Combining LLMs and Knowledge Graphs to Reduce Hallucinations in Question Answering
Larissa Pusch
Tim O. F. Conrad
44
0
0
06 Sep 2024
LLM-based multi-agent poetry generation in non-cooperative environments
LLM-based multi-agent poetry generation in non-cooperative environments
Ran Zhang
Steffen Eger
LLMAG
37
5
0
05 Sep 2024
GraphEx: A Graph-based Extraction Method for Advertiser Keyphrase Recommendation
GraphEx: A Graph-based Extraction Method for Advertiser Keyphrase Recommendation
Ashirbad Mishra
Soumik Dey
Marshall Wu
Jinyu Zhao
He Yu
Kaichen Ni
Binbin Li
Kamesh Madduri
57
1
0
05 Sep 2024
NESTFUL: A Benchmark for Evaluating LLMs on Nested Sequences of API Calls
NESTFUL: A Benchmark for Evaluating LLMs on Nested Sequences of API Calls
Kinjal Basu
Ibrahim Abdelaziz
Kelsey Bradford
M. Crouse
Kiran Kate
...
Yara Rizk
Xin Wang
Luis Lastras
Luis A. Lastras
Pavan Kapanipathi
33
7
0
04 Sep 2024
CLOCR-C: Context Leveraging OCR Correction with Pre-trained Language Models
CLOCR-C: Context Leveraging OCR Correction with Pre-trained Language Models
Jonathan Bourne
54
4
0
30 Aug 2024
Cost-Effective Hallucination Detection for LLMs
Cost-Effective Hallucination Detection for LLMs
Simon Valentin
Jinmiao Fu
Gianluca Detommaso
Shaoyuan Xu
Giovanni Zappella
Bryan Wang
HILM
42
4
0
31 Jul 2024
Model Agnostic Hybrid Sharding For Heterogeneous Distributed Inference
Model Agnostic Hybrid Sharding For Heterogeneous Distributed Inference
Claudio Angione
Yue Zhao
Harry Yang
Ahmad Farhan
Fielding Johnston
James Buban
Patrick Colangelo
42
1
0
29 Jul 2024
Mixture of Modular Experts: Distilling Knowledge from a Multilingual
  Teacher into Specialized Modular Language Models
Mixture of Modular Experts: Distilling Knowledge from a Multilingual Teacher into Specialized Modular Language Models
Mohammed Al-Maamari
Mehdi Ben Amor
Michael Granitzer
KELM
MoE
38
0
0
28 Jul 2024
Effective Large Language Model Debugging with Best-first Tree Search
Effective Large Language Model Debugging with Best-first Tree Search
Jialin Song
Jonathan Raiman
Bryan Catanzaro
LRM
51
0
0
26 Jul 2024
PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined
  Speculation
PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation
Branden Butler
Sixing Yu
Arya Mazaheri
Ali Jannesari
LRM
46
6
0
16 Jul 2024
Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting
Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting
Zilong Wang
Zifeng Wang
Long Le
Huaixiu Steven Zheng
Swaroop Mishra
...
Anush Mattapalli
Ankur Taly
Jingbo Shang
Chen-Yu Lee
Tomas Pfister
RALM
85
33
0
11 Jul 2024
YourMT3+: Multi-instrument Music Transcription with Enhanced Transformer
  Architectures and Cross-dataset Stem Augmentation
YourMT3+: Multi-instrument Music Transcription with Enhanced Transformer Architectures and Cross-dataset Stem Augmentation
Sungkyun Chang
Emmanouil Benetos
Holger Kirchhoff
Simon Dixon
37
3
0
05 Jul 2024
ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language
  Models
ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models
Yuzhe Gu
Ziwei Ji
Wenwei Zhang
Chengqi Lyu
Dahua Lin
Kai Chen
HILM
42
5
0
05 Jul 2024
Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model
Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model
Longrong Yang
Dong Shen
Chaoxiang Cai
Fan Yang
Size Li
Di Zhang
Xi Li
MoE
56
2
0
28 Jun 2024
RouteLLM: Learning to Route LLMs with Preference Data
RouteLLM: Learning to Route LLMs with Preference Data
Isaac Ong
Amjad Almahairi
Vincent Wu
Wei-Lin Chiang
Tianhao Wu
Joseph E. Gonzalez
M. W. Kadous
Ion Stoica
81
73
0
26 Jun 2024
OPT-Tree: Speculative Decoding with Adaptive Draft Tree Structure
OPT-Tree: Speculative Decoding with Adaptive Draft Tree Structure
Jikai Wang
Yi Su
Juntao Li
Qingrong Xia
Zi Ye
Xinyu Duan
Zhefeng Wang
Min Zhang
46
12
0
25 Jun 2024
WARP: On the Benefits of Weight Averaged Rewarded Policies
WARP: On the Benefits of Weight Averaged Rewarded Policies
Alexandre Ramé
Johan Ferret
Nino Vieillard
Robert Dadashi
Léonard Hussenot
Pierre-Louis Cedoz
Pier Giuseppe Sessa
Sertan Girgin
Arthur Douillard
Olivier Bachem
62
14
0
24 Jun 2024
M2Lingual: Enhancing Multilingual, Multi-Turn Instruction Alignment in Large Language Models
M2Lingual: Enhancing Multilingual, Multi-Turn Instruction Alignment in Large Language Models
Rishabh Maheshwary
Vikas Yadav
Hoang Nguyen
Khyati Mahajan
Sathwik Tejaswi Madhusudhan
44
3
0
24 Jun 2024
Pruning via Merging: Compressing LLMs via Manifold Alignment Based Layer Merging
Pruning via Merging: Compressing LLMs via Manifold Alignment Based Layer Merging
Deyuan Liu
Zhanyue Qin
Han Wang
Zhao Yang
Zecheng Wang
...
Zhao Lv
Zhiying Tu
Dianhui Chu
Bo Li
Dianbo Sui
29
2
0
24 Jun 2024
SEAM: A Stochastic Benchmark for Multi-Document Tasks
SEAM: A Stochastic Benchmark for Multi-Document Tasks
Gili Lior
Avi Caciularu
Arie Cattan
Shahar Levy
Ori Shapira
Gabriel Stanovsky
RALM
40
4
0
23 Jun 2024
Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models
Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models
Lynn Chua
Badih Ghazi
Yangsibo Huang
Pritish Kamath
Ravi Kumar
Pasin Manurangsi
Amer Sinha
Chulin Xie
Chiyuan Zhang
66
1
0
23 Jun 2024
Qiskit HumanEval: An Evaluation Benchmark For Quantum Code Generative
  Models
Qiskit HumanEval: An Evaluation Benchmark For Quantum Code Generative Models
Sanjay Vishwakarma
Francis Harkins
Siddharth Golecha
Vishal Sharathchandra Bajpe
Nicolas Dupuis
Luca Buratti
David Kremer
Ismael Faro
Ruchir Puri
Juan Cruz-Benito
ELM
50
3
0
20 Jun 2024
ExPLoRA: Parameter-Efficient Extended Pre-Training to Adapt Vision Transformers under Domain Shifts
ExPLoRA: Parameter-Efficient Extended Pre-Training to Adapt Vision Transformers under Domain Shifts
Samar Khanna
Medhanie Irgau
David B. Lobell
Stefano Ermon
VLM
32
4
0
16 Jun 2024
VideoGUI: A Benchmark for GUI Automation from Instructional Videos
VideoGUI: A Benchmark for GUI Automation from Instructional Videos
Kevin Qinghong Lin
Linjie Li
Difei Gao
Qinchen Wu
Mingyi Yan
Zhengyuan Yang
Lijuan Wang
Mike Zheng Shou
46
10
0
14 Jun 2024
BABILong: Testing the Limits of LLMs with Long Context
  Reasoning-in-a-Haystack
BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack
Yuri Kuratov
Aydar Bulatov
Petr Anokhin
Ivan Rodkin
Dmitry Sorokin
Artyom Sorokin
Andrey Kravchenko
RALM
ALM
LRM
ReLM
ELM
51
61
0
14 Jun 2024
A Survey on Large Language Models from General Purpose to Medical
  Applications: Datasets, Methodologies, and Evaluations
A Survey on Large Language Models from General Purpose to Medical Applications: Datasets, Methodologies, and Evaluations
Jinqiang Wang
Huansheng Ning
Yi Peng
Qikai Wei
Daniel Tesfai
Wenwei Mao
Tao Zhu
Runhe Huang
LM&MA
AI4MH
ELM
44
5
0
14 Jun 2024
CS-Bench: A Comprehensive Benchmark for Large Language Models towards Computer Science Mastery
CS-Bench: A Comprehensive Benchmark for Large Language Models towards Computer Science Mastery
Xiaoshuai Song
Muxi Diao
Guanting Dong
Zhengyang Wang
Yujia Fu
...
Yejie Wang
Zhuoma Gongque
Jianing Yu
Qiuna Tan
Weiran Xu
ELM
55
11
0
12 Jun 2024
Merging Improves Self-Critique Against Jailbreak Attacks
Merging Improves Self-Critique Against Jailbreak Attacks
Victor Gallego
AAML
MoMe
44
3
0
11 Jun 2024
MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation
MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation
Lu Li
Tianze Zhang
Zhiqi Bu
Suyuchen Wang
Huan He
Jie Fu
Yonghui Wu
Jiang Bian
Yong Chen
Yoshua Bengio
FedML
MoMe
100
3
0
11 Jun 2024
NarrativeBridge: Enhancing Video Captioning with Causal-Temporal Narrative
NarrativeBridge: Enhancing Video Captioning with Causal-Temporal Narrative
Asmar Nadeem
Faegheh Sardari
R. Dawes
Syed Sameed Husain
Adrian Hilton
Armin Mustafa
57
4
0
10 Jun 2024
The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models
The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models
Seungone Kim
Juyoung Suk
Ji Yong Cho
Shayne Longpre
Chaeeun Kim
...
Sean Welleck
Graham Neubig
Moontae Lee
Kyungjae Lee
Minjoon Seo
ELM
ALM
LM&MA
105
31
0
09 Jun 2024
Previous
12345
Next