ResearchTrend.AI
Zephyr: Direct Distillation of LM Alignment

25 October 2023
Lewis Tunstall
E. Beeching
Nathan Lambert
Nazneen Rajani
Kashif Rasul
Younes Belkada
Shengyi Huang
Leandro von Werra
Clémentine Fourrier
Nathan Habib
Nathan Sarrazin
Omar Sanseviero
Alexander M. Rush
Thomas Wolf
    ALM

Papers citing "Zephyr: Direct Distillation of LM Alignment"

50 / 260 papers shown
Token-level Direct Preference Optimization
Yongcheng Zeng
Guoqing Liu
Weiyu Ma
Ning Yang
Haifeng Zhang
Jun Wang
24
42
0
18 Apr 2024
FIZZ: Factual Inconsistency Detection by Zoom-in Summary and Zoom-out Document
Joonho Yang
Seunghyun Yoon
Byeongjeong Kim
Hwanhee Lee
HILM
31
3
0
17 Apr 2024
Self-Explore to Avoid the Pit: Improving the Reasoning Capabilities of Language Models with Fine-grained Rewards
Hyeonbin Hwang
Doyoung Kim
Seungone Kim
Seonghyeon Ye
Minjoon Seo
LRM
ReLM
40
7
0
16 Apr 2024
Learn Your Reference Model for Real Good Alignment
Alexey Gorbatovski
Boris Shaposhnikov
Alexey Malakhov
Nikita Surnachev
Yaroslav Aksenov
Ian Maksimov
Nikita Balagansky
Daniil Gavrilov
OffRL
54
26
0
15 Apr 2024
Discourse-Aware In-Context Learning for Temporal Expression Normalization
Akash Kumar Gautam
Lukas Lange
Jannik Strötgen
34
0
0
11 Apr 2024
NoticIA: A Clickbait Article Summarization Dataset in Spanish
Iker García-Ferrero
Begoña Altuna
39
2
0
11 Apr 2024
The Hallucinations Leaderboard -- An Open Effort to Measure Hallucinations in Large Language Models
Giwon Hong
Aryo Pradipta Gema
Rohit Saxena
Xiaotang Du
Ping Nie
...
Laura Perez-Beltrachini
Max Ryabinin
Xuanli He
Clémentine Fourrier
Pasquale Minervini
LRM
HILM
38
11
0
08 Apr 2024
SambaLingo: Teaching Large Language Models New Languages
Zoltan Csaki
Bo Li
Jonathan Li
Qiantong Xu
Pian Pawakapan
Leon Zhang
Yun Du
Hengyu Zhao
Changran Hu
Urmish Thakker
37
6
0
08 Apr 2024
ALERT: A Comprehensive Benchmark for Assessing Large Language Models' Safety through Red Teaming
Simone Tedeschi
Felix Friedrich
P. Schramowski
Kristian Kersting
Roberto Navigli
Huu Nguyen
Bo Li
ELM
41
45
0
06 Apr 2024
Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model
Xinrun Du
Zhouliang Yu
Songyang Gao
Ding Pan
Yuyang Cheng
...
Tianyu Zheng
Xinchen Luo
Guorui Zhou
Wenhu Chen
Ge Zhang
48
17
0
05 Apr 2024
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences
Corby Rosset
Ching-An Cheng
Arindam Mitra
Michael Santacroce
Ahmed Hassan Awadallah
Tengyang Xie
152
114
0
04 Apr 2024
Conifer: Improving Complex Constrained Instruction-Following Ability of Large Language Models
Haoran Sun
Lixin Liu
Junjie Li
Fengyu Wang
Baohua Dong
Ran Lin
Ruohui Huang
27
14
0
03 Apr 2024
CMAT: A Multi-Agent Collaboration Tuning Framework for Enhancing Small Language Models
Xuechen Liang
Meiling Tao
Yinghui Xia
Yiting Xie
Jun Wang
JingSong Yang
LLMAG
33
12
0
02 Apr 2024
Regularized Best-of-N Sampling with Minimum Bayes Risk Objective for Language Model Alignment
Yuu Jinnai
Tetsuro Morimura
Kaito Ariu
Kenshi Abe
69
3
0
01 Apr 2024
Query Performance Prediction using Relevance Judgments Generated by Large Language Models
Chuan Meng
Negar Arabzadeh
Arian Askari
Mohammad Aliannejadi
Maarten de Rijke
LRM
37
11
0
01 Apr 2024
"Sorry, Come Again?" Prompting -- Enhancing Comprehension and
  Diminishing Hallucination with [PAUSE]-injected Optimal Paraphrasing
Vipula Rawte
Islam Tonmoy
M. M. Zaman
Prachi Priya
Marcin Kardas
Alan Schelten
Ruan Silva
LRM
28
1
0
27 Mar 2024
FlexEdit: Flexible and Controllable Diffusion-based Object-centric Image Editing
Trong-Tung Nguyen
Duc A. Nguyen
Anh Tran
Cuong Pham
DiffM
38
7
0
27 Mar 2024
Semantic Ranking for Automated Adversarial Technique Annotation in Security Text
Udesh Kumarasinghe
Ahmed Lekssays
H. Sencar
Sabri Boughorbel
Charitha Elvitigala
Preslav Nakov
24
6
0
25 Mar 2024
RewardBench: Evaluating Reward Models for Language Modeling
Nathan Lambert
Valentina Pyatkin
Jacob Morrison
Lester James Validad Miranda
Bill Yuchen Lin
...
Sachin Kumar
Tom Zick
Yejin Choi
Noah A. Smith
Hanna Hajishirzi
ALM
85
214
0
20 Mar 2024
Enhancing Trust in Autonomous Agents: An Architecture for Accountability and Explainability through Blockchain and Large Language Models
Laura Fernández-Becerra
Miguel Ángel González Santamarta
Ángel Manuel Guerrero Higueras
Francisco J. Rodríguez-Lera
Vicente Matellán Olivera
36
0
0
14 Mar 2024
CodeUltraFeedback: An LLM-as-a-Judge Dataset for Aligning Large Language Models to Coding Preferences
Martin Weyssow
Aton Kamanda
H. Sahraoui
ALM
64
32
0
14 Mar 2024
ORPO: Monolithic Preference Optimization without Reference Model
Jiwoo Hong
Noah Lee
James Thorne
OSLM
42
209
0
12 Mar 2024
Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?
Egor Zverev
Sahar Abdelnabi
Soroush Tabesh
Mario Fritz
Christoph H. Lampert
56
19
0
11 Mar 2024
Negating Negatives: Alignment without Human Positive Samples via Distributional Dispreference Optimization
Shitong Duan
Xiaoyuan Yi
Peng Zhang
T. Lu
Xing Xie
Ning Gu
40
4
0
06 Mar 2024
PHAnToM: Personality Has An Effect on Theory-of-Mind Reasoning in Large Language Models
Fiona Anting Tan
G. Yeo
Fanyou Wu
Weijie Xu
Vinija Jain
Aman Chadha
Kokil Jaidka
Yang Liu
See-Kiong Ng
LRM
33
6
0
04 Mar 2024
To Generate or to Retrieve? On the Effectiveness of Artificial Contexts for Medical Open-Domain Question Answering
Giacomo Frisoni
Alessio Cocchieri
Alex Presepi
Gianluca Moro
Zaiqiao Meng
RALM
MedIm
52
15
0
04 Mar 2024
Fine Tuning vs. Retrieval Augmented Generation for Less Popular Knowledge
Heydar Soudani
Evangelos Kanoulas
Faegheh Hasibi
34
28
0
03 Mar 2024
LM4OPT: Unveiling the Potential of Large Language Models in Formulating Mathematical Optimization Problems
Tasnim Ahmed
Salimur Choudhury
25
11
0
02 Mar 2024
LAB: Large-Scale Alignment for ChatBots
Shivchander Sudalairaj
Abhishek Bhandwaldar
Aldo Pareja
Kai Xu
David D. Cox
Akash Srivastava
OSLM
41
28
0
02 Mar 2024
Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards
Haoxiang Wang
Yong Lin
Wei Xiong
Rui Yang
Shizhe Diao
Shuang Qiu
Han Zhao
Tong Zhang
40
71
0
28 Feb 2024
Do Large Language Models Mirror Cognitive Language Processing?
Yuqi Ren
Renren Jin
Tongxuan Zhang
Deyi Xiong
50
4
0
28 Feb 2024
Tower: An Open Multilingual Large Language Model for Translation-Related Tasks
Duarte M. Alves
José P. Pombal
Nuno M. Guerreiro
Pedro H. Martins
Joao Alves
...
Patrick Fernandes
Sweta Agrawal
Pierre Colombo
José G. C. de Souza
André F.T. Martins
LRM
57
129
0
27 Feb 2024
Linguistic Intelligence in Large Language Models for Telecommunications
Tasnim Ahmed
Nicola Piovesan
Antonio De Domenico
Salimur Choudhury
39
9
0
24 Feb 2024
Leveraging Domain Knowledge for Efficient Reward Modelling in RLHF: A Case-Study in E-Commerce Opinion Summarization
Swaroop Nath
Tejpalsingh Siledar
Sankara Sri Raghava Ravindra Muddu
Rupasai Rangaraju
H. Khadilkar
...
Suman Banerjee
Amey Patil
Sudhanshu Singh
M. Chelliah
Nikesh Garera
43
0
0
23 Feb 2024
Break the Breakout: Reinventing LM Defense Against Jailbreak Attacks with Self-Refinement
Heegyu Kim
Sehyun Yuk
Hyunsouk Cho
AAML
38
16
0
23 Feb 2024
Word-Sequence Entropy: Towards Uncertainty Estimation in Free-Form Medical Question Answering Applications and Beyond
Zhiyuan Wang
Jinhao Duan
Chenxi Yuan
Qingyu Chen
Tianlong Chen
Huaxiu Yao
Yue Zhang
Ren Wang
Kaidi Xu
Xiaoshuang Shi
UQLM
30
9
0
22 Feb 2024
Coercing LLMs to do and reveal (almost) anything
Jonas Geiping
Alex Stein
Manli Shu
Khalid Saifullah
Yuxin Wen
Tom Goldstein
AAML
48
43
0
21 Feb 2024
TreeEval: Benchmark-Free Evaluation of Large Language Models through Tree Planning
Xiang Li
Yunshi Lan
Chao Yang
ELM
46
8
0
20 Feb 2024
A Survey on Knowledge Distillation of Large Language Models
Xiaohan Xu
Ming Li
Chongyang Tao
Tao Shen
Reynold Cheng
Jinyang Li
Can Xu
Dacheng Tao
Dinesh Manocha
KELM
VLM
44
101
0
20 Feb 2024
AnaloBench: Benchmarking the Identification of Abstract and Long-context Analogies
Xiao Ye
Andrew Wang
Jacob Choi
Yining Lu
Shreya Sharma
Lingfeng Shen
Vijay Tiyyala
Nicholas Andrews
Daniel Khashabi
ELM
39
8
0
19 Feb 2024
A Critical Evaluation of AI Feedback for Aligning Large Language Models
Archit Sharma
Sedrick Scott Keh
Eric Mitchell
Chelsea Finn
Kushal Arora
Thomas Kollar
ALM
LLMAG
21
23
0
19 Feb 2024
Stick to Your Role! Context-dependence and Stability of Personal Value Expression in Large Language Models
Grgur Kovač
Rémy Portelas
Masataka Sawayama
Peter Ford Dominey
Pierre-Yves Oudeyer
LLMAG
29
1
0
19 Feb 2024
Self-AMPLIFY: Improving Small Language Models with Self Post Hoc Explanations
Milan Bhan
Jean-Noël Vittaut
N. Chesneau
Marie-Jeanne Lesot
ReLM
LRM
32
3
0
19 Feb 2024
FIPO: Free-form Instruction-oriented Prompt Optimization with Preference Dataset and Modular Fine-tuning Schema
Junru Lu
Siyu An
Min Zhang
Yulan He
Di Yin
Xing Sun
48
2
0
19 Feb 2024
One Prompt To Rule Them All: LLMs for Opinion Summary Evaluation
Tejpalsingh Siledar
Swaroop Nath
Sankara Sri Raghava Ravindra Muddu
Rupasai Rangaraju
Swaprava Nath
...
Suman Banerjee
Amey Patil
Sudhanshu Singh
M. Chelliah
Nikesh Garera
ALM
LRM
30
6
0
18 Feb 2024
From Prejudice to Parity: A New Approach to Debiasing Large Language Model Word Embeddings
Aishik Rakshit
Smriti Singh
Shuvam Keshari
Arijit Ghosh Chowdhury
Vinija Jain
Aman Chadha
37
0
0
18 Feb 2024
Self-seeding and Multi-intent Self-instructing LLMs for Generating Intent-aware Information-Seeking dialogs
Arian Askari
Roxana Petcu
Chuan Meng
Mohammad Aliannejadi
Amin Abolghasemi
Evangelos Kanoulas
Suzan Verberne
21
9
0
18 Feb 2024
Dissecting Human and LLM Preferences
Junlong Li
Fan Zhou
Shichao Sun
Yikai Zhang
Hai Zhao
Pengfei Liu
ALM
21
5
0
17 Feb 2024
Multi-modal preference alignment remedies regression of visual instruction tuning on language model
Shengzhi Li
Rongyu Lin
Shichao Pei
40
20
0
16 Feb 2024
GenRES: Rethinking Evaluation for Generative Relation Extraction in the Era of Large Language Models
Pengcheng Jiang
Jiacheng Lin
Zifeng Wang
Jimeng Sun
Jiawei Han
28
3
0
16 Feb 2024