LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions

27 April 2023

Minghao Wu

Abdul Waheed

Chiyu Zhang

Muhammad Abdul-Mageed

Alham Fikri Aji

ALM

ArXiv PDF HTML

Papers citing "LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions"

22 / 22 papers shown

Title
Defending against Indirect Prompt Injection by Instruction Detection Tongyu Wen Chenglong Wang Xiyuan Yang Haoyu Tang Yueqi Xie Lingjuan Lyu Zhicheng Dou Fangzhao Wu AAML 29 0 0 08 May 2025
Towards Cross-Tokenizer Distillation: the Universal Logit Distillation Loss for LLMs Nicolas Boizard Kevin El Haddad C´eline Hudelot Pierre Colombo 68 14 0 28 Jan 2025
Mobile Edge Intelligence for Large Language Models: A Contemporary Survey Guanqiao Qu Qiyuan Chen Wei Wei Zheng Lin Xianhao Chen Kaibin Huang 42 43 0 09 Jul 2024
Improve Student's Reasoning Generalizability through Cascading Decomposed CoTs Distillation Chengwei Dai Kun Li Wei Zhou Song Hu LRM 46 3 0 30 May 2024
(Perhaps) Beyond Human Translation: Harnessing Multi-Agent Collaboration for Translating Ultra-Long Literary Texts Minghao Wu Jiahao Xu Yulin Yuan Gholamreza Haffari Longyue Wang Weihua Luo Kaifu Zhang LLMAG 119 22 0 20 May 2024
Parameter Efficient Diverse Paraphrase Generation Using Sequence-Level Knowledge Distillation Lasal Jayawardena Prasan Yapa BDL 34 1 0 19 Apr 2024
On the Essence and Prospect: An Investigation of Alignment Approaches for Big Models Xinpeng Wang Shitong Duan Xiaoyuan Yi Jing Yao Shanlin Zhou Zhihua Wei Peng Zhang Dongkuan Xu Maosong Sun Xing Xie OffRL 38 16 0 07 Mar 2024
LLM-Assisted Content Conditional Debiasing for Fair Text Embedding Wenlong Deng Blair Chen Beidi Zhao Chiyu Zhang Xiaoxiao Li Christos Thrampoulidis 33 0 0 22 Feb 2024
Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning Ming Li Lichang Chen Jiuhai Chen Shwai He Jiuxiang Gu Tianyi Zhou 21 50 0 15 Feb 2024
Demystifying Instruction Mixing for Fine-tuning Large Language Models Renxi Wang Haonan Li Minghao Wu Yuxia Wang Xudong Han Chiyu Zhang Timothy Baldwin 28 0 0 17 Dec 2023
Rethinking Compression: Reduced Order Modelling of Latent Features in Large Language Models Arnav Chavan Nahush Lele Deepak Gupta 19 1 0 12 Dec 2023
The Skipped Beat: A Study of Sociopragmatic Understanding in LLMs for 64 Languages Chiyu Zhang Khai Duy Doan Qisheng Liao Muhammad Abdul-Mageed 34 6 0 23 Oct 2023
InstructTODS: Large Language Models for End-to-End Task-Oriented Dialogue Systems Willy Chung Samuel Cahyawijaya Bryan Wilie Holy Lovenia Pascale Fung 17 5 0 13 Oct 2023
Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback Viet Dac Lai Chien Van Nguyen Nghia Trung Ngo Thuat Nguyen Franck Dernoncourt Ryan A. Rossi Thien Huu Nguyen ALM 40 128 0 29 Jul 2023
Solving NLP Problems through Human-System Collaboration: A Discussion-based Approach Masahiro Kaneko Graham Neubig Naoaki Okazaki 33 6 0 19 May 2023
A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation Xiaowei Huang Wenjie Ruan Wei Huang Gao Jin Yizhen Dong ... Sihao Wu Peipei Xu Dengyu Wu André Freitas Mustafa A. Mustafa ALM 32 81 0 19 May 2023
Fine-tuned Language Models are Continual Learners Thomas Scialom Tuhin Chakrabarty Smaranda Muresan CLL LRM 145 116 0 24 May 2022
Training language models to follow instructions with human feedback Long Ouyang Jeff Wu Xu Jiang Diogo Almeida Carroll L. Wainwright ... Amanda Askell Peter Welinder Paul Christiano Jan Leike Ryan J. Lowe OSLM ALM 311 11,915 0 04 Mar 2022
Multitask Prompted Training Enables Zero-Shot Task Generalization Victor Sanh Albert Webson Colin Raffel Stephen H. Bach Lintang Sutawika ... T. Bers Stella Biderman Leo Gao Thomas Wolf Alexander M. Rush LRM 213 1,656 0 15 Oct 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling Leo Gao Stella Biderman Sid Black Laurence Golding Travis Hoppe ... Horace He Anish Thite Noa Nabeshima Shawn Presser Connor Leahy AIMat 253 1,986 0 31 Dec 2020
Scaling Laws for Neural Language Models Jared Kaplan Sam McCandlish T. Henighan Tom B. Brown B. Chess R. Child Scott Gray Alec Radford Jeff Wu Dario Amodei 228 4,460 0 23 Jan 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding Alex Jinpeng Wang Amanpreet Singh Julian Michael Felix Hill Omer Levy Samuel R. Bowman ELM 297 6,956 0 20 Apr 2018