ResearchTrend.AI
TinyLlama: An Open-Source Small Language Model

4 January 2024
Peiyuan Zhang
Guangtao Zeng
Tianduo Wang
Wei Lu
    ALM
    LRM

Papers citing "TinyLlama: An Open-Source Small Language Model"

50 / 266 papers shown
Title
Root Defence Strategies: Ensuring Safety of LLM at the Decoding Level
Xinyi Zeng
Yuying Shang
Yutao Zhu
Jingyuan Zhang
Yu Tian
AAML
208
2
0
09 Oct 2024
QERA: an Analytical Framework for Quantization Error Reconstruction
Cheng Zhang
Jeffrey T. H. Wong
Can Xiao
George A. Constantinides
Yiren Zhao
MQ
47
2
0
08 Oct 2024
TPP-LLM: Modeling Temporal Point Processes by Efficiently Fine-Tuning Large Language Models
Zefang Liu
Yinzhu Quan
43
0
0
02 Oct 2024
Mitigating Training Imbalance in LLM Fine-Tuning via Selective Parameter Merging
Yiming Ju
Ziyi Ni
Xingrun Xing
Zhixiong Zeng
Hanyu Zhao
Siqi Fan
Zheng Zhang
MoMe
41
2
0
01 Oct 2024
Fisher Information-based Efficient Curriculum Federated Learning with Large Language Models
Ji Liu
Jiaxiang Ren
Ruoming Jin
Zijie Zhang
Yang Zhou
P. Valduriez
Dejing Dou
FedML
44
3
0
30 Sep 2024
Do Influence Functions Work on Large Language Models?
Zhe Li
Wei Zhao
Yige Li
Jun Sun
TDI
36
1
0
30 Sep 2024
CoT-ST: Enhancing LLM-based Speech Translation with Multimodal Chain-of-Thought
Yexing Du
Ziyang Ma
Yifan Yang
Keqi Deng
Xie Chen
Bo Yang
Yang Xiang
Ming Liu
Bing Qin
LRM
26
6
0
29 Sep 2024
Language Models as Zero-shot Lossless Gradient Compressors: Towards General Neural Parameter Prior Models
Hui-Po Wang
Mario Fritz
37
3
0
26 Sep 2024
MonoFormer: One Transformer for Both Diffusion and Autoregression
Chuyang Zhao
Yuxing Song
Wenhao Wang
Haocheng Feng
Errui Ding
Yifan Sun
Xinyan Xiao
Jingdong Wang
DiffM
39
18
0
24 Sep 2024
EuroLLM: Multilingual Language Models for Europe
Pedro Henrique Martins
Patrick Fernandes
Joao Alves
Nuno M. Guerreiro
Ricardo Rei
...
Pierre Colombo
Barry Haddow
José G. C. de Souza
Alexandra Birch
André F. T. Martins
37
20
0
24 Sep 2024
Benchmarking Edge AI Platforms for High-Performance ML Inference
Rakshith Jayanth
Neelesh Gupta
Viktor Prasanna
BDL
36
1
0
23 Sep 2024
EchoAtt: Attend, Copy, then Adjust for More Efficient Large Language Models
Hossein Rajabzadeh
A. Jafari
Aman Sharma
Benyamin Jami
Hyock Ju Kwon
Ali Ghodsi
Boxing Chen
Mehdi Rezagholizadeh
35
0
0
22 Sep 2024
QMOS: Enhancing LLMs for Telecommunication with Question Masked loss and Option Shuffling
Blessed Guda
Gabrial Zencha A.
Lawrence Francis
Carlee Joe-Wong
28
1
0
21 Sep 2024
EMMeTT: Efficient Multimodal Machine Translation Training
Piotr Żelasko
Zhehuai Chen
Mengru Wang
Daniel Galvez
Oleksii Hrinchuk
Shuoyang Ding
Ke Hu
Jagadeesh Balam
Vitaly Lavrukhin
Boris Ginsburg
42
1
0
20 Sep 2024
Exploring Scaling Laws for Local SGD in Large Language Model Training
Qiaozhi He
Xiaomin Zhuang
Zhihua Wu
25
4
0
20 Sep 2024
SKIntern: Internalizing Symbolic Knowledge for Distilling Better CoT Capabilities into Small Language Models
Huanxuan Liao
Shizhu He
Yupu Hao
Xiang Li
Yuanzhe Zhang
Kang Liu
Jun Zhao
LRM
44
0
0
20 Sep 2024
Enhancing Knowledge Distillation of Large Language Models through Efficient Multi-Modal Distribution Alignment
Tianyu Peng
Jiajun Zhang
37
2
0
19 Sep 2024
Exploring and Enhancing the Transfer of Distribution in Knowledge Distillation for Autoregressive Language Models
Jun Rao
Xuebo Liu
Zepeng Lin
Liang Ding
Jing Li
Dacheng Tao
Min Zhang
44
2
0
19 Sep 2024
Large Language Models are Strong Audio-Visual Speech Recognition Learners
Umberto Cappellazzo
Minsu Kim
Honglie Chen
Pingchuan Ma
Stavros Petridis
Daniele Falavigna
Alessio Brutti
Maja Pantic
36
9
0
18 Sep 2024
Improving Multi-candidate Speculative Decoding
Xiaofan Lu
Yixiao Zeng
Feiyang Ma
Zixu Yu
Marco Levorato
34
0
0
16 Sep 2024
Towards Data-Centric RLHF: Simple Metrics for Preference Dataset Comparison
Judy Hanwen Shen
Archit Sharma
Jun Qin
50
4
0
15 Sep 2024
Optimizing Ingredient Substitution Using Large Language Models to Enhance Phytochemical Content in Recipes
Luis Rita
Josh Southern
I. Laponogov
Kyle Higgins
Kirill Veselkov
49
1
0
13 Sep 2024
Towards Fairer Health Recommendations: finding informative unbiased samples via Word Sense Disambiguation
Gavin Butts
Pegah Emdad
Jethro Lee
Shannon Song
Chiman Salavati
Willmar Sosa Diaz
Shiri Dori-Hacohen
Fabricio Murai
FaML
23
0
0
11 Sep 2024
LoCa: Logit Calibration for Knowledge Distillation
Runming Yang
Taiqiang Wu
Yujiu Yang
46
1
0
07 Sep 2024
TinyAgent: Function Calling at the Edge
Lutfi Eren Erdogan
Nicholas Lee
Siddharth Jha
Sehoon Kim
Ryan Tabrizi
Suhong Moon
Coleman Hooper
Gopala Anumanchipalli
Kurt Keutzer
Amir Gholami
LLMAG
43
12
0
01 Sep 2024
InkubaLM: A small language model for low-resource African languages
A. Tonja
Bonaventure F. P. Dossou
Jessica Ojo
Jenalea Rajab
Fadel Thior
...
Anuoluwapo Aremu
Pelonomi Moiloa
Jade Z. Abbott
Vukosi Marivate
Benjamin Rosman
46
9
0
30 Aug 2024
On-Device Language Models: A Comprehensive Review
Jiajun Xu
Zhiyuan Li
Wei Chen
Qun Wang
Xin Gao
Qi Cai
Ziyuan Ling
52
30
0
26 Aug 2024
Selective Preference Optimization via Token-Level Reward Function Estimation
Kailai Yang
Zhiwei Liu
Qianqian Xie
Jimin Huang
Erxue Min
Sophia Ananiadou
35
10
0
24 Aug 2024
Predicting Rewards Alongside Tokens: Non-disruptive Parameter Insertion for Efficient Inference Intervention in Large Language Model
Chenhan Yuan
Fei Huang
Ru Peng
Keming Lu
Bowen Yu
Chang Zhou
Jingren Zhou
KELM
47
0
0
20 Aug 2024
Edge-Cloud Collaborative Motion Planning for Autonomous Driving with Large Language Models
Jiao Chen
Suyan Dai
Fangfang Chen
Zuohong Lv
Jianhua Tang
42
6
0
19 Aug 2024
The advantages of context specific language models: the case of the Erasmian Language Model
João Gonçalves
Nick Jelicic
Michele Murgia
Evert Stamhuis
36
0
0
13 Aug 2024
Deeploy: Enabling Energy-Efficient Deployment of Small Language Models On Heterogeneous Microcontrollers
Moritz Scherer
Luka Macan
Victor J. B. Jung
Philip Wiese
Luca Bompani
Francesco Conti
Luca Benini
MoE
52
10
0
08 Aug 2024
Designing Efficient LLM Accelerators for Edge Devices
Jude Haris
Rappy Saha
Wenhao Hu
José Cano
29
7
0
01 Aug 2024
MoFO: Momentum-Filtered Optimizer for Mitigating Forgetting in LLM Fine-Tuning
Yupeng Chen
Senmiao Wang
Zhihang Lin
Yushun Zhang
Tian Ding
Ruoyu Sun
CLL
83
3
0
30 Jul 2024
SaulLM-54B & SaulLM-141B: Scaling Up Domain Adaptation for the Legal Domain
Pierre Colombo
T. Pires
Malik Boudiaf
Rui Melo
Dominic Culver
Sofia Morgado
Etienne Malaboeuf
Gabriel Hautreux
Johanne Charpentier
Michael Desa
ELM
AILaw
ALM
47
12
0
28 Jul 2024
Towards Effective and Efficient Continual Pre-training of Large Language Models
Jie Chen
Zhipeng Chen
Jiapeng Wang
Kun Zhou
Yutao Zhu
...
Rui Yan
Zhewei Wei
Di Hu
Wenbing Huang
Ji-Rong Wen
KELM
ALM
CLL
ELM
LRM
151
4
0
26 Jul 2024
Learn while Unlearn: An Iterative Unlearning Framework for Generative Language Models
Haoyu Tang
Ye Liu
Xukai Liu
Yanghai Zhang
Kai Zhang
Xiaofang Zhou
Enhong Chen
MU
75
3
0
25 Jul 2024
Train-Attention: Meta-Learning Where to Focus in Continual Knowledge Learning
Yeongbin Seo
Dongha Lee
Jinyoung Yeo
CLL
KELM
100
1
0
24 Jul 2024
LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models
Xi Chen
Songyang Zhang
Qibing Bai
Kai-xiang Chen
Satoshi Nakamura
AuLLM
37
6
0
22 Jul 2024
Watermark Smoothing Attacks against Language Models
Hongyan Chang
Hamed Hassani
Reza Shokri
WaLM
67
3
0
19 Jul 2024
Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies
Chaofan Tao
Qian Liu
Longxu Dou
Niklas Muennighoff
Zhongwei Wan
Ping Luo
Min Lin
Ngai Wong
PILM
60
47
0
18 Jul 2024
PFPs: Prompt-guided Flexible Pathological Segmentation for Diverse Potential Outcomes Using Large Vision and Language Models
Can Cui
Ruining Deng
Junlin Guo
Quan Liu
Tianyuan Yao
Haichun Yang
Yuankai Huo
VLM
MedIm
34
1
0
13 Jul 2024
H2O-Danube3 Technical Report
Pascal Pfeiffer
Philipp Singer
Yauhen Babakhin
Gabor Fodor
Nischay Dhankhar
Sri Satish Ambati
19
3
0
12 Jul 2024
FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation
Liqun Ma
Mingjie Sun
Zhiqiang Shen
31
7
0
09 Jul 2024
Statistical investigations into the geometry and homology of random programs
Jon Sporring
Ken Friis Larsen
23
0
0
05 Jul 2024
LoCo: Low-Bit Communication Adaptor for Large-scale Model Training
Xingyu Xie
Zhijie Lin
Kim-Chuan Toh
Pan Zhou
40
2
0
05 Jul 2024
LoRA-Guard: Parameter-Efficient Guardrail Adaptation for Content Moderation of Large Language Models
Hayder Elesedy
Pedro M. Esperança
Silviu Vlad Oprea
Mete Ozay
KELM
36
2
0
03 Jul 2024
RegMix: Data Mixture as Regression for Language Model Pre-training
Qian Liu
Xiaosen Zheng
Niklas Muennighoff
Guangtao Zeng
Longxu Dou
Tianyu Pang
Jing Jiang
Min Lin
MoE
74
41
1
01 Jul 2024
BERGEN: A Benchmarking Library for Retrieval-Augmented Generation
David Rau
Hervé Déjean
Nadezhda Chirkova
Thibault Formal
Shuai Wang
Vassilina Nikoulina
S. Clinchant
47
12
0
01 Jul 2024
FoldGPT: Simple and Effective Large Language Model Compression Scheme
Songwei Liu
Chao Zeng
Lianqiang Li
Chenqian Yan
Lean Fu
Xing Mei
Fangmin Chen
48
4
0
01 Jul 2024