ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2401.02385
  4. Cited By
TinyLlama: An Open-Source Small Language Model
v1v2 (latest)

TinyLlama: An Open-Source Small Language Model

4 January 2024
Peiyuan Zhang
Guangtao Zeng
Tianduo Wang
Wei Lu
    ALMLRM
ArXiv (abs)PDFHTMLGithub (8509★)

Papers citing "TinyLlama: An Open-Source Small Language Model"

50 / 287 papers shown
Title
MoD: A Distribution-Based Approach for Merging Large Language Models
MoD: A Distribution-Based Approach for Merging Large Language Models
Quy-Anh Dang
Chris Ngo
MoMeVLM
77
0
0
01 Nov 2024
MESS+: Energy-Optimal Inferencing in Language Model Zoos with Service
  Level Guarantees
MESS+: Energy-Optimal Inferencing in Language Model Zoos with Service Level Guarantees
Ryan Zhang
Herbert Woisetschläger
Shiqiang Wang
Hans-Arno Jacobsen
31
0
0
31 Oct 2024
Multilingual Pretraining Using a Large Corpus Machine-Translated from a
  Single Source Language
Multilingual Pretraining Using a Large Corpus Machine-Translated from a Single Source Language
Jiayi Wang
Yao Lu
Maurice Weber
Max Ryabinin
Yihong Chen
Raphael Tang
Pontus Stenetorp
LRM
104
1
0
31 Oct 2024
Mobility-LLM: Learning Visiting Intentions and Travel Preferences from
  Human Mobility Data with Large Language Models
Mobility-LLM: Learning Visiting Intentions and Travel Preferences from Human Mobility Data with Large Language Models
Letian Gong
Yan Lin
Xinyue Zhang
Yiwen Lu
Xuedi Han
Yichen Liu
Shengnan Guo
Youfang Lin
Huaiyu Wan
96
8
0
29 Oct 2024
Relaxed Recursive Transformers: Effective Parameter Sharing with Layer-wise LoRA
Relaxed Recursive Transformers: Effective Parameter Sharing with Layer-wise LoRA
Sangmin Bae
Adam Fisch
Hrayr Harutyunyan
Ziwei Ji
Seungyeon Kim
Tal Schuster
KELM
137
7
0
28 Oct 2024
Transferable Post-training via Inverse Value Learning
Transferable Post-training via Inverse Value Learning
Xinyu Lu
Xueru Wen
Yaojie Lu
Bowen Yu
Hongyu Lin
Haiyang Yu
Le Sun
Jia Zheng
Yongbin Li
42
1
0
28 Oct 2024
Computational Bottlenecks of Training Small-scale Large Language Models
Computational Bottlenecks of Training Small-scale Large Language Models
Saleh Ashkboos
Iman Mirzadeh
Keivan Alizadeh
Mohammad Hossein Sekhavat
Moin Nabi
Mehrdad Farajtabar
Fartash Faghri
61
1
0
25 Oct 2024
Bielik 7B v0.1: A Polish Language Model -- Development, Insights, and
  Evaluation
Bielik 7B v0.1: A Polish Language Model -- Development, Insights, and Evaluation
Krzysztof Ociepa
Łukasz Flis
Krzysztof Wróbel
Adrian Gwoździej
Remigiusz Kinas
74
4
0
24 Oct 2024
Scaling up Masked Diffusion Models on Text
Scaling up Masked Diffusion Models on Text
Shen Nie
Fengqi Zhu
Chao Du
Tianyu Pang
Qian Liu
Guangtao Zeng
Min Lin
Chongxuan Li
AI4CE
215
30
0
24 Oct 2024
MiniPLM: Knowledge Distillation for Pre-Training Language Models
MiniPLM: Knowledge Distillation for Pre-Training Language Models
Yuxian Gu
Hao Zhou
Fandong Meng
Jie Zhou
Minlie Huang
218
7
0
22 Oct 2024
A Systematic Study of Cross-Layer KV Sharing for Efficient LLM Inference
A Systematic Study of Cross-Layer KV Sharing for Efficient LLM Inference
You Wu
Haoyi Wu
Kewei Tu
81
3
0
18 Oct 2024
BenTo: Benchmark Task Reduction with In-Context Transferability
BenTo: Benchmark Task Reduction with In-Context Transferability
Hongyu Zhao
Ming Li
Lichao Sun
Tianyi Zhou
98
0
0
17 Oct 2024
Learning from Imperfect Data: Towards Efficient Knowledge Distillation
  of Autoregressive Language Models for Text-to-SQL
Learning from Imperfect Data: Towards Efficient Knowledge Distillation of Autoregressive Language Models for Text-to-SQL
Qihuang Zhong
Kunfeng Chen
Liang Ding
Juhua Liu
Di Lin
Dacheng Tao
56
1
0
15 Oct 2024
Spatio-Temporal Control for Masked Motion Synthesis
Spatio-Temporal Control for Masked Motion Synthesis
Ekkasit Pinyoanuntapong
Muhammad Usama Saleem
Korrawe Karunratanakul
Pu Wang
Hongfei Xue
Chong Chen
Chuan Guo
Junli Cao
J. Ren
Sergey Tulyakov
VGen
92
7
0
14 Oct 2024
Reverse Modeling in Large Language Models
Reverse Modeling in Large Language Models
S. Yu
Yuanchen Xu
Cunxiao Du
Yanying Zhou
Minghui Qiu
Q. Sun
Hao Zhang
Jiawei Wu
162
2
0
13 Oct 2024
CAMPHOR: Collaborative Agents for Multi-input Planning and High-Order
  Reasoning On Device
CAMPHOR: Collaborative Agents for Multi-input Planning and High-Order Reasoning On Device
Yicheng Fu
R. Anantha
Jianpeng Cheng
LRMLLMAG
90
4
0
12 Oct 2024
Generation with Dynamic Vocabulary
Generation with Dynamic Vocabulary
Yanting Liu
Tao Ji
Changzhi Sun
Yuanbin Wu
Xiaoling Wang
79
1
0
11 Oct 2024
KV Prediction for Improved Time to First Token
KV Prediction for Improved Time to First Token
Maxwell Horton
Qingqing Cao
Chenfan Sun
Yanzi Jin
Sachin Mehta
Mohammad Rastegari
Moin Nabi
AI4TS
85
4
0
10 Oct 2024
VerifierQ: Enhancing LLM Test Time Compute with Q-Learning-based
  Verifiers
VerifierQ: Enhancing LLM Test Time Compute with Q-Learning-based Verifiers
Jianing Qi
Hao Tang
Zhigang Zhu
OffRLLRM
60
5
0
10 Oct 2024
MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation
  Experts
MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts
Peng Jin
Bo Zhu
Li Yuan
Shuicheng Yan
MoE
115
6
0
09 Oct 2024
Exploring the Readiness of Prominent Small Language Models for the
  Democratization of Financial Literacy
Exploring the Readiness of Prominent Small Language Models for the Democratization of Financial Literacy
Tagore Rao Kosireddy
Jeffrey D. Wall
Evan Lucas
53
1
0
09 Oct 2024
Personal Intelligence System UniLM: Hybrid On-Device Small Language
  Model and Server-Based Large Language Model for Malay Nusantara
Personal Intelligence System UniLM: Hybrid On-Device Small Language Model and Server-Based Large Language Model for Malay Nusantara
Azree Nazri
Olalekan Agbolade
Faisal Aziz
65
0
0
09 Oct 2024
Root Defence Strategies: Ensuring Safety of LLM at the Decoding Level
Root Defence Strategies: Ensuring Safety of LLM at the Decoding Level
Xinyi Zeng
Yuying Shang
Yutao Zhu
Jingyuan Zhang
Yu Tian
AAML
493
4
0
09 Oct 2024
QERA: an Analytical Framework for Quantization Error Reconstruction
QERA: an Analytical Framework for Quantization Error Reconstruction
Cheng Zhang
Jeffrey T. H. Wong
Can Xiao
George A. Constantinides
Yiren Zhao
MQ
78
4
0
08 Oct 2024
Structure-Enhanced Protein Instruction Tuning: Towards General-Purpose Protein Understanding with LLMs
Structure-Enhanced Protein Instruction Tuning: Towards General-Purpose Protein Understanding with LLMs
Wei Wu
Chao Wang
L. Chen
Mingze Yin
Yiheng Zhu
Kun Fu
Jieping Ye
Hui Xiong
Zheng Wang
143
1
0
04 Oct 2024
TPP-LLM: Modeling Temporal Point Processes by Efficiently Fine-Tuning Large Language Models
TPP-LLM: Modeling Temporal Point Processes by Efficiently Fine-Tuning Large Language Models
Zefang Liu
Yinzhu Quan
84
2
0
02 Oct 2024
Mitigating Training Imbalance in LLM Fine-Tuning via Selective Parameter
  Merging
Mitigating Training Imbalance in LLM Fine-Tuning via Selective Parameter Merging
Yiming Ju
Ziyi Ni
Xingrun Xing
Zhixiong Zeng
hanyu Zhao
Siqi Fan
Zheng Zhang
MoMe
67
2
0
01 Oct 2024
Fisher Information-based Efficient Curriculum Federated Learning with
  Large Language Models
Fisher Information-based Efficient Curriculum Federated Learning with Large Language Models
Ji Liu
Jiaxiang Ren
Ruoming Jin
Zijie Zhang
Yang Zhou
P. Valduriez
Dejing Dou
FedML
91
6
0
30 Sep 2024
Do Influence Functions Work on Large Language Models?
Do Influence Functions Work on Large Language Models?
Zhe Li
Wei Zhao
Yige Li
Jun Sun
TDI
92
3
0
30 Sep 2024
Making LLMs Better Many-to-Many Speech-to-Text Translators with Curriculum Learning
Making LLMs Better Many-to-Many Speech-to-Text Translators with Curriculum Learning
Yexing Du
Youcheng Pan
Ziyang Ma
Keqi Deng
Yifan Yang
Keqi Deng
Xie Chen
Yang Xiang
Ming Liu
Bing Qin
LRM
151
9
0
29 Sep 2024
Language Models as Zero-shot Lossless Gradient Compressors: Towards General Neural Parameter Prior Models
Language Models as Zero-shot Lossless Gradient Compressors: Towards General Neural Parameter Prior Models
Hui-Po Wang
Mario Fritz
115
4
0
26 Sep 2024
MonoFormer: One Transformer for Both Diffusion and Autoregression
MonoFormer: One Transformer for Both Diffusion and Autoregression
Chuyang Zhao
Yuxing Song
Wenhao Wang
Haocheng Feng
Errui Ding
Yifan Sun
Xinyan Xiao
Jingdong Wang
DiffM
77
22
0
24 Sep 2024
EuroLLM: Multilingual Language Models for Europe
EuroLLM: Multilingual Language Models for Europe
Pedro Henrique Martins
Patrick Fernandes
Joao Alves
Nuno M. Guerreiro
Ricardo Rei
...
Pierre Colombo
Barry Haddow
José G. C. de Souza
Alexandra Birch
André F. T. Martins
88
40
0
24 Sep 2024
Benchmarking Edge AI Platforms for High-Performance ML Inference
Benchmarking Edge AI Platforms for High-Performance ML Inference
Rakshith Jayanth
Neelesh Gupta
Viktor Prasanna
BDL
56
1
0
23 Sep 2024
EchoAtt: Attend, Copy, then Adjust for More Efficient Large Language
  Models
EchoAtt: Attend, Copy, then Adjust for More Efficient Large Language Models
Hossein Rajabzadeh
A. Jafari
Aman Sharma
Benyamin Jami
Hyock Ju Kwon
Ali Ghodsi
Boxing Chen
Mehdi Rezagholizadeh
63
0
0
22 Sep 2024
QMOS: Enhancing LLMs for Telecommunication with Question Masked loss and Option Shuffling
QMOS: Enhancing LLMs for Telecommunication with Question Masked loss and Option Shuffling
Blessed Guda
Gabrial Zencha A.
Lawrence Francis
Carlee Joe-Wong
97
1
0
21 Sep 2024
EMMeTT: Efficient Multimodal Machine Translation Training
EMMeTT: Efficient Multimodal Machine Translation Training
Piotr Żelasko
Zhehuai Chen
Mengru Wang
Daniel Galvez
Oleksii Hrinchuk
Shuoyang Ding
Ke Hu
Jagadeesh Balam
Vitaly Lavrukhin
Boris Ginsburg
83
1
0
20 Sep 2024
Exploring Scaling Laws for Local SGD in Large Language Model Training
Exploring Scaling Laws for Local SGD in Large Language Model Training
Qiaozhi He
Xiaomin Zhuang
Zhihua Wu
92
4
0
20 Sep 2024
$\textit{SKIntern}$: Internalizing Symbolic Knowledge for Distilling
  Better CoT Capabilities into Small Language Models
SKIntern\textit{SKIntern}SKIntern: Internalizing Symbolic Knowledge for Distilling Better CoT Capabilities into Small Language Models
Huanxuan Liao
Shizhu He
Yupu Hao
Xiang Li
Yuanzhe Zhang
Kang Liu
Jun Zhao
LRM
81
0
0
20 Sep 2024
Enhancing Knowledge Distillation of Large Language Models through
  Efficient Multi-Modal Distribution Alignment
Enhancing Knowledge Distillation of Large Language Models through Efficient Multi-Modal Distribution Alignment
Tianyu Peng
Jiajun Zhang
60
3
0
19 Sep 2024
Exploring and Enhancing the Transfer of Distribution in Knowledge
  Distillation for Autoregressive Language Models
Exploring and Enhancing the Transfer of Distribution in Knowledge Distillation for Autoregressive Language Models
Jun Rao
Xuebo Liu
Zepeng Lin
Liang Ding
Jing Li
Dacheng Tao
Min Zhang
99
2
0
19 Sep 2024
Large Language Models are Strong Audio-Visual Speech Recognition Learners
Large Language Models are Strong Audio-Visual Speech Recognition Learners
Umberto Cappellazzo
Minsu Kim
Honglie Chen
Pingchuan Ma
Stavros Petridis
Daniele Falavigna
Alessio Brutti
Maja Pantic
114
12
0
18 Sep 2024
Improving Multi-candidate Speculative Decoding
Improving Multi-candidate Speculative Decoding
Xiaofan Lu
Yixiao Zeng
Feiyang Ma
Zixu Yu
Marco Levorato
57
1
0
16 Sep 2024
Towards Data-Centric RLHF: Simple Metrics for Preference Dataset
  Comparison
Towards Data-Centric RLHF: Simple Metrics for Preference Dataset Comparison
Judy Hanwen Shen
Archit Sharma
Jun Qin
70
5
0
15 Sep 2024
Optimizing Ingredient Substitution Using Large Language Models to
  Enhance Phytochemical Content in Recipes
Optimizing Ingredient Substitution Using Large Language Models to Enhance Phytochemical Content in Recipes
Luis Rita
Josh Southern
I. Laponogov
Kyle Higgins
Kirill Veselkov
64
2
0
13 Sep 2024
Towards Fairer Health Recommendations: finding informative unbiased
  samples via Word Sense Disambiguation
Towards Fairer Health Recommendations: finding informative unbiased samples via Word Sense Disambiguation
Gavin Butts
Pegah Emdad
Jethro Lee
Shannon Song
Chiman Salavati
Willmar Sosa Diaz
Shiri Dori-Hacohen
Fabricio Murai
FaML
61
0
0
11 Sep 2024
LoCa: Logit Calibration for Knowledge Distillation
LoCa: Logit Calibration for Knowledge Distillation
Runming Yang
Taiqiang Wu
Yujiu Yang
83
1
0
07 Sep 2024
TinyAgent: Function Calling at the Edge
TinyAgent: Function Calling at the Edge
Lutfi Eren Erdogan
Nicholas Lee
Siddharth Jha
Sehoon Kim
Ryan Tabrizi
Suhong Moon
Coleman Hooper
Gopala Anumanchipalli
Kurt Keutzer
Amir Gholami
LLMAG
112
13
0
01 Sep 2024
InkubaLM: A small language model for low-resource African languages
InkubaLM: A small language model for low-resource African languages
A. Tonja
Bonaventure F. P. Dossou
Jessica Ojo
Jenalea Rajab
Fadel Thior
...
Anuoluwapo Aremu
Pelonomi Moiloa
Jade Z. Abbott
Vukosi Marivate
Benjamin Rosman
100
11
0
30 Aug 2024
On-Device Language Models: A Comprehensive Review
On-Device Language Models: A Comprehensive Review
Jiajun Xu
Zhiyuan Li
Wei Chen
Qun Wang
Xin Gao
Qi Cai
Ziyuan Ling
140
36
0
26 Aug 2024
Previous
123456
Next