ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2401.02385
  4. Cited By
TinyLlama: An Open-Source Small Language Model
v1v2 (latest)

TinyLlama: An Open-Source Small Language Model

4 January 2024
Peiyuan Zhang
Guangtao Zeng
Tianduo Wang
Wei Lu
    ALMLRM
ArXiv (abs)PDFHTMLGithub (8509★)

Papers citing "TinyLlama: An Open-Source Small Language Model"

50 / 287 papers shown
Title
No LLM is Free From Bias: A Comprehensive Study of Bias Evaluation in Large Language Models
No LLM is Free From Bias: A Comprehensive Study of Bias Evaluation in Large Language Models
Charaka Vinayak Kumar
Ashok Urlana
Gopichand Kanumolu
B. Garlapati
Pruthwik Mishra
ELM
103
1
0
15 Mar 2025
Changing Base Without Losing Pace: A GPU-Efficient Alternative to MatMul in DNNs
Changing Base Without Losing Pace: A GPU-Efficient Alternative to MatMul in DNNs
Nir Ailon
Akhiad Bercovich
Omri Weinstein
139
0
0
15 Mar 2025
A Survey on Federated Fine-tuning of Large Language Models
A Survey on Federated Fine-tuning of Large Language Models
Yebo Wu
Chunlin Tian
Jingguang Li
He Sun
Kahou Tam
Zhanting Zhou
Haicheng Liao
Zhijiang Guo
Li Li
Chengzhong Xu
FedML
156
5
0
15 Mar 2025
G-Boost: Boosting Private SLMs with General LLMs
Yijiang Fan
Yuren Mao
Longbin Lai
Ying Zhang
Zhengping Qian
Yunjun Gao
70
0
0
13 Mar 2025
Privacy-Preserved Automated Scoring using Federated Learning for Educational Research
Privacy-Preserved Automated Scoring using Federated Learning for Educational Research
Ehsan Latif
Xiaoming Zhai
96
0
0
12 Mar 2025
MoFE: Mixture of Frozen Experts Architecture
Jean Seo
Jaeyoon Kim
Hyopil Shin
MoE
503
0
0
09 Mar 2025
HalluCounter: Reference-free LLM Hallucination Detection in the Wild!
HalluCounter: Reference-free LLM Hallucination Detection in the Wild!
Ashok Urlana
Gopichand Kanumolu
Charaka Vinayak Kumar
B. Garlapati
Rahul Mishra
HILM
126
0
0
06 Mar 2025
Targeted Distillation for Sentiment Analysis
Yice Zhang
Guangyu Xie
Jingjie Lin
Jianzhu Bao
Qianlong Wang
Xi Zeng
Ruifeng Xu
89
0
0
05 Mar 2025
FlexInfer: Breaking Memory Constraint via Flexible and Efficient Offloading for On-Device LLM Inference
Hongchao Du
Shangyu Wu
Arina Kharlamova
Nan Guan
Chun Jason Xue
97
1
0
04 Mar 2025
FANformer: Improving Large Language Models Through Effective Periodicity Modeling
FANformer: Improving Large Language Models Through Effective Periodicity Modeling
Yihong Dong
Ge Li
Xue Jiang
Yongding Tao
Kechi Zhang
...
Huanyu Liu
Jiazheng Ding
Jia Li
Jinliang Deng
Hong Mei
AI4TS
142
0
0
28 Feb 2025
Mixtera: A Data Plane for Foundation Model Training
Mixtera: A Data Plane for Foundation Model Training
Maximilian Böther
Xiaozhe Yao
Tolga Kerimoglu
Ana Klimovic
Viktor Gsteiger
Ana Klimovic
MoE
212
0
0
27 Feb 2025
Distilling Reinforcement Learning Algorithms for In-Context Model-Based Planning
Distilling Reinforcement Learning Algorithms for In-Context Model-Based Planning
Jaehyeon Son
Soochan Lee
Gunhee Kim
OffRL
133
4
0
26 Feb 2025
ANPMI: Assessing the True Comprehension Capabilities of LLMs for Multiple Choice Questions
ANPMI: Assessing the True Comprehension Capabilities of LLMs for Multiple Choice Questions
Gyeongje Cho
Yeonkyoung So
Jaejin Lee
ELM
126
0
0
26 Feb 2025
The Sharpness Disparity Principle in Transformers for Accelerating Language Model Pre-Training
The Sharpness Disparity Principle in Transformers for Accelerating Language Model Pre-Training
Jinbo Wang
Mingze Wang
Zhanpeng Zhou
Junchi Yan
Weinan E
Lei Wu
154
2
0
26 Feb 2025
UrduLLaMA 1.0: Dataset Curation, Preprocessing, and Evaluation in Low-Resource Settings
UrduLLaMA 1.0: Dataset Curation, Preprocessing, and Evaluation in Low-Resource Settings
Layba Fiaz
Munief Hassan Tahir
Sana Shams
Sarmad Hussain
95
0
0
24 Feb 2025
Revealing and Mitigating Over-Attention in Knowledge Editing
Revealing and Mitigating Over-Attention in Knowledge Editing
Pinzheng Wang
Zecheng Tang
Keyan Zhou
Junlin Li
Qiaoming Zhu
Hao Fei
KELM
179
3
0
21 Feb 2025
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers
Anton Razzhigaev
Matvey Mikhalchuk
Temurbek Rahmatullaev
Elizaveta Goncharova
Polina Druzhinina
Ivan Oseledets
Andrey Kuznetsov
125
5
0
20 Feb 2025
EvoP: Robust LLM Inference via Evolutionary Pruning
EvoP: Robust LLM Inference via Evolutionary Pruning
Shangyu Wu
Hongchao Du
Ying Xiong
Shuai Chen
Tei-Wei Kuo
Nan Guan
Chun Jason Xue
102
1
0
19 Feb 2025
NExT-Mol: 3D Diffusion Meets 1D Language Modeling for 3D Molecule Generation
NExT-Mol: 3D Diffusion Meets 1D Language Modeling for 3D Molecule Generation
Zhiyuan Liu
Yanchen Luo
Han Huang
Enzhi Zhang
Changhao Nai
Sihang Li
Yaorui Shi
Xiang Wang
Kenji Kawaguchi
Tat-Seng Chua
201
4
0
18 Feb 2025
TinyEmo: Scaling down Emotional Reasoning via Metric Projection
TinyEmo: Scaling down Emotional Reasoning via Metric Projection
Cristian Gutierrez
LRM
269
0
0
17 Feb 2025
Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale
Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale
Fan Zhou
Zengzhi Wang
Qian Liu
Junlong Li
Pengfei Liu
ALM
224
15
0
17 Feb 2025
How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training
How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training
Yixin Ou
Yunzhi Yao
N. Zhang
Hui Jin
Jiacheng Sun
Shumin Deng
Zechao Li
Ningyu Zhang
KELMCLL
128
2
0
16 Feb 2025
Bridging the Safety Gap: A Guardrail Pipeline for Trustworthy LLM Inferences
Bridging the Safety Gap: A Guardrail Pipeline for Trustworthy LLM Inferences
Shanshan Han
Salman Avestimehr
Chaoyang He
123
2
0
12 Feb 2025
Democratizing AI: Open-source Scalable LLM Training on GPU-based Supercomputers
Democratizing AI: Open-source Scalable LLM Training on GPU-based Supercomputers
Siddharth Singh
Prajwal Singhania
Aditya K. Ranjan
John Kirchenbauer
Jonas Geiping
...
Abhimanyu Hans
Manli Shu
Aditya Tomar
Tom Goldstein
A. Bhatele
180
3
0
12 Feb 2025
Speculate, then Collaborate: Fusing Knowledge of Language Models during Decoding
Speculate, then Collaborate: Fusing Knowledge of Language Models during Decoding
Ziyi Wang
Muneeza Azmart
Ang Li
R. Horesh
Mikhail Yurochkin
220
2
0
11 Feb 2025
EfficientLLM: Scalable Pruning-Aware Pretraining for Architecture-Agnostic Edge Language Models
EfficientLLM: Scalable Pruning-Aware Pretraining for Architecture-Agnostic Edge Language Models
Xingrun Xing
Zheng Liu
Shitao Xiao
Boyan Gao
Yiming Liang
Wanpeng Zhang
Haokun Lin
Guoqi Li
Jiajun Zhang
LRM
276
2
0
10 Feb 2025
Steel-LLM:From Scratch to Open Source -- A Personal Journey in Building a Chinese-Centric LLM
Steel-LLM:From Scratch to Open Source -- A Personal Journey in Building a Chinese-Centric LLM
Qingshui Gu
Shu Li
Tianyu Zheng
Zhaoxiang Zhang
517
0
0
10 Feb 2025
Lossless Acceleration of Large Language Models with Hierarchical Drafting based on Temporal Locality in Speculative Decoding
Lossless Acceleration of Large Language Models with Hierarchical Drafting based on Temporal Locality in Speculative Decoding
Sukmin Cho
S. Choi
T. Hwang
Jeongyeon Seo
Soyeong Jeong
Huije Lee
Hoyun Song
Jong C. Park
Youngjin Kwon
102
1
0
08 Feb 2025
Nearly Lossless Adaptive Bit Switching
Nearly Lossless Adaptive Bit Switching
Haiduo Huang
Zhenhua Liu
Tian Xia
Wenzhe zhao
Pengju Ren
MQ
103
0
0
03 Feb 2025
MergeME: Model Merging Techniques for Homogeneous and Heterogeneous MoEs
MergeME: Model Merging Techniques for Homogeneous and Heterogeneous MoEs
Yuhang Zhou
Giannis Karamanolakis
Victor Soto
Anna Rumshisky
Mayank Kulkarni
Furong Huang
Wei Ai
Jianhua Lu
MoMe
214
3
0
03 Feb 2025
Vision-centric Token Compression in Large Language Model
Vision-centric Token Compression in Large Language Model
Ling Xing
Alex Jinpeng Wang
Rui Yan
Xiangbo Shu
Jinhui Tang
VLM
159
0
0
02 Feb 2025
Evaluating Small Language Models for News Summarization: Implications and Factors Influencing Performance
Evaluating Small Language Models for News Summarization: Implications and Factors Influencing Performance
Borui Xu
Yao Chen
Zeyi Wen
Weiguo Liu
Bingsheng He
192
2
0
02 Feb 2025
TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models
TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models
Makoto Shing
Kou Misaki
Han Bao
Sho Yokoi
Takuya Akiba
VLM
130
4
0
28 Jan 2025
Irrational Complex Rotations Empower Low-bit Optimizers
Irrational Complex Rotations Empower Low-bit Optimizers
Zhen Tian
Wayne Xin Zhao
Ji-Rong Wen
MQ
73
0
0
22 Jan 2025
On the uncertainty principle of neural networks
On the uncertainty principle of neural networks
Jun-Jie Zhang
Dong-xiao Zhang
Jian-Nan Chen
L. Pang
Deyu Meng
147
3
0
17 Jan 2025
Empowering Large Language Models in Wireless Communication: A Novel Dataset and Fine-Tuning Framework
Empowering Large Language Models in Wireless Communication: A Novel Dataset and Fine-Tuning Framework
Yushen Lin
Ruichen Zhang
Wenqi Huang
Kaidi Wang
Z. Ding
Daniel K. C. So
Dusit Niyato
106
3
0
17 Jan 2025
Adaptive Batch Size Schedules for Distributed Training of Language Models with Data and Model Parallelism
Adaptive Batch Size Schedules for Distributed Training of Language Models with Data and Model Parallelism
Tim Tsz-Kit Lau
Weijian Li
Chenwei Xu
Han Liu
Mladen Kolar
466
0
0
30 Dec 2024
Deploying Foundation Model Powered Agent Services: A Survey
Deploying Foundation Model Powered Agent Services: A Survey
Wenchao Xu
Jinyu Chen
Peirong Zheng
Xiaoquan Yi
Tianyi Tian
...
Quan Wan
Yining Qi
Yunfeng Fan
Qinliang Su
Xuemin Shen
AI4CE
181
2
0
18 Dec 2024
Learning to Reason via Self-Iterative Process Feedback for Small
  Language Models
Learning to Reason via Self-Iterative Process Feedback for Small Language Models
Kaiyuan Chen
Jin Wang
Xuejie Zhang
LRMReLM
115
2
0
11 Dec 2024
VLsI: Verbalized Layers-to-Interactions from Large to Small Vision
  Language Models
VLsI: Verbalized Layers-to-Interactions from Large to Small Vision Language Models
Byung-Kwan Lee
Ryo Hachiuma
Yu-Chiang Frank Wang
Y. Ro
Yueh-Hua Wu
VLM
147
1
0
02 Dec 2024
Align-KD: Distilling Cross-Modal Alignment Knowledge for Mobile
  Vision-Language Model
Align-KD: Distilling Cross-Modal Alignment Knowledge for Mobile Vision-Language Model
Qianhan Feng
Wenshuo Li
Tong Lin
Xinghao Chen
VLM
122
1
0
02 Dec 2024
OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows
OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows
Shufan Li
Konstantinos Kallidromitis
Akash Gokul
Zichun Liao
Yusuke Kato
Kazuki Kozuka
Aditya Grover
VGen
180
9
0
02 Dec 2024
CATP-LLM: Empowering Large Language Models for Cost-Aware Tool Planning
CATP-LLM: Empowering Large Language Models for Cost-Aware Tool Planning
Duo Wu
Jiangming Wang
Yuan Meng
Yanning Zhang
Le Sun
Zhi Wang
534
0
0
25 Nov 2024
Bi-Mamba: Towards Accurate 1-Bit State Space Models
Shengkun Tang
Liqun Ma
Haoyang Li
Mingjie Sun
Zhiqiang Shen
Mamba
127
3
0
18 Nov 2024
LLäMmlein: Transparent, Compact and Competitive German-Only Language Models from Scratch
Jan Pfister
Julia Wunderle
Andreas Hotho
133
0
0
17 Nov 2024
SoftLMs: Efficient Adaptive Low-Rank Approximation of Language Models
  using Soft-Thresholding Mechanism
SoftLMs: Efficient Adaptive Low-Rank Approximation of Language Models using Soft-Thresholding Mechanism
Priyansh Bhatnagar
Linfeng Wen
Mingu Kang
41
0
0
15 Nov 2024
FRUGAL: Memory-Efficient Optimization by Reducing State Overhead for
  Scalable Training
FRUGAL: Memory-Efficient Optimization by Reducing State Overhead for Scalable Training
Philip Zmushko
Aleksandr Beznosikov
Martin Takáč
Samuel Horváth
78
2
0
12 Nov 2024
LLM-NEO: Parameter Efficient Knowledge Distillation for Large Language Models
LLM-NEO: Parameter Efficient Knowledge Distillation for Large Language Models
Runming Yang
Taiqiang Wu
Jiahao Wang
Pengfei Hu
Ngai Wong
Yujiu Yang
Yujiu Yang
446
1
0
11 Nov 2024
Privacy Risks of Speculative Decoding in Large Language Models
Privacy Risks of Speculative Decoding in Large Language Models
Jiankun Wei
Abdulrahman Abdulrazzag
Tianchen Zhang
Adel Muursepp
Gururaj Saileshwar
100
2
0
01 Nov 2024
Improving Few-Shot Cross-Domain Named Entity Recognition by Instruction
  Tuning a Word-Embedding based Retrieval Augmented Large Language Model
Improving Few-Shot Cross-Domain Named Entity Recognition by Instruction Tuning a Word-Embedding based Retrieval Augmented Large Language Model
Subhadip Nandi
Neeraj Agrawal
65
0
0
01 Nov 2024
Previous
123456
Next