ResearchTrend.AI
Stable LM 2 1.6B Technical Report

27 February 2024
Marco Bellagente
J. Tow
Dakota Mahan
Duy Phung
Maksym Zhuravinskyi
Reshinth Adithyan
James Baicoianu
Ben Brooks
Nathan Cooper
Ashish Datta
Meng Lee
Emad Mostaque
Michael Pieler
Nikhil Pinnaparaju
Paulo Rocha
Harry Saini
H. Teufel
Niccolò Zanichelli
Carlos Riquelme
    OSLM
arXiv: 2402.17834

Papers citing "Stable LM 2 1.6B Technical Report"

45 / 45 papers shown
Challenging GPU Dominance: When CPUs Outperform for On-Device LLM Inference
Haolin Zhang
Jeff Huang
32
0
0
09 May 2025
Reasoning Capabilities and Invariability of Large Language Models
Alessandro Raganato
Rafael Peñaloza
Marco Viviani
G. Pasi
ReLM
LRM
82
0
0
01 May 2025
Outlier dimensions favor frequent tokens in language models
Iuri Macocco
Nora Graichen
Gemma Boleda
Marco Baroni
55
0
0
27 Mar 2025
PLM: Efficient Peripheral Language Models Hardware-Co-Designed for Ubiquitous Computing
Cheng Deng
Luoyang Sun
Jiwen Jiang
Yongcheng Zeng
Xinjian Wu
...
Haoyang Li
Lei Chen
Lionel M. Ni
Hongzhi Zhang
Jun Wang
168
0
0
15 Mar 2025
Taming Knowledge Conflicts in Language Models
Gaotang Li
Yuzhong Chen
Hanghang Tong
KELM
49
1
0
14 Mar 2025
RigoChat 2: an adapted language model to Spanish using a bounded dataset and reduced hardware
Gonzalo Santamaría Gómez
Guillem García Subies
Pablo Gutiérrez Ruiz
Mario González Valero
Natàlia Fuertes
...
Nuria Aldama García
David Betancur Sánchez
Kateryna Sushkova
Marta Guerrero Nieto
Á. Jiménez
51
0
0
11 Mar 2025
DeRS: Towards Extremely Efficient Upcycled Mixture-of-Experts Models
Y. Huang
Peng Ye
Chenyu Huang
Jianjian Cao
Lin Zhang
Baopu Li
Gang Yu
Tao Chen
MoMe
MoE
55
1
0
03 Mar 2025
WildLong: Synthesizing Realistic Long-Context Instruction Data at Scale
Jiaxi Li
Xingxing Zhang
Xun Wang
Xiaolong Huang
Li Dong
Liang Wang
Si-Qing Chen
Wei Lu
Furu Wei
SyDa
166
0
0
23 Feb 2025
Bridging the Safety Gap: A Guardrail Pipeline for Trustworthy LLM Inferences
Shanshan Han
Salman Avestimehr
Chaoyang He
76
0
0
12 Feb 2025
TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models
Makoto Shing
Kou Misaki
Han Bao
Sho Yokoi
Takuya Akiba
VLM
57
1
0
28 Jan 2025
Adaptive Batch Size Schedules for Distributed Training of Language Models with Data and Model Parallelism
Tim Tsz-Kit Lau
Weijian Li
Chenwei Xu
Han Liu
Mladen Kolar
147
0
0
30 Dec 2024
Tiny-Align: Bridging Automatic Speech Recognition and Large Language Model on the Edge
Ruiyang Qin
Dancheng Liu
Gelei Xu
Zheyu Yan
Chenhui Xu
Yuting Hu
Xiaolin Hu
Jinjun Xiong
Yiyu Shi
AuLLM
110
1
0
21 Nov 2024
Hymba: A Hybrid-head Architecture for Small Language Models
Xin Dong
Y. Fu
Shizhe Diao
Wonmin Byeon
Zijia Chen
...
Min-Hung Chen
Yoshi Suhara
Y. Lin
Jan Kautz
Pavlo Molchanov
Mamba
100
21
0
20 Nov 2024
Skywork-Reward: Bag of Tricks for Reward Modeling in LLMs
Chris Liu
Liang Zeng
Jiaheng Liu
Rui Yan
Jujie He
Chaojie Wang
Shuicheng Yan
Yang Liu
Yahui Zhou
AI4TS
48
63
0
24 Oct 2024
MiniPLM: Knowledge Distillation for Pre-Training Language Models
Yuxian Gu
Hao Zhou
Fandong Meng
Jie Zhou
Minlie Huang
67
5
0
22 Oct 2024
BenTo: Benchmark Task Reduction with In-Context Transferability
Hongyu Zhao
Ming Li
Lichao Sun
Tianyi Zhou
33
0
0
17 Oct 2024
Taming Overconfidence in LLMs: Reward Calibration in RLHF
Jixuan Leng
Chengsong Huang
Banghua Zhu
Jiaxin Huang
34
7
0
13 Oct 2024
PoisonBench: Assessing Large Language Model Vulnerability to Data Poisoning
Tingchen Fu
Mrinank Sharma
Philip H. S. Torr
Shay B. Cohen
David M. Krueger
Fazl Barez
AAML
47
7
0
11 Oct 2024
An evaluation of LLM code generation capabilities through graded exercises
Álvaro Barbero Jiménez
ELM
31
1
0
06 Oct 2024
Exploring Description-Augmented Dataless Intent Classification
Ruoyu Hu
Foaad Khosmood
Abbas Edalat
AI4TS
39
0
0
25 Jul 2024
Are Large Language Models Really Bias-Free? Jailbreak Prompts for Assessing Adversarial Robustness to Bias Elicitation
Riccardo Cantini
Giada Cosenza
A. Orsino
Domenico Talia
AAML
57
5
0
11 Jul 2024
Training on the Test Task Confounds Evaluation and Emergence
Ricardo Dominguez-Olmedo
Florian E. Dorner
Moritz Hardt
ELM
71
7
1
10 Jul 2024
$\text{Memory}^3$: Language Modeling with Explicit Memory
Hongkang Yang
Zehao Lin
Wenjin Wang
Hao Wu
Zhiyu Li
...
Yu Yu
Kai Chen
Feiyu Xiong
Linpeng Tang
Weinan E
50
11
0
01 Jul 2024
Resolving Discrepancies in Compute-Optimal Scaling of Language Models
Tomer Porian
Mitchell Wortsman
J. Jitsev
Ludwig Schmidt
Y. Carmon
60
20
0
27 Jun 2024
Benchmarking Uncertainty Quantification Methods for Large Language Models with LM-Polygraph
Roman Vashurin
Ekaterina Fadeeva
Artem Vazhentsev
Akim Tsvigun
Daniil Vasilev
...
Timothy Baldwin
Maxim Panov
Artem Shelmanov
HILM
68
8
0
21 Jun 2024
African or European Swallow? Benchmarking Large Vision-Language Models for Fine-Grained Object Classification
Gregor Geigle
Radu Timofte
Goran Glavas
31
10
0
20 Jun 2024
TorchOpera: A Compound AI System for LLM Safety
Shanshan Han
Yuhang Yao
Zijian Hu
Dimitris Stripelis
Zhaozhuo Xu
Chaoyang He
LLMAG
41
0
0
16 Jun 2024
OLMES: A Standard for Language Model Evaluations
Yuling Gu
Oyvind Tafjord
Bailey Kuehl
Dany Haddad
Jesse Dodge
Hannaneh Hajishirzi
ELM
40
14
0
12 Jun 2024
Empirical Guidelines for Deploying LLMs onto Resource-constrained Edge Devices
Ruiyang Qin
Dancheng Liu
Zheyu Yan
Zhaoxuan Tan
Zixuan Pan
Zhenge Jia
Meng Jiang
Ahmed Abbasi
Jinjun Xiong
Yiyu Shi
59
10
0
06 Jun 2024
fMRI predictors based on language models of increasing complexity recover brain left lateralization
Laurent Bonnasse-Gahot
Christophe Pallier
52
3
0
28 May 2024
ConStat: Performance-Based Contamination Detection in Large Language Models
Jasper Dekoninck
Mark Niklas Muller
Martin Vechev
37
6
0
25 May 2024
Mitigating Quantization Errors Due to Activation Spikes in GLU-Based LLMs
Jaewoo Yang
Hayun Kim
Younghoon Kim
39
12
0
23 May 2024
Dense Connector for MLLMs
Huanjin Yao
Wenhao Wu
Taojiannan Yang
Yuxin Song
Mengxi Zhang
Haocheng Feng
Yifan Sun
Zhiheng Li
Wanli Ouyang
Jingdong Wang
MLLM
VLM
42
16
0
22 May 2024
OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning
Dan Qiao
Yi Su
Pinzheng Wang
Jing Ye
Wen Xie
...
Wenliang Chen
Guohong Fu
Guodong Zhou
Qiaoming Zhu
Min Zhang
MQ
35
0
0
09 May 2024
Fishing for Magikarp: Automatically Detecting Under-trained Tokens in Large Language Models
Sander Land
Max Bartolo
28
21
0
08 May 2024
ChuXin: 1.6B Technical Report
Xiaomin Zhuang
Yufan Jiang
Qiaozhi He
Zhihua Wu
ALM
43
0
0
08 May 2024
A Survey on Efficient Inference for Large Language Models
Zixuan Zhou
Xuefei Ning
Ke Hong
Tianyu Fu
Jiaming Xu
...
Shengen Yan
Guohao Dai
Xiao-Ping Zhang
Yuhan Dong
Yu-Xiang Wang
46
83
0
22 Apr 2024
Binary Classifier Optimization for Large Language Model Alignment
Seungjae Jung
Gunsoo Han
D. W. Nam
Kyoung-Woon On
34
21
0
06 Apr 2024
Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference
Han Zhao
Min Zhang
Wei Zhao
Pengxiang Ding
Siteng Huang
Donglin Wang
Mamba
46
65
0
21 Mar 2024
RewardBench: Evaluating Reward Models for Language Modeling
Nathan Lambert
Valentina Pyatkin
Jacob Morrison
Lester James Validad Miranda
Bill Yuchen Lin
...
Sachin Kumar
Tom Zick
Yejin Choi
Noah A. Smith
Hanna Hajishirzi
ALM
82
214
0
20 Mar 2024
Fine Tuning vs. Retrieval Augmented Generation for Less Popular Knowledge
Heydar Soudani
Evangelos Kanoulas
Faegheh Hasibi
28
28
0
03 Mar 2024
Efficient Parallelization Layouts for Large-Scale Distributed Model Training
Johannes Hagemann
Samuel Weinbach
Konstantin Dobler
Maximilian Schall
Gerard de Melo
LRM
37
6
0
09 Nov 2023
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
256
1,996
0
31 Dec 2020
Temporal Reasoning on Implicit Events from Distant Supervision
Ben Zhou
Kyle Richardson
Qiang Ning
Tushar Khot
Ashish Sabharwal
Dan Roth
167
73
0
24 Oct 2020
Efficient Intent Detection with Dual Sentence Encoders
I. Casanueva
Tadas Temčinas
D. Gerz
Matthew Henderson
Ivan Vulić
VLM
180
453
0
10 Mar 2020