Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2402.00838
Cited By
OLMo: Accelerating the Science of Language Models
1 February 2024
Dirk Groeneveld
Iz Beltagy
Pete Walsh
Akshita Bhagia
Rodney Michael Kinney
Oyvind Tafjord
A. Jha
Hamish Ivison
Ian H. Magnusson
Yizhong Wang
Shane Arora
David Atkinson
Russell Authur
Khyathi Raghavi Chandu
Arman Cohan
Jennifer Dumas
Yanai Elazar
Yuling Gu
Jack Hessel
Tushar Khot
William Merrill
Jacob Morrison
Niklas Muennighoff
Aakanksha Naik
Crystal Nam
Matthew E. Peters
Valentina Pyatkin
Abhilasha Ravichander
Dustin Schwenk
Saurabh Shah
Will Smith
Emma Strubell
Nishant Subramani
Mitchell Wortsman
Pradeep Dasigi
Nathan Lambert
Kyle Richardson
Luke Zettlemoyer
Jesse Dodge
Kyle Lo
Luca Soldaini
Noah A. Smith
Hanna Hajishirzi
OSLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"OLMo: Accelerating the Science of Language Models"
50 / 93 papers shown
Title
Don't be lazy: CompleteP enables compute-efficient deep transformers
Nolan Dey
Bin Claire Zhang
Lorenzo Noci
Mufan Bill Li
Blake Bordelon
Shane Bergsma
C. Pehlevan
Boris Hanin
Joel Hestness
39
0
0
02 May 2025
FineScope : Precision Pruning for Domain-Specialized Large Language Models Using SAE-Guided Self-Data Cultivation
Chaitali Bhattacharyya
Yeseong Kim
45
0
0
01 May 2025
ICL CIPHERS: Quantifying "Learning'' in In-Context Learning via Substitution Ciphers
Zhouxiang Fang
Aayush Mishra
Muhan Gao
Anqi Liu
Daniel Khashabi
44
0
0
28 Apr 2025
Evaluating Evaluation Metrics -- The Mirage of Hallucination Detection
Atharva Kulkarni
Yuan-kang Zhang
Joel Ruben Antony Moniz
Xiou Ge
Bo-Hsiang Tseng
Dhivya Piraviperumal
S.
Hong-ye Yu
HILM
76
0
0
25 Apr 2025
On Linear Representations and Pretraining Data Frequency in Language Models
Jack Merullo
Noah A. Smith
Sarah Wiegreffe
Yanai Elazar
35
0
0
16 Apr 2025
SEA-LION: Southeast Asian Languages in One Network
Raymond Ng
Thanh Ngan Nguyen
Yuli Huang
Ngee Chia Tai
Wai Yi Leong
...
David Ong Tat-Wee
B. Liu
William-Chandra Tjhi
Erik Cambria
Leslie Teo
36
11
0
08 Apr 2025
Not All Data Are Unlearned Equally
Aravind Krishnan
Siva Reddy
Marius Mosbach
MU
128
0
0
07 Apr 2025
ZJUKLAB at SemEval-2025 Task 4: Unlearning via Model Merging
Haoming Xu
Shuxun Wang
Yanqiu Zhao
Yi Zhong
Ziyan Jiang
Ningyuan Zhao
Shumin Deng
H. Chen
N. Zhang
MoMe
MU
67
0
0
27 Mar 2025
Cross-Tokenizer Distillation via Approximate Likelihood Matching
Benjamin Minixhofer
Ivan Vulić
E. Ponti
134
0
0
25 Mar 2025
The Lucie-7B LLM and the Lucie Training Dataset: Open resources for multilingual language generation
Olivier Gouvert
Julie Hunter
Jérôme Louradour
Christophe Cerisara
Evan Dufraisse
Yaya Sy
Laura Rivière
Jean-Pierre Lorré
OpenLLM-France community
141
0
0
15 Mar 2025
Are formal and functional linguistic mechanisms dissociated in language models?
Michael Hanna
Sandro Pezzelle
Yonatan Belinkov
45
0
0
14 Mar 2025
Evaluating and Aligning Human Economic Risk Preferences in LLMs
J. Liu
Yi Yang
K. Tam
62
0
0
09 Mar 2025
Triple Phase Transitions: Understanding the Learning Dynamics of Large Language Models from a Neuroscience Perspective
Yuko Nakagi
Keigo Tada
Sota Yoshino
Shinji Nishimoto
Yu Takagi
LRM
37
0
0
28 Feb 2025
FOReCAst: The Future Outcome Reasoning and Confidence Assessment Benchmark
Zhangdie Yuan
Zifeng Ding
Andreas Vlachos
AI4TS
77
0
0
27 Feb 2025
Shades of Zero: Distinguishing Impossibility from Inconceivability
Jennifer Hu
Felix Sosa
T. Ullman
37
0
0
27 Feb 2025
SR-LLM: Rethinking the Structured Representation in Large Language Model
Jiahuan Zhang
Tianheng Wang
Hanqing Wu
Ziyi Huang
Yulong Wu
Dongbai Chen
Linfeng Song
Yue Zhang
Guozheng Rao
Kaicheng Yu
42
1
0
21 Feb 2025
LUME: LLM Unlearning with Multitask Evaluations
Anil Ramakrishna
Yixin Wan
Xiaomeng Jin
Kai-Wei Chang
Zhiqi Bu
Bhanukiran Vinzamuri
V. Cevher
Mingyi Hong
Rahul Gupta
CLL
MU
101
7
0
20 Feb 2025
Elucidating Mechanisms of Demographic Bias in LLMs for Healthcare
Hiba Ahsan
Arnab Sen Sharma
Silvio Amir
David Bau
Byron C. Wallace
83
0
0
20 Feb 2025
Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs
Longxu Dou
Qian Liu
Fan Zhou
Changyu Chen
Zili Wang
...
Tianyu Pang
Chao Du
Xinyi Wan
Wei Lu
Min Lin
92
1
0
18 Feb 2025
Prediction hubs are context-informed frequent tokens in LLMs
Beatrix M. G. Nielsen
Iuri Macocco
Marco Baroni
126
1
0
17 Feb 2025
Steel-LLM:From Scratch to Open Source -- A Personal Journey in Building a Chinese-Centric LLM
Qingshui Gu
Shu Li
Tianyu Zheng
Zhaoxiang Zhang
192
0
0
10 Feb 2025
EfficientLLM: Scalable Pruning-Aware Pretraining for Architecture-Agnostic Edge Language Models
Xingrun Xing
Zheng Liu
Shitao Xiao
Boyan Gao
Yiming Liang
Wanpeng Zhang
Haokun Lin
Guoqi Li
Jiajun Zhang
LRM
56
1
0
10 Feb 2025
Language Models Largely Exhibit Human-like Constituent Ordering Preferences
Ada Defne Tur
Gaurav Kamath
Siva Reddy
57
0
0
08 Feb 2025
Self-Rationalization in the Wild: A Large Scale Out-of-Distribution Evaluation on NLI-related tasks
Jing Yang
Max Glockner
Anderson de Rezende Rocha
Iryna Gurevych
LRM
65
1
0
07 Feb 2025
MergeME: Model Merging Techniques for Homogeneous and Heterogeneous MoEs
Yuhang Zhou
Giannis Karamanolakis
Victor Soto
Anna Rumshisky
Mayank Kulkarni
Furong Huang
Wei Ai
Jianhua Lu
MoMe
104
0
0
03 Feb 2025
The Pitfalls of "Security by Obscurity" And What They Mean for Transparent AI
Peter Hall
Olivia Mundahl
Sunoo Park
71
0
0
30 Jan 2025
Baichuan-Omni-1.5 Technical Report
Yadong Li
J. Liu
Tao Zhang
Tao Zhang
S. Chen
...
Jianhua Xu
Haoze Sun
Mingan Lin
Zenan Zhou
Weipeng Chen
AuLLM
72
10
0
28 Jan 2025
LLMs as Repositories of Factual Knowledge: Limitations and Solutions
Seyed Mahed Mousavi
Simone Alghisi
Giuseppe Riccardi
KELM
47
0
0
22 Jan 2025
The potential -- and the pitfalls -- of using pre-trained language models as cognitive science theories
Raj Sanjay Shah
Sashank Varma
LRM
89
0
0
22 Jan 2025
Human-like conceptual representations emerge from language prediction
Ningyu Xu
Qi Zhang
Chao Du
Qiang Luo
Xipeng Qiu
Xuanjing Huang
Menghan Zhang
70
0
0
21 Jan 2025
Merging Feed-Forward Sublayers for Compressed Transformers
Neha Verma
Kenton W. Murray
Kevin Duh
AI4CE
50
0
0
10 Jan 2025
Scaling Laws for Floating Point Quantization Training
X. Sun
Shuaipeng Li
Ruobing Xie
Weidong Han
Kan Wu
...
Yangyu Tao
Zhanhui Kang
C. Xu
Di Wang
Jie Jiang
MQ
AIFin
60
0
0
05 Jan 2025
Out-of-distribution generalization via composition: a lens through induction heads in Transformers
Jiajun Song
Zhuoyan Xu
Yiqiao Zhong
80
4
0
31 Dec 2024
Adaptive Batch Size Schedules for Distributed Training of Language Models with Data and Model Parallelism
Tim Tsz-Kit Lau
Weijian Li
Chenwei Xu
Han Liu
Mladen Kolar
138
0
0
30 Dec 2024
Profiling Bias in LLMs: Stereotype Dimensions in Contextual Word Embeddings
Carolin M. Schuster
Maria-Alexandra Dinisor
Shashwat Ghatiwala
Georg Groh
75
1
0
25 Nov 2024
Hardware Scaling Trends and Diminishing Returns in Large-Scale Distributed Training
Jared Fernandez
Luca Wehrstedt
Leonid Shamis
Mostafa Elhoushi
Kalyan Saladi
Yonatan Bisk
Emma Strubell
Jacob Kahn
177
3
0
20 Nov 2024
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models
Siming Huang
Tianhao Cheng
J.K. Liu
Jiaran Hao
L. Song
...
Ge Zhang
Zili Wang
Yuan Qi
Yinghui Xu
Wei Chu
ALM
75
17
0
07 Nov 2024
Smaller Large Language Models Can Do Moral Self-Correction
Guangliang Liu
Zhiyu Xue
Rongrong Wang
K. Johnson
Kristen Marie Johnson
LRM
23
0
0
30 Oct 2024
How Does Critical Batch Size Scale in Pre-training?
Hanlin Zhang
Depen Morwani
Nikhil Vyas
Jingfeng Wu
Difan Zou
Udaya Ghai
Dean Phillips Foster
Sham Kakade
72
8
0
29 Oct 2024
AAAR-1.0: Assessing AI's Potential to Assist Research
Renze Lou
Hanzi Xu
Sijia Wang
Jiangshu Du
Ryo Kamoi
...
Xi Li
K. Zhang
Congying Xia
Lifu Huang
Wenpeng Yin
35
5
0
29 Oct 2024
COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training
Haocheng Xi
Han Cai
Ligeng Zhu
Y. Lu
Kurt Keutzer
Jianfei Chen
Song Han
MQ
63
9
0
25 Oct 2024
Cross-lingual Transfer of Reward Models in Multilingual Alignment
Jiwoo Hong
Noah Lee
Rodrigo Martínez-Castaño
César Rodríguez
James Thorne
46
4
0
23 Oct 2024
MiniPLM: Knowledge Distillation for Pre-Training Language Models
Yuxian Gu
Hao Zhou
Fandong Meng
Jie Zhou
Minlie Huang
65
5
0
22 Oct 2024
Decoding Time Series with LLMs: A Multi-Agent Framework for Cross-Domain Annotation
M. Lin
Z. Chen
Yanchi Liu
Xujiang Zhao
Zongyu Wu
Junxiang Wang
Xiang Zhang
Suhang Wang
Haifeng Chen
AI4TS
28
7
0
22 Oct 2024
Self-calibration for Language Model Quantization and Pruning
Miles Williams
G. Chrysostomou
Nikolaos Aletras
MQ
117
0
0
22 Oct 2024
Scaling Laws for Predicting Downstream Performance in LLMs
Yangyi Chen
Binxuan Huang
Yifan Gao
Zhengyang Wang
Jingfeng Yang
Heng Ji
LRM
43
8
0
11 Oct 2024
Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization
Noam Razin
Sadhika Malladi
Adithya Bhaskar
Danqi Chen
Sanjeev Arora
Boris Hanin
91
14
0
11 Oct 2024
Detecting Training Data of Large Language Models via Expectation Maximization
Gyuwan Kim
Yang Li
Evangelia Spiliopoulou
Jie Ma
Miguel Ballesteros
William Yang Wang
MIALM
92
3
2
10 Oct 2024
Dissecting Fine-Tuning Unlearning in Large Language Models
Yihuai Hong
Yuelin Zou
Lijie Hu
Ziqian Zeng
Di Wang
Haiqin Yang
AAML
MU
37
2
0
09 Oct 2024
Data Selection via Optimal Control for Language Models
Yuxian Gu
Li Dong
Hongning Wang
Y. Hao
Qingxiu Dong
Furu Wei
Minlie Huang
AI4CE
50
4
0
09 Oct 2024
1
2
Next