2401.16640
TeenyTinyLlama: open-source tiny language models trained in Brazilian Portuguese
30 January 2024
N. Corrêa
Sophia Falk
Shiza Fatimah
Aniket Sen
N. D. Oliveira
Papers citing
"TeenyTinyLlama: open-source tiny language models trained in Brazilian Portuguese"
37 / 37 papers shown
Title
TinyLlama: An Open-Source Small Language Model
Peiyuan Zhang
Guangtao Zeng
Tianduo Wang
Wei Lu
ALM
LRM
145
406
0
04 Jan 2024
A Technical Report for Polyglot-Ko: Open-Source Large-Scale Korean Language Models
H. Ko
Kichang Yang
Minho Ryu
Taekyoon Choi
Seungmu Yang
Jiwung Hyun
Sung-Yong Park
Kyubyong Park
83
30
0
04 Jun 2023
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Ji Lin
Jiaming Tang
Haotian Tang
Shang Yang
Wei-Ming Chen
Wei-Chen Wang
Guangxuan Xiao
Xingyu Dang
Chuang Gan
Song Han
EDL
MQ
95
576
0
01 Jun 2023
Bactrian-X: Multilingual Replicable Instruction-Following Models with Low-Rank Adaptation
Haonan Li
Fajri Koto
Minghao Wu
Alham Fikri Aji
Timothy Baldwin
ALM
62
75
0
24 May 2023
To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis
Fuzhao Xue
Yao Fu
Wangchunshu Zhou
Zangwei Zheng
Yang You
118
84
0
22 May 2023
RWKV: Reinventing RNNs for the Transformer Era
Bo Peng
Eric Alcaide
Quentin G. Anthony
Alon Albalak
Samuel Arcadinho
...
Qihang Zhao
P. Zhou
Qinghua Zhou
Jian Zhu
Rui-Jie Zhu
227
607
0
22 May 2023
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BigScience Workshop:
Teven Le Scao
Angela Fan
Christopher Akiki
...
Zhongli Xie
Zifan Ye
M. Bras
Younes Belkada
Thomas Wolf
VLM
397
2,392
0
09 Nov 2022
Estimating the Carbon Footprint of BLOOM, a 176B Parameter Language Model
A. Luccioni
S. Viguier
Anne-Laure Ligozat
99
286
0
03 Nov 2022
Scaling Instruction-Finetuned Language Models
Hyung Won Chung
Le Hou
Shayne Longpre
Barret Zoph
Yi Tay
...
Jacob Devlin
Adam Roberts
Denny Zhou
Quoc V. Le
Jason W. Wei
ReLM
LRM
194
3,146
0
20 Oct 2022
mGPT: Few-Shot Learners Go Multilingual
Oleh Shliazhko
Alena Fenogenova
Maria Tikhonova
Vladislav Mikhailov
Anastasia Kozlova
Tatiana Shavrina
87
154
0
15 Apr 2022
GPT-NeoX-20B: An Open-Source Autoregressive Language Model
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
...
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach
178
835
0
14 Apr 2022
Training Compute-Optimal Large Language Models
Jordan Hoffmann
Sebastian Borgeaud
A. Mensch
Elena Buchatskaya
Trevor Cai
...
Karen Simonyan
Erich Elsen
Jack W. Rae
Oriol Vinyals
Laurent Sifre
AI4TS
208
1,976
0
29 Mar 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
883
13,148
0
04 Mar 2022
A General Language Assistant as a Laboratory for Alignment
Amanda Askell
Yuntao Bai
Anna Chen
Dawn Drain
Deep Ganguli
...
Tom B. Brown
Jack Clark
Sam McCandlish
C. Olah
Jared Kaplan
ALM
118
789
0
01 Dec 2021
8-bit Optimizers via Block-wise Quantization
Tim Dettmers
M. Lewis
Sam Shleifer
Luke Zettlemoyer
MQ
117
302
0
06 Oct 2021
Datasets: A Community Library for Natural Language Processing
Quentin Lhoest
Albert Villanova del Moral
Yacine Jernite
A. Thakur
Patrick von Platen
...
Thibault Goehringer
Victor Mustar
François Lagunas
Alexander M. Rush
Thomas Wolf
218
614
0
07 Sep 2021
MarIA: Spanish Language Models
Asier Gutiérrez-Fandiño
Jordi Armengol-Estapé
Marc Pàmies
Joan Llop-Palao
Joaquín Silveira-Ocampo
C. Carrino
Aitor Gonzalez-Agirre
Carme Armentano-Oller
Carlos Rodríguez-Penagos
Marta Villegas
VLM
51
119
0
15 Jul 2021
Deduplicating Training Data Makes Language Models Better
Katherine Lee
Daphne Ippolito
A. Nystrom
Chiyuan Zhang
Douglas Eck
Chris Callison-Burch
Nicholas Carlini
SyDa
360
634
0
14 Jul 2021
RoFormer: Enhanced Transformer with Rotary Position Embedding
Jianlin Su
Yu Lu
Shengfeng Pan
Ahmed Murtadha
Bo Wen
Yunfeng Liu
284
2,521
0
20 Apr 2021
IndT5: A Text-to-Text Transformer for 10 Indigenous Languages
El Moatez Billah Nagoudi
Wei-Rui Chen
Muhammad Abdul-Mageed
H. Cavusoglu
63
24
0
04 Apr 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
455
2,120
0
31 Dec 2020
AraGPT2: Pre-Trained Transformer for Arabic Language Generation
Wissam Antoun
Fady Baly
Hazem M. Hajj
VLM
53
105
0
31 Dec 2020
GottBERT: a pure German Language Model
Raphael Scheible
Fabian Thomczyk
P. Tippmann
V. Jaravine
M. Boeker
VLM
43
80
0
03 Dec 2020
A Monolingual Approach to Contextualized Word Embeddings for Mid-Resource Languages
Pedro Ortiz Suarez
Laurent Romary
Benoît Sagot
64
231
0
11 Jun 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
608
4,893
0
23 Jan 2020
PyTorch: An Imperative Style, High-Performance Deep Learning Library
Adam Paszke
Sam Gross
Francisco Massa
Adam Lerer
James Bradbury
...
Sasank Chilamkurthy
Benoit Steiner
Lu Fang
Junjie Bai
Soumith Chintala
ODL
529
42,559
0
03 Dec 2019
CamemBERT: a Tasty French Language Model
Louis Martin
Benjamin Muller
Pedro Ortiz Suarez
Yoann Dupont
Laurent Romary
Eric Villemonte de la Clergerie
Djamé Seddah
Benoît Sagot
117
975
0
10 Nov 2019
Unsupervised Cross-lingual Representation Learning at Scale
Alexis Conneau
Kartikay Khandelwal
Naman Goyal
Vishrav Chaudhary
Guillaume Wenzek
Francisco Guzmán
Edouard Grave
Myle Ott
Luke Zettlemoyer
Veselin Stoyanov
228
6,585
0
05 Nov 2019
CCNet: Extracting High Quality Monolingual Datasets from Web Crawl Data
Guillaume Wenzek
Marie-Anne Lachaux
Alexis Conneau
Vishrav Chaudhary
Francisco Guzmán
Armand Joulin
Edouard Grave
95
658
0
01 Nov 2019
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
459
20,298
0
23 Oct 2019
Quantifying the Carbon Emissions of Machine Learning
Alexandre Lacoste
A. Luccioni
Victor Schmidt
Thomas Dandres
109
708
0
21 Oct 2019
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
Mohammad Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
334
1,914
0
17 Sep 2019
MultiFiT: Efficient Multi-lingual Language Model Fine-tuning
Julian Martin Eisenschlos
Sebastian Ruder
Piotr Czapla
Marcin Kardas
Sylvain Gugger
Jeremy Howard
49
99
0
10 Sep 2019
Cosmos QA: Machine Reading Comprehension with Contextual Commonsense Reasoning
Lifu Huang
Ronan Le Bras
Chandra Bhagavatula
Yejin Choi
AIMat
RALM
LRM
115
457
0
31 Aug 2019
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
AIMat
674
24,541
0
26 Jul 2019
SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing
Taku Kudo
John Richardson
201
3,526
0
19 Aug 2018
Adafactor: Adaptive Learning Rates with Sublinear Memory Cost
Noam M. Shazeer
Mitchell Stern
ODL
84
1,051
0
11 Apr 2018