arXiv:2306.01116
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only
1 June 2023
Guilherme Penedo
Quentin Malartic
Daniel Hesslow
Ruxandra-Aimée Cojocaru
Alessandro Cappelli
Hamza Alobeidli
B. Pannier
Ebtesam Almazrouei
Julien Launay
Papers citing
"The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only"
50 / 587 papers shown
Title
FaceOracle: Chat with a Face Image Oracle
Wassim Kabbani
Kiran Raja
Raghavendra Ramachandra
C. Busch
CVBM
50
0
0
13 Jan 2025
Integrating LLMs with ITS: Recent Advances, Potentials, Challenges, and Future Directions
Doaa Mahmud
Hadeel Hajmohamed
Shamma Almentheri
Shamma Alqaydi
Lameya Aldhaheri
R. A. Khalil
Nasir Saeed
AI4TS
43
5
0
08 Jan 2025
HuRef: HUman-REadable Fingerprint for Large Language Models
Boyi Zeng
Cheng Zhou
Yuncong Hu
Yi Xu
Chenghu Zhou
Xinbing Wang
Yu Yu
Zhouhan Lin
52
9
0
08 Jan 2025
Dataset Decomposition: Faster LLM Training with Variable Sequence Length Curriculum
Hadi Pouransari
Chun-Liang Li
Jen-Hao Rick Chang
Pavan Kumar Anasosalu Vasu
Cem Koc
Vaishaal Shankar
Oncel Tuzel
42
8
0
08 Jan 2025
WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
Haipeng Luo
Qingfeng Sun
Can Xu
Pu Zhao
Jian-Guang Lou
...
Xiubo Geng
Qingwei Lin
Shifeng Chen
Yansong Tang
Dongmei Zhang
OSLM
LRM
110
412
0
03 Jan 2025
Unleashing the Power of Data Tsunami: A Comprehensive Survey on Data Assessment and Selection for Instruction Tuning of Language Models
Yulei Qin
Yuncheng Yang
Pengcheng Guo
Gang Li
Hang Shao
Yuchen Shi
Zihan Xu
Yun Gu
Ke Li
Xing Sun
ALM
96
12
0
31 Dec 2024
System-2 Mathematical Reasoning via Enriched Instruction Tuning
Huanqia Cai
Yijun Yang
Zhifeng Li
LRM
81
0
0
22 Dec 2024
HalluCana: Fixing LLM Hallucination with A Canary Lookahead
Tianyi Li
Erenay Dayanik
Shubhi Tyagi
Andrea Pierleoni
HILM
80
0
0
10 Dec 2024
Label-Confidence-Aware Uncertainty Estimation in Natural Language Generation
Qinhong Lin
Linna Zhou
Zhongliang Yang
Yuang Cai
HILM
82
0
0
10 Dec 2024
Smoothie: Label Free Language Model Routing
Neel Guha
Mayee F. Chen
Trevor Chow
Ishan S. Khare
Christopher Ré
71
4
0
06 Dec 2024
Paint Outside the Box: Synthesizing and Selecting Training Data for Visual Grounding
Zilin Du
Haoxin Li
Jianfei Yu
Boyang Li
182
0
0
01 Dec 2024
The Well: a Large-Scale Collection of Diverse Physics Simulations for Machine Learning
Ruben Ohana
Michael McCabe
Lucas Meyer
Rudy Morel
Fruzsina J. Agocs
...
François Rozet
Liam Parker
M. Cranmer
S. Ho
Shirley Ho
PINN
AI4CE
74
8
1
30 Nov 2024
Curriculum Demonstration Selection for In-Context Learning
Duc Anh Vu
Nguyen Tran Cong Duy
Xiaobao Wu
Hoang Minh Nhat
Du Mingzhe
Nguyen Thanh Thong
Anh Tuan Luu
74
0
0
27 Nov 2024
Low-Bit Quantization Favors Undertrained LLMs: Scaling Laws for Quantized LLMs with 100T Training Tokens
Xu Ouyang
Tao Ge
Thomas Hartvigsen
Zhisong Zhang
Haitao Mi
Dong Yu
MQ
95
4
0
26 Nov 2024
FineWeb-zhtw: Scalable Curation of Traditional Chinese Text Data from the Web
Cheng-Wei Lin
Wan-Hsuan Hsieh
Kai-Xin Guan
Chan-Jan Hsu
Chia-Chen Kuo
Chuan-Lin Lai
Chung-Wei Chung
Ming-Jen Wang
Da-shan Shiu
64
1
0
25 Nov 2024
Predicting Emergent Capabilities by Finetuning
Charlie Snell
Eric Wallace
Dan Klein
Sergey Levine
ELM
LRM
87
5
0
25 Nov 2024
UnifiedCrawl: Aggregated Common Crawl for Affordable Adaptation of LLMs on Low-Resource Languages
Bethel Melesse Tessema
Akhil Kedia
Tae-Sun Chung
77
0
0
21 Nov 2024
Training Bilingual LMs with Data Constraints in the Targeted Language
Skyler Seto
Maartje ter Hoeve
He Bai
Natalie Schluter
David Grangier
86
0
0
20 Nov 2024
Bi-Mamba: Towards Accurate 1-Bit State Space Models
Shengkun Tang
Liqun Ma
Yiming Li
Mingjie Sun
Zhiqiang Shen
Mamba
78
3
0
18 Nov 2024
Zyda-2: a 5 Trillion Token High-Quality Dataset
Yury Tokpanov
Paolo Glorioso
Quentin Anthony
Beren Millidge
44
3
0
09 Nov 2024
Crystal: Illuminating LLM Abilities on Language and Code
Tianhua Tao
Junbo Li
Bowen Tan
Hongyi Wang
William Marshall
...
Joel Hestness
Natalia Vassilieva
Zhiqiang Shen
Eric P. Xing
Zhengzhong Liu
47
4
0
06 Nov 2024
Multilingual Pretraining Using a Large Corpus Machine-Translated from a Single Source Language
Jiayi Wang
Yao Lu
Maurice Weber
Max Ryabinin
Yihong Chen
Raphael Tang
Pontus Stenetorp
LRM
47
1
0
31 Oct 2024
Unlearning as multi-task optimization: A normalized gradient difference approach with an adaptive learning rate
Zhiqi Bu
Xiaomeng Jin
Bhanukiran Vinzamuri
Anil Ramakrishna
Kai-Wei Chang
V. Cevher
Mingyi Hong
MU
91
7
0
29 Oct 2024
Exploring Forgetting in Large Language Model Pre-Training
Chonghua Liao
Ruobing Xie
Xingchen Sun
Haowen Sun
Zhanhui Kang
CLL
41
0
0
22 Oct 2024
Redefining Proactivity for Information Seeking Dialogue
Jing Yang Lee
Seokhwan Kim
Kartik Mehta
Jiun-Yu Kao
Yu-Hsiang Lin
Arpit Gupta
30
0
0
20 Oct 2024
From Test-Taking to Test-Making: Examining LLM Authoring of Commonsense Assessment Items
Melissa Roemmele
Andrew S. Gordon
35
1
0
18 Oct 2024
What's New in My Data? Novelty Exploration via Contrastive Generation
Masaru Isonuma
Ivan Titov
31
0
0
18 Oct 2024
Help Me Identify: Is an LLM+VQA System All We Need to Identify Visual Concepts?
Shailaja Keyur Sampat
Maitreya Patel
Yezhou Yang
Chitta Baral
26
0
0
17 Oct 2024
Harnessing Your DRAM and SSD for Sustainable and Accessible LLM Inference with Mixed-Precision and Multi-level Caching
Jie Peng
Zhang Cao
Huaizhi Qu
Zhengyu Zhang
Chang Guo
Yanyong Zhang
Zhichao Cao
Tianlong Chen
39
2
0
17 Oct 2024
Understanding the Role of LLMs in Multimodal Evaluation Benchmarks
Botian Jiang
Lei Li
Xiaonan Li
Zhaowei Li
Xiachong Feng
Lingpeng Kong
Qiang Liu
Xipeng Qiu
41
2
0
16 Oct 2024
Adaptive Data Optimization: Dynamic Sample Selection with Scaling Laws
Yiding Jiang
Allan Zhou
Zhili Feng
Sadhika Malladi
J. Zico Kolter
41
15
0
15 Oct 2024
Reverse Modeling in Large Language Models
S. Yu
Yuanchen Xu
Cunxiao Du
Yanying Zhou
Minghui Qiu
Q. Sun
Hao Zhang
Jiawei Wu
39
2
0
13 Oct 2024
Parameter-Efficient Fine-Tuning of Large Language Models using Semantic Knowledge Tuning
Nusrat Jahan Prottasha
Asif Mahmud
Md. Shohanur Islam Sobuj
Prakash Bhat
Md. Kowsher
Niloofar Yousefi
O. Garibay
35
4
0
11 Oct 2024
Data Processing for the OpenGPT-X Model Family
Nicolo' Brandizzi
Hammam Abdelwahab
Anirban Bhowmick
Lennard Helmer
Benny Jörg Stein
...
Georg Rehm
Dennis Wegener
Nicolas Flores-Herr
Joachim Kohler
Johannes Leveling
VLM
84
2
0
11 Oct 2024
KV Prediction for Improved Time to First Token
Maxwell Horton
Qingqing Cao
Chenfan Sun
Yanzi Jin
Sachin Mehta
Mohammad Rastegari
Moin Nabi
AI4TS
39
1
0
10 Oct 2024
Self-Boosting Large Language Models with Synthetic Preference Data
Qingxiu Dong
Li Dong
Xingxing Zhang
Zhifang Sui
Furu Wei
SyDa
48
7
0
09 Oct 2024
The Mystery of Compositional Generalization in Graph-based Generative Commonsense Reasoning
Xiyan Fu
Anette Frank
LRM
33
0
0
08 Oct 2024
Training-free LLM-generated Text Detection by Mining Token Probability Sequences
Yihuai Xu
Yongwei Wang
Yifei Bi
Huangsen Cao
Zhouhan Lin
Yu Zhao
Fei Wu
DeLMO
28
0
0
08 Oct 2024
DecorateLM: Data Engineering through Corpus Rating, Tagging, and Editing with Language Models
Ranchi Zhao
Zhen Leng Thai
Yifan Zhang
Shengding Hu
Yunqi Ba
Jie Zhou
Jie Cai
Zhiyuan Liu
Maosong Sun
41
1
0
08 Oct 2024
Data Advisor: Dynamic Data Curation for Safety Alignment of Large Language Models
Fei Wang
Ninareh Mehrabi
Palash Goyal
Rahul Gupta
Kai-Wei Chang
Aram Galstyan
ALM
45
1
0
07 Oct 2024
Falcon Mamba: The First Competitive Attention-free 7B Language Model
Jingwei Zuo
Maksim Velikanov
Dhia Eddine Rhaiem
Ilyas Chahed
Younes Belkada
Guillaume Kunsch
Hakim Hacid
ALM
52
14
0
07 Oct 2024
Initialization of Large Language Models via Reparameterization to Mitigate Loss Spikes
Kosuke Nishida
Kyosuke Nishida
Kuniko Saito
36
1
0
07 Oct 2024
Rule-based Data Selection for Large Language Models
Xiaomin Li
Mingye Gao
Zhiwei Zhang
Chang Yue
Hong Hu
42
5
0
07 Oct 2024
Inner-Probe: Discovering Copyright-related Data Generation in LLM Architecture
Qichao Ma
Rui-Jie Zhu
Peiye Liu
Renye Yan
Fahong Zhang
...
Meng Li
Zhaofei Yu
Zongwei Wang
Yimao Cai
Tiejun Huang
52
0
0
06 Oct 2024
Margin Matching Preference Optimization: Enhanced Model Alignment with Granular Feedback
Kyuyoung Kim
Ah Jeong Seo
Hao Liu
Jinwoo Shin
Kimin Lee
30
2
0
04 Oct 2024
On The Adaptation of Unlimiformer for Decoder-Only Transformers
Kian Ahrabian
Alon Benhaim
Barun Patra
Jay Pujara
Saksham Singhal
Xia Song
46
0
0
02 Oct 2024
Are LLMs Aware that Some Questions are not Open-ended?
Dongjie Yang
Hai Zhao
27
1
0
01 Oct 2024
Scaling Optimal LR Across Token Horizons
Johan Bjorck
Alon Benhaim
Vishrav Chaudhary
Furu Wei
Xia Song
54
4
0
30 Sep 2024
Task-Adaptive Pretrained Language Models via Clustered-Importance Sampling
David Grangier
Simin Fan
Skyler Seto
Pierre Ablin
44
3
0
30 Sep 2024
Exploring Language Model Generalization in Low-Resource Extractive QA
Saptarshi Sengupta
Wenpeng Yin
Preslav Nakov
Shreya Ghosh
Suhang Wang
27
0
0
27 Sep 2024