Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2306.01116
Cited By
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only
1 June 2023
Guilherme Penedo
Quentin Malartic
Daniel Hesslow
Ruxandra-Aimée Cojocaru
Alessandro Cappelli
Hamza Alobeidli
B. Pannier
Ebtesam Almazrouei
Julien Launay
Re-assign community
ArXiv
PDF
HTML
Papers citing
"The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only"
50 / 587 papers shown
Title
Building on Efficient Foundations: Effectively Training LLMs with Structured Feedforward Layers
Xiuying Wei
Skander Moalla
Razvan Pascanu
Çağlar Gülçehre
34
0
0
24 Jun 2024
Compensate Quantization Errors: Make Weights Hierarchical to Compensate Each Other
Yifei Gao
Jie Ou
Lei Wang
Yuting Xiao
Zhiyuan Xiang
Ruiting Dai
Jun Cheng
MQ
36
3
0
24 Jun 2024
Efficient Continual Pre-training by Mitigating the Stability Gap
Yiduo Guo
Jie Fu
Huishuai Zhang
Dongyan Zhao
Songlin Yang
30
13
0
21 Jun 2024
Instruction Pre-Training: Language Models are Supervised Multitask Learners
Daixuan Cheng
Yuxian Gu
Shaohan Huang
Junyu Bi
Minlie Huang
Furu Wei
SyDa
65
20
0
20 Jun 2024
SEC-QA: A Systematic Evaluation Corpus for Financial QA
Viet Dac Lai
Michael Krumdick
Charles Lovering
Varshini Reddy
Craig W. Schmidt
Chris Tanner
56
3
0
20 Jun 2024
Unveiling the Flaws: Exploring Imperfections in Synthetic Data and Mitigation Strategies for Large Language Models
Jie Chen
Yupeng Zhang
Bingning Wang
Wayne Xin Zhao
Ji-Rong Wen
Weipeng Chen
SyDa
42
4
0
18 Jun 2024
LiLiuM: eBay's Large Language Models for e-commerce
Christian Herold
Michael Kozielski
Leonid Ekimov
Pavel Petrushkov
P. Vandenbussche
Shahram Khadivi
43
1
0
17 Jun 2024
MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens
Anas Awadalla
Le Xue
Oscar Lo
Manli Shu
Hannah Lee
...
Silvio Savarese
Caiming Xiong
Ran Xu
Yejin Choi
Ludwig Schmidt
69
25
0
17 Jun 2024
How Good are LLMs at Relation Extraction under Low-Resource Scenario? Comprehensive Evaluation
Dawulie Jinensibieke
M. Maimaiti
Wentao Xiao
Yuanhang Zheng
Xiaobo Wang
54
2
0
17 Jun 2024
CoLoR-Filter: Conditional Loss Reduction Filtering for Targeted Language Model Pre-training
David Brandfonbrener
Hanlin Zhang
Andreas Kirsch
Jonathan Richard Schwarz
Sham Kakade
28
7
0
15 Jun 2024
Retrieval Augmented Fact Verification by Synthesizing Contrastive Arguments
Zhenrui Yue
Huimin Zeng
Lanyu Shang
Yifan Liu
Yang Zhang
Dong Wang
RALM
43
2
0
14 Jun 2024
Advancing Annotation of Stance in Social Media Posts: A Comparative Analysis of Large Language Models and Crowd Sourcing
Mao Li
Frederick Conrad
45
1
0
11 Jun 2024
Crayon: Customized On-Device LLM via Instant Adapter Blending and Edge-Server Hybrid Inference
Jihwan Bang
Juntae Lee
Kyuhong Shim
Seunghan Yang
Simyung Chang
34
5
0
11 Jun 2024
Are Large Language Models Actually Good at Text Style Transfer?
Sourabrata Mukherjee
Atul Kr. Ojha
Ondrej Dusek
31
11
0
09 Jun 2024
SinkLoRA: Enhanced Efficiency and Chat Capabilities for Long-Context Large Language Models
Hengyu Zhang
RALM
47
2
0
09 Jun 2024
How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States
Zhenhong Zhou
Haiyang Yu
Xinghua Zhang
Rongwu Xu
Fei Huang
Yongbin Li
29
30
0
09 Jun 2024
Large Generative Graph Models
Yu Wang
Ryan A. Rossi
Namyong Park
Huiyuan Chen
Nesreen K. Ahmed
Puja Trivedi
Franck Dernoncourt
Danai Koutra
Tyler Derr
AI4CE
39
3
0
07 Jun 2024
ComplexTempQA: A Large-Scale Dataset for Complex Temporal Question Answering
Raphael Gruber
Abdelrahman Abdallah
Michael Färber
Adam Jatowt
35
5
0
07 Jun 2024
Extroversion or Introversion? Controlling The Personality of Your Large Language Models
Yanquan Chen
Zhen Wu
Junjie Guo
Shujian Huang
Xinyu Dai
28
0
0
07 Jun 2024
Time Sensitive Knowledge Editing through Efficient Finetuning
Xiou Ge
Ali Mousavi
Edouard Grave
Armand Joulin
Kun Qian
Benjamin Han
Mostafa Arefiyan
Yunyao Li
KELM
36
7
0
06 Jun 2024
Causal Estimation of Memorisation Profiles
Pietro Lesci
Clara Meister
Thomas Hofmann
Andreas Vlachos
Tiago Pimentel
51
5
0
06 Jun 2024
On The Persona-based Summarization of Domain-Specific Documents
Ankan Mullick
Sombit Bose
Rounak Saha
Ayan Kumar Bhowmick
Pawan Goyal
Niloy Ganguly
Prasenjit Dey
Ravi Kokku
35
2
0
06 Jun 2024
M-QALM: A Benchmark to Assess Clinical Reading Comprehension and Knowledge Recall in Large Language Models via Question Answering
Anand Subramanian
Viktor Schlegel
Abhinav Ramesh Kashyap
Thanh-Tung Nguyen
Vijay Prakash Dwivedi
Stefan Winkler
ELM
LM&MA
AI4MH
33
3
0
06 Jun 2024
Does your data spark joy? Performance gains from domain upsampling at the end of training
Cody Blakeney
Mansheej Paul
Brett W. Larsen
Sean Owen
Jonathan Frankle
29
19
0
05 Jun 2024
PatentEval: Understanding Errors in Patent Generation
You Zuo
Kim Gerdes
Eric Villemonte de la Clergerie
Benoît Sagot
29
1
0
05 Jun 2024
Zyda: A 1.3T Dataset for Open Language Modeling
Yury Tokpanov
Beren Millidge
Paolo Glorioso
Jonathan Pilault
Adam Ibrahim
James Whittington
Quentin Anthony
45
2
0
04 Jun 2024
GOMAA-Geo: GOal Modality Agnostic Active Geo-localization
Anindya Sarkar
S. Sastry
Aleksis Pirinen
Chongjie Zhang
Nathan Jacobs
Yevgeniy Vorobeychik
49
5
0
04 Jun 2024
Towards Transparency: Exploring LLM Trainings Datasets through Visual Topic Modeling and Semantic Frame
Charles de Dampierre
Andrei Mogoutov
Nicolas Baumard
50
1
0
03 Jun 2024
SPOT: Text Source Prediction from Originality Score Thresholding
Edouard Yvinec
Gabriel Kasser
DeLMO
46
0
0
30 May 2024
X-VILA: Cross-Modality Alignment for Large Language Model
Hanrong Ye
De-An Huang
Yao Lu
Zhiding Yu
Ming-Yu Liu
...
Jan Kautz
Song Han
Dan Xu
Pavlo Molchanov
Hongxu Yin
MLLM
VLM
45
30
0
29 May 2024
MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series
Ge Zhang
Scott Qu
Jiaheng Liu
Chenchen Zhang
Chenghua Lin
...
Zi-Kai Zhao
Jiajun Zhang
Wanli Ouyang
Wenhao Huang
Wenhu Chen
ELM
43
44
0
29 May 2024
Understanding Intrinsic Socioeconomic Biases in Large Language Models
Mina Arzaghi
Florian Carichon
G. Farnadi
29
0
0
28 May 2024
Personalized Steering of Large Language Models: Versatile Steering Vectors Through Bi-directional Preference Optimization
Yuanpu Cao
Tianrong Zhang
Bochuan Cao
Ziyi Yin
Lu Lin
Fenglong Ma
Jinghui Chen
LLMSV
37
20
0
28 May 2024
Empowering Character-level Text Infilling by Eliminating Sub-Tokens
Houxing Ren
Mingjie Zhan
Zhongyuan Wu
Hongsheng Li
AI4CE
32
1
0
27 May 2024
Laboratory-Scale AI: Open-Weight Models are Competitive with ChatGPT Even in Low-Resource Settings
Robert Wolfe
Isaac Slaughter
Bin Han
Bingbing Wen
Yiwei Yang
...
Bernease Herman
E. Brown
Zening Qu
Nicholas Weber
Bill Howe
46
4
0
27 May 2024
Zamba: A Compact 7B SSM Hybrid Model
Paolo Glorioso
Quentin G. Anthony
Yury Tokpanov
James Whittington
Jonathan Pilault
Adam Ibrahim
Beren Millidge
30
45
0
26 May 2024
gzip Predicts Data-dependent Scaling Laws
Rohan Pandey
32
10
0
26 May 2024
A Survey of Multimodal Large Language Model from A Data-centric Perspective
Tianyi Bai
Hao Liang
Binwang Wan
Yanran Xu
Xi Li
...
Ping-Chia Huang
Jiulong Shan
Conghui He
Binhang Yuan
Wentao Zhang
58
36
0
26 May 2024
Multi-Reference Preference Optimization for Large Language Models
Hung Le
Quan Tran
D. Nguyen
Kien Do
Saloni Mittal
Kelechi Ogueji
Svetha Venkatesh
65
0
0
26 May 2024
Activator: GLU Activation Function as the Core Component of a Vision Transformer
Abdullah Nazhat Abdullah
Tarkan Aydin
ViT
43
0
0
24 May 2024
GECKO: Generative Language Model for English, Code and Korean
Sungwoo Oh
Donggyu Kim
VLM
35
0
0
24 May 2024
The Mosaic Memory of Large Language Models
Igor Shilov
Matthieu Meeus
Yves-Alexandre de Montjoye
47
3
0
24 May 2024
360Zhinao Technical Report
360Zhinao Team
40
0
0
22 May 2024
AdpQ: A Zero-shot Calibration Free Adaptive Post Training Quantization Method for LLMs
Alireza Ghaffari
Sharareh Younesian
Vahid Partovi Nia
Boxing Chen
M. Asgharian
MQ
55
0
0
22 May 2024
Identifying and Aligning Medical Claims Made on Social Media with Medical Evidence
Anthony James Hughes
Xingyi Song
23
1
0
18 May 2024
Automated Radiology Report Generation: A Review of Recent Advances
Phillip Sloan
Philip Clatworthy
Edwin Simpson
Majid Mirmehdi
32
17
0
17 May 2024
Dynamic data sampler for cross-language transfer learning in large language models
Yudong Li
Yuhao Feng
Wen Zhou
Zhe Zhao
Linlin Shen
Cheng-An Hou
Xianxu Hou
46
0
0
17 May 2024
Simultaneous Masking, Not Prompting Optimization: A Paradigm Shift in Fine-tuning LLMs for Simultaneous Translation
Matthew Raffel
Victor Agostinelli
Lizhong Chen
41
5
0
16 May 2024
Tell Me Why: Explainable Public Health Fact-Checking with Large Language Models
Majid Zarharan
Pascal Wullschleger
Babak Behkam Kia
Mohammad Taher Pilehvar
Jennifer Foster
LRM
ELM
LM&MA
33
3
0
15 May 2024
RAID: A Shared Benchmark for Robust Evaluation of Machine-Generated Text Detectors
Liam Dugan
Alyssa Hwang
Filip Trhlik
Josh Magnus Ludan
Andrew Zhu
Hainiu Xu
Daphne Ippolito
Christopher Callison-Burch
DeLMO
AAML
35
44
0
13 May 2024
Previous
1
2
3
4
5
...
10
11
12
Next