ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2306.01116
  4. Cited By
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora
  with Web Data, and Web Data Only

The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only

1 June 2023
Guilherme Penedo
Quentin Malartic
Daniel Hesslow
Ruxandra-Aimée Cojocaru
Alessandro Cappelli
Hamza Alobeidli
B. Pannier
Ebtesam Almazrouei
Julien Launay
ArXivPDFHTML

Papers citing "The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only"

50 / 587 papers shown
Title
Building on Efficient Foundations: Effectively Training LLMs with
  Structured Feedforward Layers
Building on Efficient Foundations: Effectively Training LLMs with Structured Feedforward Layers
Xiuying Wei
Skander Moalla
Razvan Pascanu
Çağlar Gülçehre
34
0
0
24 Jun 2024
Compensate Quantization Errors: Make Weights Hierarchical to Compensate
  Each Other
Compensate Quantization Errors: Make Weights Hierarchical to Compensate Each Other
Yifei Gao
Jie Ou
Lei Wang
Yuting Xiao
Zhiyuan Xiang
Ruiting Dai
Jun Cheng
MQ
36
3
0
24 Jun 2024
Efficient Continual Pre-training by Mitigating the Stability Gap
Efficient Continual Pre-training by Mitigating the Stability Gap
Yiduo Guo
Jie Fu
Huishuai Zhang
Dongyan Zhao
Songlin Yang
30
13
0
21 Jun 2024
Instruction Pre-Training: Language Models are Supervised Multitask
  Learners
Instruction Pre-Training: Language Models are Supervised Multitask Learners
Daixuan Cheng
Yuxian Gu
Shaohan Huang
Junyu Bi
Minlie Huang
Furu Wei
SyDa
65
20
0
20 Jun 2024
SEC-QA: A Systematic Evaluation Corpus for Financial QA
SEC-QA: A Systematic Evaluation Corpus for Financial QA
Viet Dac Lai
Michael Krumdick
Charles Lovering
Varshini Reddy
Craig W. Schmidt
Chris Tanner
56
3
0
20 Jun 2024
Unveiling the Flaws: Exploring Imperfections in Synthetic Data and
  Mitigation Strategies for Large Language Models
Unveiling the Flaws: Exploring Imperfections in Synthetic Data and Mitigation Strategies for Large Language Models
Jie Chen
Yupeng Zhang
Bingning Wang
Wayne Xin Zhao
Ji-Rong Wen
Weipeng Chen
SyDa
42
4
0
18 Jun 2024
LiLiuM: eBay's Large Language Models for e-commerce
LiLiuM: eBay's Large Language Models for e-commerce
Christian Herold
Michael Kozielski
Leonid Ekimov
Pavel Petrushkov
P. Vandenbussche
Shahram Khadivi
43
1
0
17 Jun 2024
MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal
  Dataset with One Trillion Tokens
MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens
Anas Awadalla
Le Xue
Oscar Lo
Manli Shu
Hannah Lee
...
Silvio Savarese
Caiming Xiong
Ran Xu
Yejin Choi
Ludwig Schmidt
69
25
0
17 Jun 2024
How Good are LLMs at Relation Extraction under Low-Resource Scenario?
  Comprehensive Evaluation
How Good are LLMs at Relation Extraction under Low-Resource Scenario? Comprehensive Evaluation
Dawulie Jinensibieke
M. Maimaiti
Wentao Xiao
Yuanhang Zheng
Xiaobo Wang
54
2
0
17 Jun 2024
CoLoR-Filter: Conditional Loss Reduction Filtering for Targeted Language
  Model Pre-training
CoLoR-Filter: Conditional Loss Reduction Filtering for Targeted Language Model Pre-training
David Brandfonbrener
Hanlin Zhang
Andreas Kirsch
Jonathan Richard Schwarz
Sham Kakade
28
7
0
15 Jun 2024
Retrieval Augmented Fact Verification by Synthesizing Contrastive
  Arguments
Retrieval Augmented Fact Verification by Synthesizing Contrastive Arguments
Zhenrui Yue
Huimin Zeng
Lanyu Shang
Yifan Liu
Yang Zhang
Dong Wang
RALM
43
2
0
14 Jun 2024
Advancing Annotation of Stance in Social Media Posts: A Comparative
  Analysis of Large Language Models and Crowd Sourcing
Advancing Annotation of Stance in Social Media Posts: A Comparative Analysis of Large Language Models and Crowd Sourcing
Mao Li
Frederick Conrad
45
1
0
11 Jun 2024
Crayon: Customized On-Device LLM via Instant Adapter Blending and
  Edge-Server Hybrid Inference
Crayon: Customized On-Device LLM via Instant Adapter Blending and Edge-Server Hybrid Inference
Jihwan Bang
Juntae Lee
Kyuhong Shim
Seunghan Yang
Simyung Chang
34
5
0
11 Jun 2024
Are Large Language Models Actually Good at Text Style Transfer?
Are Large Language Models Actually Good at Text Style Transfer?
Sourabrata Mukherjee
Atul Kr. Ojha
Ondrej Dusek
31
11
0
09 Jun 2024
SinkLoRA: Enhanced Efficiency and Chat Capabilities for Long-Context
  Large Language Models
SinkLoRA: Enhanced Efficiency and Chat Capabilities for Long-Context Large Language Models
Hengyu Zhang
RALM
47
2
0
09 Jun 2024
How Alignment and Jailbreak Work: Explain LLM Safety through
  Intermediate Hidden States
How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States
Zhenhong Zhou
Haiyang Yu
Xinghua Zhang
Rongwu Xu
Fei Huang
Yongbin Li
29
30
0
09 Jun 2024
Large Generative Graph Models
Large Generative Graph Models
Yu Wang
Ryan A. Rossi
Namyong Park
Huiyuan Chen
Nesreen K. Ahmed
Puja Trivedi
Franck Dernoncourt
Danai Koutra
Tyler Derr
AI4CE
39
3
0
07 Jun 2024
ComplexTempQA: A Large-Scale Dataset for Complex Temporal Question
  Answering
ComplexTempQA: A Large-Scale Dataset for Complex Temporal Question Answering
Raphael Gruber
Abdelrahman Abdallah
Michael Färber
Adam Jatowt
35
5
0
07 Jun 2024
Extroversion or Introversion? Controlling The Personality of Your Large
  Language Models
Extroversion or Introversion? Controlling The Personality of Your Large Language Models
Yanquan Chen
Zhen Wu
Junjie Guo
Shujian Huang
Xinyu Dai
28
0
0
07 Jun 2024
Time Sensitive Knowledge Editing through Efficient Finetuning
Time Sensitive Knowledge Editing through Efficient Finetuning
Xiou Ge
Ali Mousavi
Edouard Grave
Armand Joulin
Kun Qian
Benjamin Han
Mostafa Arefiyan
Yunyao Li
KELM
36
7
0
06 Jun 2024
Causal Estimation of Memorisation Profiles
Causal Estimation of Memorisation Profiles
Pietro Lesci
Clara Meister
Thomas Hofmann
Andreas Vlachos
Tiago Pimentel
51
5
0
06 Jun 2024
On The Persona-based Summarization of Domain-Specific Documents
On The Persona-based Summarization of Domain-Specific Documents
Ankan Mullick
Sombit Bose
Rounak Saha
Ayan Kumar Bhowmick
Pawan Goyal
Niloy Ganguly
Prasenjit Dey
Ravi Kokku
35
2
0
06 Jun 2024
M-QALM: A Benchmark to Assess Clinical Reading Comprehension and
  Knowledge Recall in Large Language Models via Question Answering
M-QALM: A Benchmark to Assess Clinical Reading Comprehension and Knowledge Recall in Large Language Models via Question Answering
Anand Subramanian
Viktor Schlegel
Abhinav Ramesh Kashyap
Thanh-Tung Nguyen
Vijay Prakash Dwivedi
Stefan Winkler
ELM
LM&MA
AI4MH
33
3
0
06 Jun 2024
Does your data spark joy? Performance gains from domain upsampling at
  the end of training
Does your data spark joy? Performance gains from domain upsampling at the end of training
Cody Blakeney
Mansheej Paul
Brett W. Larsen
Sean Owen
Jonathan Frankle
29
19
0
05 Jun 2024
PatentEval: Understanding Errors in Patent Generation
PatentEval: Understanding Errors in Patent Generation
You Zuo
Kim Gerdes
Eric Villemonte de la Clergerie
Benoît Sagot
29
1
0
05 Jun 2024
Zyda: A 1.3T Dataset for Open Language Modeling
Zyda: A 1.3T Dataset for Open Language Modeling
Yury Tokpanov
Beren Millidge
Paolo Glorioso
Jonathan Pilault
Adam Ibrahim
James Whittington
Quentin Anthony
45
2
0
04 Jun 2024
GOMAA-Geo: GOal Modality Agnostic Active Geo-localization
GOMAA-Geo: GOal Modality Agnostic Active Geo-localization
Anindya Sarkar
S. Sastry
Aleksis Pirinen
Chongjie Zhang
Nathan Jacobs
Yevgeniy Vorobeychik
49
5
0
04 Jun 2024
Towards Transparency: Exploring LLM Trainings Datasets through Visual
  Topic Modeling and Semantic Frame
Towards Transparency: Exploring LLM Trainings Datasets through Visual Topic Modeling and Semantic Frame
Charles de Dampierre
Andrei Mogoutov
Nicolas Baumard
50
1
0
03 Jun 2024
SPOT: Text Source Prediction from Originality Score Thresholding
SPOT: Text Source Prediction from Originality Score Thresholding
Edouard Yvinec
Gabriel Kasser
DeLMO
46
0
0
30 May 2024
X-VILA: Cross-Modality Alignment for Large Language Model
X-VILA: Cross-Modality Alignment for Large Language Model
Hanrong Ye
De-An Huang
Yao Lu
Zhiding Yu
Ming-Yu Liu
...
Jan Kautz
Song Han
Dan Xu
Pavlo Molchanov
Hongxu Yin
MLLM
VLM
45
30
0
29 May 2024
MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model
  Series
MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series
Ge Zhang
Scott Qu
Jiaheng Liu
Chenchen Zhang
Chenghua Lin
...
Zi-Kai Zhao
Jiajun Zhang
Wanli Ouyang
Wenhao Huang
Wenhu Chen
ELM
43
44
0
29 May 2024
Understanding Intrinsic Socioeconomic Biases in Large Language Models
Understanding Intrinsic Socioeconomic Biases in Large Language Models
Mina Arzaghi
Florian Carichon
G. Farnadi
29
0
0
28 May 2024
Personalized Steering of Large Language Models: Versatile Steering
  Vectors Through Bi-directional Preference Optimization
Personalized Steering of Large Language Models: Versatile Steering Vectors Through Bi-directional Preference Optimization
Yuanpu Cao
Tianrong Zhang
Bochuan Cao
Ziyi Yin
Lu Lin
Fenglong Ma
Jinghui Chen
LLMSV
37
20
0
28 May 2024
Empowering Character-level Text Infilling by Eliminating Sub-Tokens
Empowering Character-level Text Infilling by Eliminating Sub-Tokens
Houxing Ren
Mingjie Zhan
Zhongyuan Wu
Hongsheng Li
AI4CE
32
1
0
27 May 2024
Laboratory-Scale AI: Open-Weight Models are Competitive with ChatGPT
  Even in Low-Resource Settings
Laboratory-Scale AI: Open-Weight Models are Competitive with ChatGPT Even in Low-Resource Settings
Robert Wolfe
Isaac Slaughter
Bin Han
Bingbing Wen
Yiwei Yang
...
Bernease Herman
E. Brown
Zening Qu
Nicholas Weber
Bill Howe
46
4
0
27 May 2024
Zamba: A Compact 7B SSM Hybrid Model
Zamba: A Compact 7B SSM Hybrid Model
Paolo Glorioso
Quentin G. Anthony
Yury Tokpanov
James Whittington
Jonathan Pilault
Adam Ibrahim
Beren Millidge
30
45
0
26 May 2024
gzip Predicts Data-dependent Scaling Laws
gzip Predicts Data-dependent Scaling Laws
Rohan Pandey
32
10
0
26 May 2024
A Survey of Multimodal Large Language Model from A Data-centric
  Perspective
A Survey of Multimodal Large Language Model from A Data-centric Perspective
Tianyi Bai
Hao Liang
Binwang Wan
Yanran Xu
Xi Li
...
Ping-Chia Huang
Jiulong Shan
Conghui He
Binhang Yuan
Wentao Zhang
58
36
0
26 May 2024
Multi-Reference Preference Optimization for Large Language Models
Multi-Reference Preference Optimization for Large Language Models
Hung Le
Quan Tran
D. Nguyen
Kien Do
Saloni Mittal
Kelechi Ogueji
Svetha Venkatesh
65
0
0
26 May 2024
Activator: GLU Activation Function as the Core Component of a Vision
  Transformer
Activator: GLU Activation Function as the Core Component of a Vision Transformer
Abdullah Nazhat Abdullah
Tarkan Aydin
ViT
43
0
0
24 May 2024
GECKO: Generative Language Model for English, Code and Korean
GECKO: Generative Language Model for English, Code and Korean
Sungwoo Oh
Donggyu Kim
VLM
35
0
0
24 May 2024
The Mosaic Memory of Large Language Models
The Mosaic Memory of Large Language Models
Igor Shilov
Matthieu Meeus
Yves-Alexandre de Montjoye
47
3
0
24 May 2024
360Zhinao Technical Report
360Zhinao Technical Report
360Zhinao Team
40
0
0
22 May 2024
AdpQ: A Zero-shot Calibration Free Adaptive Post Training Quantization
  Method for LLMs
AdpQ: A Zero-shot Calibration Free Adaptive Post Training Quantization Method for LLMs
Alireza Ghaffari
Sharareh Younesian
Vahid Partovi Nia
Boxing Chen
M. Asgharian
MQ
55
0
0
22 May 2024
Identifying and Aligning Medical Claims Made on Social Media with
  Medical Evidence
Identifying and Aligning Medical Claims Made on Social Media with Medical Evidence
Anthony James Hughes
Xingyi Song
23
1
0
18 May 2024
Automated Radiology Report Generation: A Review of Recent Advances
Automated Radiology Report Generation: A Review of Recent Advances
Phillip Sloan
Philip Clatworthy
Edwin Simpson
Majid Mirmehdi
32
17
0
17 May 2024
Dynamic data sampler for cross-language transfer learning in large
  language models
Dynamic data sampler for cross-language transfer learning in large language models
Yudong Li
Yuhao Feng
Wen Zhou
Zhe Zhao
Linlin Shen
Cheng-An Hou
Xianxu Hou
46
0
0
17 May 2024
Simultaneous Masking, Not Prompting Optimization: A Paradigm Shift in
  Fine-tuning LLMs for Simultaneous Translation
Simultaneous Masking, Not Prompting Optimization: A Paradigm Shift in Fine-tuning LLMs for Simultaneous Translation
Matthew Raffel
Victor Agostinelli
Lizhong Chen
41
5
0
16 May 2024
Tell Me Why: Explainable Public Health Fact-Checking with Large Language
  Models
Tell Me Why: Explainable Public Health Fact-Checking with Large Language Models
Majid Zarharan
Pascal Wullschleger
Babak Behkam Kia
Mohammad Taher Pilehvar
Jennifer Foster
LRM
ELM
LM&MA
33
3
0
15 May 2024
RAID: A Shared Benchmark for Robust Evaluation of Machine-Generated Text
  Detectors
RAID: A Shared Benchmark for Robust Evaluation of Machine-Generated Text Detectors
Liam Dugan
Alyssa Hwang
Filip Trhlik
Josh Magnus Ludan
Andrew Zhu
Hainiu Xu
Daphne Ippolito
Christopher Callison-Burch
DeLMO
AAML
35
44
0
13 May 2024
Previous
12345...101112
Next