Dynamic Loss-Based Sample Reweighting for Improved Large Language Model Pretraining
arXiv:2502.06733 · 10 February 2025
Daouda Sow, Herbert Woisetschläger, Saikiran Bulusu, Shiqiang Wang, Hans-Arno Jacobsen, Yingbin Liang

Papers citing "Dynamic Loss-Based Sample Reweighting for Improved Large Language Model Pretraining"

13 papers shown.

| Title | Authors | Tags | Citations | Date |
| --- | --- | --- | --- | --- |
| Energy-based Preference Optimization for Test-time Adaptation | Yewon Han, Seoyun Yang, Taesup Kim | TTA | 0 | 26 May 2025 |
| DreamPRM: Domain-Reweighted Process Reward Model for Multimodal Reasoning | Qi Cao, Ruiyi Wang, Ruiyi Zhang, Sai Ashish Somayajula, P. Xie | LRM | 0 | 26 May 2025 |
| ESLM: Risk-Averse Selective Language Modeling for Efficient Pretraining | Melis Ilayda Bal, Volkan Cevher, Michael Muehlebach | | 0 | 26 May 2025 |
| DataRater: Meta-Learned Dataset Curation | Dan A. Calian, Gregory Farquhar, Iurii Kemaev, Luisa M. Zintgraf, Matteo Hessel, …, András Gyorgy, Tom Schaul, Jeffrey Dean, Hado van Hasselt, David Silver | | 0 | 23 May 2025 |
| Take the Bull by the Horns: Hard Sample-Reweighted Continual Training Improves LLM Generalization | Xuxi Chen, Zhendong Wang, Daouda Sow, Junjie Yang, Tianlong Chen, Yingbin Liang, Mingyuan Zhou, Zhangyang Wang | | 6 | 22 Feb 2024 |
| GPT-4 Technical Report | OpenAI: Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, …, Shengjia Zhao, Tianhao Zheng, Juntang Zhuang, William Zhuk, Barret Zoph | LLMAG, MLLM | 13,788 | 15 Mar 2023 |
| The Pile: An 800GB Dataset of Diverse Text for Language Modeling | Leo Gao, Stella Biderman, Sid Black, Laurence Golding, Travis Hoppe, …, Horace He, Anish Thite, Noa Nabeshima, Shawn Presser, Connor Leahy | AIMat | 2,051 | 31 Dec 2020 |
| Rethinking Importance Weighting for Deep Learning under Distribution Shift | Tongtong Fang, Nan Lu, Gang Niu, Masashi Sugiyama | | 139 | 08 Jun 2020 |
| Language Models are Few-Shot Learners | Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, …, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei | BDL | 41,106 | 28 May 2020 |
| Accelerating Deep Learning by Focusing on the Biggest Losers | Angela H. Jiang, Daniel L.-K. Wong, Giulio Zhou, D. Andersen, J. Dean, …, Gauri Joshi, M. Kaminsky, M. Kozuch, Zachary Chase Lipton, Padmanabhan Pillai | | 120 | 02 Oct 2019 |
| Learning to Reweight Examples for Robust Deep Learning | Mengye Ren, Wenyuan Zeng, Binh Yang, R. Urtasun | OOD, NoLa | 1,419 | 24 Mar 2018 |
| Attention Is All You Need | Ashish Vaswani, Noam M. Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Gomez, Lukasz Kaiser, Illia Polosukhin | 3DV | 129,831 | 12 Jun 2017 |
| Online Batch Selection for Faster Training of Neural Networks | I. Loshchilov, Frank Hutter | ODL | 299 | 19 Nov 2015 |