URLs Help, Topics Guide: Understanding Metadata Utility in LLM Training

22 May 2025
Dongyang Fan
Vinko Sabolčec
Martin Jaggi

Papers citing "URLs Help, Topics Guide: Understanding Metadata Utility in LLM Training"

4 / 4 papers shown
When Does Metadata Conditioning (NOT) Work for Language Model Pre-Training? A Study with Context-Free Grammars
Rei Higuchi
Ryotaro Kawata
Naoki Nishikawa
Kazusato Oko
Shoichiro Yamaguchi
Sosuke Kobayashi
Seiya Tokui
K. Hayashi
Daisuke Okanohara
Taiji Suzuki
AI4CE
83
1
0
24 Apr 2025
Organize the Web: Constructing Domains Enhances Pre-Training Data Curation
Alexander Wettig
Kyle Lo
Sewon Min
Hannaneh Hajishirzi
Danqi Chen
Luca Soldaini
152
16
0
14 Feb 2025
OLMES: A Standard for Language Model Evaluations
Yuling Gu
Oyvind Tafjord
Bailey Kuehl
Dany Haddad
Jesse Dodge
Hannaneh Hajishirzi
ELM
124
20
0
12 Jun 2024
Understanding Emergent Abilities of Language Models from the Loss Perspective
Zhengxiao Du
Aohan Zeng
Yuxiao Dong
Jie Tang
UQCV, LRM
151
56
0
23 Mar 2024