2506.10952
Domain2Vec: Vectorizing Datasets to Find the Optimal Data Mixture without Training
12 June 2025
Mozhi Zhang, Howe Tissue, Lu Wang, Xipeng Qiu
Papers citing "Domain2Vec: Vectorizing Datasets to Find the Optimal Data Mixture without Training"
Title
Learning Dynamics in Continual Pre-Training for Large Language Models
Xingjin Wang, Howe Tissue, Lu Wang, Linjing Li, D. Zeng
CLL
75
0
0
12 May 2025