2010.03648
Cited By
A Mathematical Exploration of Why Language Models Help Solve Downstream Tasks
7 October 2020
Nikunj Saunshi
Sadhika Malladi
Sanjeev Arora
Papers citing
"A Mathematical Exploration of Why Language Models Help Solve Downstream Tasks"
50 / 66 papers shown
Title
AutoMathKG: The automated mathematical knowledge graph based on LLM and vector database
Rong Bian
Yu Geng
Zijian Yang
Bing Cheng
17
0
0
19 May 2025
On Next-Token Prediction in LLMs: How End Goals Determine the Consistency of Decoding Algorithms
Jacob Trauger
Ambuj Tewari
22
0
0
16 May 2025
Implicit Geometry of Next-token Prediction: From Language Sparsity Patterns to Model Representations
Yize Zhao
Tina Behnia
V. Vakilian
Christos Thrampoulidis
70
9
0
20 Feb 2025
Do we really have to filter out random noise in pre-training data for language models?
Jinghan Ru
Yuxin Xie
Xianwei Zhuang
Yuguo Yin
Zhihui Guo
Zhiming Liu
Qianli Ren
Yuexian Zou
83
4
0
10 Feb 2025
Semantic Captioning: Benchmark Dataset and Graph-Aware Few-Shot In-Context Learning for SQL2Text
Ali Al-Lawati
Jason Lucas
Prasenjit Mitra
LMTD
48
0
0
06 Jan 2025
"My life is miserable, have to sign 500 autographs everyday": Exposing Humblebragging, the Brags in Disguise
Sharath Naganna
Saprativa Bhattacharjee
Pushpak Bhattacharyya
Biplab Banerjee
36
0
0
31 Dec 2024
Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization
Noam Razin
Sadhika Malladi
Adithya Bhaskar
Danqi Chen
Sanjeev Arora
Boris Hanin
99
16
0
11 Oct 2024
Investigating the Impact of Model Complexity in Large Language Models
Jing Luo
Huiyuan Wang
Weiran Huang
41
0
0
01 Oct 2024
On the Inductive Bias of Stacking Towards Improving Reasoning
Nikunj Saunshi
Stefani Karp
Shankar Krishnan
Sobhan Miryoosefi
Sashank J. Reddi
Sanjiv Kumar
LRM
AI4CE
44
4
0
27 Sep 2024
BERTs are Generative In-Context Learners
David Samuel
48
6
0
07 Jun 2024
LOLA: LLM-Assisted Online Learning Algorithm for Content Experiments
Zikun Ye
Hema Yoganarasimhan
Yufeng Zheng
52
2
0
03 Jun 2024
Low-Rank Approximation of Structural Redundancy for Self-Supervised Learning
Kang Du
Yu Xiang
27
0
0
10 Feb 2024
Cheap Learning: Maximising Performance of Language Models for Social Data Science Using Minimal Data
Leonardo Castro-Gonzalez
Yi-Ling Chung
Hannah Rose Kirk
John Francis
Angus R. Williams
Pica Johansson
Jonathan Bright
50
1
0
22 Jan 2024
Probing Biological and Artificial Neural Networks with Task-dependent Neural Manifolds
Michael Kuoch
Chi-Ning Chou
Nikhil Parthasarathy
Joel Dapello
J. DiCarlo
H. Sompolinsky
SueYeon Chung
19
1
0
21 Dec 2023
Latent Skill Discovery for Chain-of-Thought Reasoning
Zifan Xu
Haozhu Wang
Dmitriy Bespalov
Peter Stone
Yanjun Qi
ReLM
LRM
59
2
0
07 Dec 2023
Data Similarity is Not Enough to Explain Language Model Performance
Gregory Yauney
Emily Reif
David M. Mimno
50
6
0
15 Nov 2023
Understanding the Role of Input Token Characters in Language Models: How Does Information Loss Affect Performance?
Ahmed Alajrami
Katerina Margatina
Nikolaos Aletras
AAML
19
1
0
26 Oct 2023
Reason for Future, Act for Now: A Principled Framework for Autonomous LLM Agents with Provable Sample Efficiency
Zhihan Liu
Hao Hu
Shenao Zhang
Hongyi Guo
Shuqi Ke
Boyi Liu
Zhaoran Wang
LLMAG
LRM
36
34
0
29 Sep 2023
Regularization and Optimal Multiclass Learning
Julian Asilis
Siddartha Devic
S. Dughmi
Vatsal Sharan
S. Teng
32
7
0
24 Sep 2023
Representation Learning Dynamics of Self-Supervised Models
P. Esser
Satyaki Mukherjee
D. Ghoshdastidar
SSL
34
3
0
05 Sep 2023
GradientCoin: A Peer-to-Peer Decentralized Large Language Models
Yeqi Gao
Zhao Song
Junze Yin
41
18
0
21 Aug 2023
Reasoning in Large Language Models Through Symbolic Math Word Problems
Vedant Gaur
Nikunj Saunshi
ReLM
LRM
35
26
0
03 Aug 2023
A Theory for Emergence of Complex Skills in Language Models
Sanjeev Arora
Anirudh Goyal
LRM
29
73
0
29 Jul 2023
Trainable Transformer in Transformer
A. Panigrahi
Sadhika Malladi
Mengzhou Xia
Sanjeev Arora
VLM
32
13
0
03 Jul 2023
In-Context Learning through the Bayesian Prism
Madhuri Panwar
Kabir Ahuja
Navin Goyal
BDL
42
40
0
08 Jun 2023
Early Weight Averaging meets High Learning Rates for LLM Pre-training
Sunny Sanyal
A. Neerkaje
Jean Kaddour
Abhishek Kumar
Sujay Sanghavi
MoMe
41
17
0
05 Jun 2023
Fine-Tuning Language Models with Just Forward Passes
Sadhika Malladi
Tianyu Gao
Eshaan Nichani
Alexandru Damian
Jason D. Lee
Danqi Chen
Sanjeev Arora
43
180
0
27 May 2023
Measuring Inductive Biases of In-Context Learning with Underspecified Demonstrations
Chenglei Si
Dan Friedman
Nitish Joshi
Shi Feng
Danqi Chen
He He
13
42
0
22 May 2023
Fundamental Limitations of Alignment in Large Language Models
Yotam Wolf
Noam Wies
Oshri Avnery
Yoav Levine
Amnon Shashua
ALM
19
140
0
19 Apr 2023
The Learnability of In-Context Learning
Noam Wies
Yoav Levine
Amnon Shashua
122
97
0
14 Mar 2023
An Overview on Language Models: Recent Developments and Outlook
Chengwei Wei
Yun Cheng Wang
Bin Wang
C.-C. Jay Kuo
35
42
0
10 Mar 2023
On the Provable Advantage of Unsupervised Pretraining
Jiawei Ge
Shange Tang
Jianqing Fan
Chi Jin
SSL
33
16
0
02 Mar 2023
Task-Specific Skill Localization in Fine-tuned Language Models
A. Panigrahi
Nikunj Saunshi
Haoyu Zhao
Sanjeev Arora
MoMe
34
69
0
13 Feb 2023
Large Language Models Are Latent Variable Models: Explaining and Finding Good Demonstrations for In-Context Learning
Xinyi Wang
Wanrong Zhu
Michael Stephen Saxon
Mark Steyvers
William Yang Wang
BDL
56
93
0
27 Jan 2023
Training Trajectories of Language Models Across Scales
Mengzhou Xia
Mikel Artetxe
Chunting Zhou
Xi Lin
Ramakanth Pasunuru
Danqi Chen
Luke Zettlemoyer
Ves Stoyanov
AIFin
LRM
39
54
0
19 Dec 2022
Pivotal Role of Language Modeling in Recommender Systems: Enriching Task-specific and Task-agnostic Representation Learning
Kyuyong Shin
Hanock Kwak
Wonjae Kim
Jisu Jeong
Seungjae Jung
KyungHyun Kim
Jung-Woo Ha
Sang-Woo Lee
27
4
0
07 Dec 2022
A Theoretical Study of Inductive Biases in Contrastive Learning
Jeff Z. HaoChen
Tengyu Ma
UQCV
SSL
36
31
0
27 Nov 2022
Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models
Hong Liu
Sang Michael Xie
Zhiyuan Li
Tengyu Ma
AI4CE
40
49
0
25 Oct 2022
A Kernel-Based View of Language Model Fine-Tuning
Sadhika Malladi
Alexander Wettig
Dingli Yu
Danqi Chen
Sanjeev Arora
VLM
78
62
0
11 Oct 2022
Antibody Representation Learning for Drug Discovery
Lin Li
Esther Gupta
J. Spaeth
Leslie Shing
Tristan Bepler
R. Caceres
32
6
0
05 Oct 2022
GROWN+UP: A Graph Representation Of a Webpage Network Utilizing Pre-training
Benedict Yeoh
Huijuan Wang
GNN
31
1
0
03 Aug 2022
Improving self-supervised pretraining models for epileptic seizure detection from EEG data
Sudip Das
Pankaja Pandey
Krishna P. Miyapuram
MedIm
21
4
0
28 Jun 2022
Emergent Abilities of Large Language Models
Jason W. Wei
Yi Tay
Rishi Bommasani
Colin Raffel
Barret Zoph
...
Tatsunori Hashimoto
Oriol Vinyals
Percy Liang
J. Dean
W. Fedus
ELM
ReLM
LRM
90
2,364
0
15 Jun 2022
AANG: Automating Auxiliary Learning
Lucio Dery
Paul Michel
M. Khodak
Graham Neubig
Ameet Talwalkar
41
9
0
27 May 2022
The Mechanism of Prediction Head in Non-contrastive Self-supervised Learning
Zixin Wen
Yuanzhi Li
SSL
34
34
0
12 May 2022
Empirical Evaluation and Theoretical Analysis for Representation Learning: A Survey
Kento Nozawa
Issei Sato
AI4TS
26
4
0
18 Apr 2022
Can Unsupervised Knowledge Transfer from Social Discussions Help Argument Mining?
Subhabrata Dutta
Jeevesh Juneja
Dipankar Das
Tanmoy Chakraborty
25
16
0
24 Mar 2022
Understanding Contrastive Learning Requires Incorporating Inductive Biases
Nikunj Saunshi
Jordan T. Ash
Surbhi Goel
Dipendra Kumar Misra
Cyril Zhang
Sanjeev Arora
Sham Kakade
A. Krishnamurthy
SSL
27
109
0
28 Feb 2022
Masked prediction tasks: a parameter identifiability view
Bingbin Liu
Daniel J. Hsu
Pradeep Ravikumar
Andrej Risteski
SSL
OOD
21
4
0
18 Feb 2022
Understanding The Robustness of Self-supervised Learning Through Topic Modeling
Zeping Luo
Shiyou Wu
C. Weng
Mo Zhou
Rong Ge
OOD
SSL
14
3
0
02 Feb 2022