Pretrained transformer efficiently learns low-dimensional target functions in-context
Kazusato Oko, Yujin Song, Taiji Suzuki, Denny Wu
arXiv:2411.02544 · 4 November 2024
Links: ArXiv (abs) · PDF · HTML
Papers citing "Pretrained transformer efficiently learns low-dimensional target functions in-context" (21 of 21 papers shown)
 1. Repetita Iuvant: Data Repetition Allows SGD to Learn High-Dimensional Multi-Index Functions
    Luca Arnaboldi, Yatin Dandi, Florent Krzakala, Luca Pesce, Ludovic Stephan · 24 May 2024

 2. Asymptotic theory of in-context learning by linear attention
    Yue M. Lu, Mary I. Letey, Jacob A. Zavatone-Veth, Anindita Maiti, Cengiz Pehlevan · 20 May 2024

 3. How do Transformers perform In-Context Autoregressive Learning?
    Michael E. Sander, Raja Giryes, Taiji Suzuki, Mathieu Blondel, Gabriel Peyré · 08 Feb 2024

 4. An Information-Theoretic Analysis of In-Context Learning
    Hong Jun Jeon, Jason D. Lee, Qi Lei, Benjamin Van Roy · 28 Jan 2024

 5. Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks
    Liam Collins, Hamed Hassani, Mahdi Soltanolkotabi, Aryan Mokhtari, Sanjay Shakkottai · 13 Jul 2023

 6. Beyond NTK with Vanilla Gradient Descent: A Mean-Field Analysis of Neural Networks with Polynomial Width, Samples, and Time
    Arvind V. Mahankali, Jeff Z. HaoChen, Kefan Dong, Margalit Glasgow, Tengyu Ma · 28 Jun 2023 · MLT

 7. Transformers as Statisticians: Provable In-Context Learning with In-Context Algorithm Selection
    Yu Bai, Fan Chen, Haiquan Wang, Caiming Xiong, Song Mei · 07 Jun 2023

 8. How Two-Layer Neural Networks Learn, One (Giant) Step at a Time
    Yatin Dandi, Florent Krzakala, Bruno Loureiro, Luca Pesce, Ludovic Stephan · 29 May 2023 · MLT

 9. Learning time-scales in two-layers neural networks
    Raphael Berthier, Andrea Montanari, Kangjie Zhou · 28 Feb 2023

10. Transformers learn in-context by gradient descent
    J. Oswald, Eyvind Niklasson, E. Randazzo, João Sacramento, A. Mordvintsev, A. Zhmoginov, Max Vladymyrov · 15 Dec 2022 · MLT

11. Learning Single-Index Models with Shallow Neural Networks
    A. Bietti, Joan Bruna, Clayton Sanford, M. Song · 27 Oct 2022

12. Neural Networks Efficiently Learn Low-Dimensional Representations with SGD
    Alireza Mousavi-Hosseini, Sejun Park, M. Girotti, Ioannis Mitliagkas, Murat A. Erdogdu · 29 Sep 2022 · MLT

13. What Can Transformers Learn In-Context? A Case Study of Simple Function Classes
    Shivam Garg, Dimitris Tsipras, Percy Liang, Gregory Valiant · 01 Aug 2022

14. Hidden Progress in Deep Learning: SGD Learns Parities Near the Computational Limit
    Boaz Barak, Benjamin L. Edelman, Surbhi Goel, Sham Kakade, Eran Malach, Cyril Zhang · 18 Jul 2022

15. High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation
    Jimmy Ba, Murat A. Erdogdu, Taiji Suzuki, Zhichao Wang, Denny Wu, Greg Yang · 03 May 2022 · MLT

16. How rotational invariance of common kernels prevents generalization in high dimensions
    Konstantin Donhauser, Mingqi Wu, Fanny Yang · 09 Apr 2021

17. Language Models are Few-Shot Learners
    Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, ..., Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei · 28 May 2020 · BDL

18. Risks from Learned Optimization in Advanced Machine Learning Systems
    Evan Hubinger, Chris van Merwijk, Vladimir Mikulik, Joar Skalse, Scott Garrabrant · 05 Jun 2019

19. Linearized two-layers neural networks in high dimension
    Behrooz Ghorbani, Song Mei, Theodor Misiakiewicz, Andrea Montanari · 27 Apr 2019 · MLT

20. Fundamental Limits of Weak Recovery with Applications to Phase Retrieval
    Marco Mondelli, Andrea Montanari · 20 Aug 2017

21. Optimal Errors and Phase Transitions in High-Dimensional Generalized Linear Models
    Jean Barbier, Florent Krzakala, N. Macris, Léo Miolane, Lenka Zdeborová · 10 Aug 2017