Pretrained transformer efficiently learns low-dimensional target functions in-context

4 November 2024
Kazusato Oko, Yujin Song, Taiji Suzuki, Denny Wu
arXiv:2411.02544 · abs · PDF · HTML

Papers citing "Pretrained transformer efficiently learns low-dimensional target functions in-context"

21 papers shown
Repetita Iuvant: Data Repetition Allows SGD to Learn High-Dimensional Multi-Index Functions
Luca Arnaboldi, Yatin Dandi, Florent Krzakala, Luca Pesce, Ludovic Stephan
24 May 2024

Asymptotic theory of in-context learning by linear attention
Yue M. Lu, Mary I. Letey, Jacob A. Zavatone-Veth, Anindita Maiti, Cengiz Pehlevan
20 May 2024

How do Transformers perform In-Context Autoregressive Learning?
Michael E. Sander, Raja Giryes, Taiji Suzuki, Mathieu Blondel, Gabriel Peyré
08 Feb 2024

An Information-Theoretic Analysis of In-Context Learning
Hong Jun Jeon, Jason D. Lee, Qi Lei, Benjamin Van Roy
28 Jan 2024

Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks
Liam Collins, Hamed Hassani, Mahdi Soltanolkotabi, Aryan Mokhtari, Sanjay Shakkottai
13 Jul 2023

Beyond NTK with Vanilla Gradient Descent: A Mean-Field Analysis of Neural Networks with Polynomial Width, Samples, and Time
Arvind V. Mahankali, Jeff Z. HaoChen, Kefan Dong, Margalit Glasgow, Tengyu Ma
28 Jun 2023 · MLT

Transformers as Statisticians: Provable In-Context Learning with In-Context Algorithm Selection
Yu Bai, Fan Chen, Haiquan Wang, Caiming Xiong, Song Mei
07 Jun 2023

How Two-Layer Neural Networks Learn, One (Giant) Step at a Time
Yatin Dandi, Florent Krzakala, Bruno Loureiro, Luca Pesce, Ludovic Stephan
29 May 2023 · MLT

Learning time-scales in two-layers neural networks
Raphael Berthier, Andrea Montanari, Kangjie Zhou
28 Feb 2023

Transformers learn in-context by gradient descent
J. Oswald, Eyvind Niklasson, E. Randazzo, João Sacramento, A. Mordvintsev, A. Zhmoginov, Max Vladymyrov
15 Dec 2022 · MLT

Learning Single-Index Models with Shallow Neural Networks
A. Bietti, Joan Bruna, Clayton Sanford, M. Song
27 Oct 2022

Neural Networks Efficiently Learn Low-Dimensional Representations with SGD
Alireza Mousavi-Hosseini, Sejun Park, M. Girotti, Ioannis Mitliagkas, Murat A. Erdogdu
29 Sep 2022 · MLT

What Can Transformers Learn In-Context? A Case Study of Simple Function Classes
Shivam Garg, Dimitris Tsipras, Percy Liang, Gregory Valiant
01 Aug 2022

Hidden Progress in Deep Learning: SGD Learns Parities Near the Computational Limit
Boaz Barak, Benjamin L. Edelman, Surbhi Goel, Sham Kakade, Eran Malach, Cyril Zhang
18 Jul 2022

High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation
Jimmy Ba, Murat A. Erdogdu, Taiji Suzuki, Zhichao Wang, Denny Wu, Greg Yang
03 May 2022 · MLT

How rotational invariance of common kernels prevents generalization in high dimensions
Konstantin Donhauser, Mingqi Wu, Fanny Yang
09 Apr 2021

Language Models are Few-Shot Learners
Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, ..., Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei
28 May 2020 · BDL

Risks from Learned Optimization in Advanced Machine Learning Systems
Evan Hubinger, Chris van Merwijk, Vladimir Mikulik, Joar Skalse, Scott Garrabrant
05 Jun 2019

Linearized two-layers neural networks in high dimension
Behrooz Ghorbani, Song Mei, Theodor Misiakiewicz, Andrea Montanari
27 Apr 2019 · MLT

Fundamental Limits of Weak Recovery with Applications to Phase Retrieval
Marco Mondelli, Andrea Montanari
20 Aug 2017

Optimal Errors and Phase Transitions in High-Dimensional Generalized Linear Models
Jean Barbier, Florent Krzakala, N. Macris, Léo Miolane, Lenka Zdeborová
10 Aug 2017