Simplicity Bias in Transformers and their Ability to Learn Sparse Boolean Functions

22 November 2022
S. Bhattamishra
Arkil Patel
Varun Kanade
Phil Blunsom

Papers citing "Simplicity Bias in Transformers and their Ability to Learn Sparse Boolean Functions"

36 / 36 papers shown
Transformers Can Overcome the Curse of Dimensionality: A Theoretical Study from an Approximation Perspective
Yuling Jiao
Yanming Lai
Yang Wang
Bokai Yan
39
0
0
18 Apr 2025
Language Models, Graph Searching, and Supervision Adulteration: When More Supervision is Less and How to Make More More
Arvid Frydenlund
LRM
48
0
0
13 Mar 2025
Do We Always Need the Simplicity Bias? Looking for Optimal Inductive Biases in the Wild
Damien Teney
Liangze Jiang
Florin Gogianu
Ehsan Abbasnejad
169
0
0
13 Mar 2025
A distributional simplicity bias in the learning dynamics of transformers
Riccardo Rende
Federica Gerace
A. Laio
Sebastian Goldt
79
8
0
17 Feb 2025
Lower Bounds for Chain-of-Thought Reasoning in Hard-Attention Transformers
Alireza Amiri
Xinting Huang
Mark Rofin
Michael Hahn
LRM
180
0
0
04 Feb 2025
Exploring Grokking: Experimental and Mechanistic Investigations
Hu Qiye
Zhou Hao
Yu RuoXi
79
1
0
14 Dec 2024
Training Neural Networks as Recognizers of Formal Languages
Alexandra Butoi
Ghazal Khalighinejad
Anej Svete
Josef Valvoda
Ryan Cotterell
Brian DuSell
NAI
44
2
0
11 Nov 2024
The Mystery of the Pathological Path-star Task for Language Models
Arvid Frydenlund
LRM
27
4
0
17 Oct 2024
Why Do You Grok? A Theoretical Analysis of Grokking Modular Addition
Mohamad Amin Mohamadi
Zhiyuan Li
Lei Wu
Danica J. Sutherland
48
9
0
17 Jul 2024
Exploiting the equivalence between quantum neural networks and perceptrons
Chris Mingard
Jessica Pointing
Charles London
Yoonsoo Nam
Ard A. Louis
35
2
0
05 Jul 2024
Early learning of the optimal constant solution in neural networks and humans
Jirko Rubruck
Jan P. Bauer
Andrew M. Saxe
Christopher Summerfield
33
1
0
25 Jun 2024
Language Models Need Inductive Biases to Count Inductively
Yingshan Chang
Yonatan Bisk
LRM
32
5
0
30 May 2024
IM-Context: In-Context Learning for Imbalanced Regression Tasks
Ismail Nejjar
Faez Ahmed
Olga Fink
35
1
0
28 May 2024
A rationale from frequency perspective for grokking in training neural network
Zhangchen Zhou
Yaoyu Zhang
Z. Xu
40
2
0
24 May 2024
Learning Syntax Without Planting Trees: Understanding Hierarchical Generalization in Transformers
Kabir Ahuja
Vidhisha Balachandran
Madhur Panwar
Tianxing He
Noah A. Smith
Navin Goyal
Yulia Tsvetkov
41
8
0
25 Apr 2024
TEL'M: Test and Evaluation of Language Models
G. Cybenko
Joshua Ackerman
Paul Lintilhac
ALM
ELM
40
0
0
16 Apr 2024
IsoBench: Benchmarking Multimodal Foundation Models on Isomorphic Representations
Deqing Fu
Ghazal Khalighinejad
Ollie Liu
Bhuwan Dhingra
Dani Yogatama
Robin Jia
W. Neiswanger
33
14
0
01 Apr 2024
Neural Redshift: Random Networks are not Random Functions
Damien Teney
A. Nicolicioiu
Valentin Hartmann
Ehsan Abbasnejad
103
18
0
04 Mar 2024
Out-of-Domain Generalization in Dynamical Systems Reconstruction
Niclas Alexander Göring
Florian Hess
Manuel Brenner
Zahra Monfared
Daniel Durstewitz
AI4CE
35
10
0
28 Feb 2024
Why are Sensitive Functions Hard for Transformers?
Michael Hahn
Mark Rofin
41
25
0
15 Feb 2024
Towards Understanding Inductive Bias in Transformers: A View From Infinity
Itay Lavie
Guy Gur-Ari
Z. Ringel
34
1
0
07 Feb 2024
Investigating Recurrent Transformers with Dynamic Halt
Jishnu Ray Chowdhury
Cornelia Caragea
39
1
0
01 Feb 2024
Simplicity bias, algorithmic probability, and the random logistic map
B. Hamzi
K. Dingle
23
3
0
31 Dec 2023
Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking
Kaifeng Lyu
Jikai Jin
Zhiyuan Li
Simon S. Du
Jason D. Lee
Wei Hu
AI4CE
41
32
0
30 Nov 2023
Looped Transformers are Better at Learning Learning Algorithms
Liu Yang
Kangwook Lee
Robert D. Nowak
Dimitris Papailiopoulos
24
24
0
21 Nov 2023
How are Prompts Different in Terms of Sensitivity?
Sheng Lu
Hendrik Schuff
Iryna Gurevych
37
18
0
13 Nov 2023
What Formal Languages Can Transformers Express? A Survey
Lena Strobl
William Merrill
Gail Weiss
David Chiang
Dana Angluin
AI4CE
20
48
0
01 Nov 2023
What Algorithms can Transformers Learn? A Study in Length Generalization
Hattie Zhou
Arwen Bradley
Etai Littwin
Noam Razin
Omid Saremi
Josh Susskind
Samy Bengio
Preetum Nakkiran
34
110
0
24 Oct 2023
Understanding In-Context Learning in Transformers and LLMs by Learning to Learn Discrete Functions
S. Bhattamishra
Arkil Patel
Phil Blunsom
Varun Kanade
21
41
0
04 Oct 2023
In-Context Learning through the Bayesian Prism
Madhur Panwar
Kabir Ahuja
Navin Goyal
BDL
34
38
0
08 Jun 2023
Representational Strengths and Limitations of Transformers
Clayton Sanford
Daniel J. Hsu
Matus Telgarsky
22
81
0
05 Jun 2023
MLRegTest: A Benchmark for the Machine Learning of Regular Languages
Sam van der Poel
D. Lambert
Kalina Kostyszyn
Tiantian Gao
Rahul Verma
...
Emily Peterson
C. S. Clair
Paul Fodor
Chihiro Shibata
Jeffrey Heinz
ELM
17
8
0
16 Apr 2023
Do deep neural networks have an inbuilt Occam's razor?
Chris Mingard
Henry Rees
Guillermo Valle Pérez
A. Louis
UQCV
BDL
21
16
0
13 Apr 2023
Neural Networks and the Chomsky Hierarchy
Grégoire Delétang
Anian Ruoss
Jordi Grau-Moya
Tim Genewein
L. Wenliang
...
Chris Cundy
Marcus Hutter
Shane Legg
Joel Veness
Pedro A. Ortega
UQCV
107
130
0
05 Jul 2022
Sensitivity as a Complexity Measure for Sequence Classification Tasks
Michael Hahn
Dan Jurafsky
Richard Futrell
150
22
0
21 Apr 2021
Memorisation versus Generalisation in Pre-trained Language Models
Michael Tänzer
Sebastian Ruder
Marek Rei
94
50
0
16 Apr 2021