Understanding polysemanticity in neural networks through coding theory
Simon C. Marshall, Jan H. Kirchner
31 January 2024 · arXiv:2401.17975
Topics: FAtt, MILM, AAML

Papers citing "Understanding polysemanticity in neural networks through coding theory"

11 papers shown

Toy Models of Superposition
Nelson Elhage, Tristan Hume, Catherine Olsson, Nicholas Schiefer, T. Henighan, ..., Sam McCandlish, Jared Kaplan, Dario Amodei, Martin Wattenberg, C. Olah
Topics: AAML, MILM · 21 Sep 2022

Towards Benchmarking Explainable Artificial Intelligence Methods
Lars Holmberg
25 Aug 2022

Toward Transparent AI: A Survey on Interpreting the Inner Structures of Deep Neural Networks
Tilman Räuker, A. Ho, Stephen Casper, Dylan Hadfield-Menell
Topics: AAML, AI4CE · 27 Jul 2022

Planting Undetectable Backdoors in Machine Learning Models
S. Goldwasser, Michael P. Kim, Vinod Vaikuntanathan, Or Zamir
Topics: AAML · 14 Apr 2022

Adversarial Robustness on In- and Out-Distribution Improves Explainability
Maximilian Augustin, Alexander Meinke, Matthias Hein
Topics: OOD · 20 Mar 2020

Net2Vec: Quantifying and Explaining how Concepts are Encoded by Filters in Deep Neural Networks
Ruth C. Fong, Andrea Vedaldi
Topics: FAtt · 10 Jan 2018

Improving the Adversarial Robustness and Interpretability of Deep Neural Networks by Regularizing their Input Gradients
A. Ross, Finale Doshi-Velez
Topics: AAML · 26 Nov 2017

Overcoming catastrophic forgetting in neural networks
J. Kirkpatrick, Razvan Pascanu, Neil C. Rabinowitz, J. Veness, Guillaume Desjardins, ..., A. Grabska-Barwinska, Demis Hassabis, Claudia Clopath, D. Kumaran, R. Hadsell
Topics: CLL · 02 Dec 2016

Learning without Forgetting
Zhizhong Li, Derek Hoiem
Topics: CLL, OOD, SSL · 29 Jun 2016

Deep Learning and the Information Bottleneck Principle
Naftali Tishby, Noga Zaslavsky
Topics: DRL · 09 Mar 2015

Representation Learning: A Review and New Perspectives
Yoshua Bengio, Aaron Courville, Pascal Vincent
Topics: OOD, SSL · 24 Jun 2012