What Neural Networks Memorize and Why: Discovering the Long Tail via Influence Estimation

9 August 2020

Papers citing "What Neural Networks Memorize and Why: Discovering the Long Tail via Influence Estimation"

50 / 369 papers shown

Can Neural Network Memorization Be Localized? Pratyush Maini Michael C. Mozer Hanie Sedghi Zachary Chase Lipton J. Zico Kolter Chiyuan Zhang TDI 55 52 0 18 Jul 2023
Memorization Through the Lens of Curvature of Loss Function Around Samples Isha Garg Deepak Ravikumar Kaushik Roy TDI 41 13 0 11 Jul 2023
Ethicist: Targeted Training Data Extraction Through Loss Smoothed Soft Prompting and Calibrated Confidence Estimation Zhexin Zhang Jiaxin Wen Minlie Huang 38 35 0 10 Jul 2023
T-MARS: Improving Visual Representations by Circumventing Text Feature Learning Pratyush Maini Sachin Goyal Zachary Chase Lipton J. Zico Kolter Aditi Raghunathan VLM 70 34 0 06 Jul 2023
Deconstructing Data Reconstruction: Multiclass, Weight Decay and General Losses G. Buzaglo Niv Haim Gilad Yehudai Gal Vardi Yakir Oz Yaniv Nikankin Michal Irani 67 12 0 04 Jul 2023
Tools for Verifying Neural Models' Training Data Dami Choi Yonadav Shavit David Duvenaud MIALM 60 16 0 02 Jul 2023
Gradients Look Alike: Sensitivity is Often Overestimated in DP-SGD Anvith Thudi Hengrui Jia Casey Meehan Ilia Shumailov Nicolas Papernot 82 6 0 01 Jul 2023
OpenDataVal: a Unified Benchmark for Data Valuation Kevin Jiang Weixin Liang James Zou Yongchan Kwon FedML 92 37 0 18 Jun 2023
Achilles' Heels: Vulnerable Record Identification in Synthetic Data Publishing Matthieu Meeus Florent Guépin Ana-Maria Cretu Yves-Alexandre de Montjoye 128 24 0 17 Jun 2023
Evaluating Data Attribution for Text-to-Image Models Sheng-Yu Wang Alexei A. Efros Jun-Yan Zhu Richard Y. Zhang TDI 69 33 0 15 Jun 2023
Understanding the Effect of the Long Tail on Neural Network Compression Harvey Dam Vinu Joseph Aditya Bhaskara G. Gopalakrishna Saurav Muralidharan M. Garland 67 2 0 09 Jun 2023
TMI! Finetuned Models Leak Private Information from their Pretraining Data John Abascal Stanley Wu Alina Oprea Jonathan R. Ullman 77 17 0 01 Jun 2023
Representer Point Selection for Explaining Regularized High-dimensional Models Che-Ping Tsai Jiong Zhang Eli Chien Hsiang-Fu Yu Cho-Jui Hsieh Pradeep Ravikumar 48 2 0 31 May 2023
Feature Collapse T. Laurent J. V. Brecht Xavier Bresson 55 3 0 25 May 2023
Training Data Extraction From Pre-trained Language Models: A Survey Shotaro Ishihara 100 48 0 25 May 2023
How Spurious Features Are Memorized: Precise Analysis for Random and NTK Features Simone Bombari Marco Mondelli AAML 63 5 0 20 May 2023
Meta-Optimization for Higher Model Generalizability in Single-Image Depth Prediction Cho-Ying Wu Yiqi Zhong Junying Wang Ulrich Neumann MDE 61 5 0 12 May 2023
An Evaluation on Large Language Model Outputs: Discourse and Memorization Adrian de Wynter Xun Wang Alex Sokolov Qilong Gu Si-Qing Chen ELM 115 34 0 17 Apr 2023
Data-OOB: Out-of-bag Estimate as a Simple and Efficient Data Value Yongchan Kwon James Zou TDI FedML 64 37 0 16 Apr 2023
Do We Train on Test Data? The Impact of Near-Duplicates on License Plate Recognition Rayson Laroca Valter Estevam A. Britto Rodrigo Minetto David Menotti 37 11 0 10 Apr 2023
Foundation Models and Fair Use Peter Henderson Xuechen Li Dan Jurafsky Tatsunori Hashimoto Mark A. Lemley Percy Liang 73 123 0 28 Mar 2023
TRAK: Attributing Model Behavior at Scale Sung Min Park Kristian Georgiev Andrew Ilyas Guillaume Leclerc Aleksander Madry TDI 89 154 0 24 Mar 2023
Fairness Improves Learning from Noisily Labeled Long-Tailed Data Jiaheng Wei Zhaowei Zhu Gang Niu Tongliang Liu Sijia Liu Masashi Sugiyama Yang Liu 44 6 0 22 Mar 2023
SemDeDup: Data-efficient learning at web-scale through semantic deduplication Amro Abbas Kushal Tirumala Daniel Simig Surya Ganguli Ari S. Morcos 69 175 0 16 Mar 2023
FAIR-Ensemble: When Fairness Naturally Emerges From Deep Ensembling Wei-Yin Ko Daniel D'souza Karina Nguyen Randall Balestriero Sara Hooker FedML 57 11 0 01 Mar 2023
Internet Explorer: Targeted Representation Learning on the Open Web Alexander C. Li Ellis L Brown Alexei A. Efros Deepak Pathak VLM 51 26 0 27 Feb 2023
Make Every Example Count: On the Stability and Utility of Self-Influence for Learning from Noisy NLP Datasets Irina Bejan Artem Sokolov Katja Filippova TDI 79 11 0 27 Feb 2023
Beyond Distribution Shift: Spurious Features Through the Lens of Training Dynamics Nihal Murali A. Puli Ke Yu Rajesh Ranganath Kayhan Batmanghelich AAML 75 10 0 18 Feb 2023
Pushing the Accuracy-Group Robustness Frontier with Introspective Self-play J. Liu Krishnamurthy Dvijotham Jihyeon Janel Lee Quan Yuan Martin Strobel Balaji Lakshminarayanan Deepak Ramachandran 47 5 0 11 Feb 2023
ResMem: Learn what you can and memorize the rest Zitong Yang Michal Lukasik Vaishnavh Nagarajan Zong-xiao Li A. S. Rawat Manzil Zaheer A. Menon Surinder Kumar VLM 73 8 0 03 Feb 2023
Understanding Reconstruction Attacks with the Neural Tangent Kernel and Dataset Distillation Noel Loo Ramin Hasani Mathias Lechner Alexander Amini Daniela Rus DD 76 6 0 02 Feb 2023
Pathologies of Predictive Diversity in Deep Ensembles Taiga Abe E. Kelly Buchanan Geoff Pleiss John P. Cunningham UQCV 117 14 0 01 Feb 2023
Learning Large-scale Neural Fields via Context Pruned Meta-Learning Jihoon Tack Subin Kim Sihyun Yu Jaeho Lee Jinwoo Shin Jonathan Richard Schwarz 65 9 0 01 Feb 2023
Recursive Neural Networks with Bottlenecks Diagnose (Non-)Compositionality Verna Dankers Ivan Titov 59 2 0 31 Jan 2023
Extracting Training Data from Diffusion Models Nicholas Carlini Jamie Hayes Milad Nasr Matthew Jagielski Vikash Sehwag Florian Tramèr Borja Balle Daphne Ippolito Eric Wallace DiffM 124 606 0 30 Jan 2023
Generalization on the Unseen, Logic Reasoning and Degree Curriculum Emmanuel Abbe Samy Bengio Aryo Lotfi Kevin Rizk LRM 85 54 0 30 Jan 2023
Data Valuation Without Training of a Model Nohyun Ki Hoyong Choi Hye Won Chung TDI 61 32 0 03 Jan 2023
Position: Considerations for Differentially Private Learning with Large-Scale Public Pretraining Florian Tramèr Gautam Kamath Nicholas Carlini SILM 73 73 0 13 Dec 2022
Training Data Influence Analysis and Estimation: A Survey Zayd Hammoudeh Daniel Lowd TDI 70 96 0 09 Dec 2022
Leveraging Unlabeled Data to Track Memorization Mahsa Forouzesh Hanie Sedghi Patrick Thiran NoLa TDI 66 4 0 08 Dec 2022
Diffusion Art or Digital Forgery? Investigating Data Replication in Diffusion Models Gowthami Somepalli Vasu Singla Micah Goldblum Jonas Geiping Tom Goldstein 73 324 0 07 Dec 2022
Text Embeddings by Weakly-Supervised Contrastive Pre-training Liang Wang Nan Yang Xiaolong Huang Binxing Jiao Linjun Yang Daxin Jiang Rangan Majumder Furu Wei VLM 239 601 0 07 Dec 2022
Neural Representations Reveal Distinct Modes of Class Fitting in Residual Convolutional Networks Michal Jamroż Marcin Kurdziel 34 0 0 01 Dec 2022
On Pitfalls of Measuring Occlusion Robustness through Data Distortion Antonia Marcu 46 0 0 24 Nov 2022
ModelDiff: A Framework for Comparing Learning Algorithms Harshay Shah Sung Min Park Andrew Ilyas Aleksander Madry SyDa 84 29 0 22 Nov 2022
What Images are More Memorable to Machines? Junlin Han Huangying Zhan Jie Hong Pengfei Fang Hongdong Li L. Petersson Ian Reid 60 3 0 14 Nov 2022
Reduce, Reuse, Recycle: Improving Training Efficiency with Distillation Cody Blakeney Jessica Zosa Forde Jonathan Frankle Ziliang Zong Matthew L. Leavitt VLM 71 5 0 01 Nov 2022
Characterizing Datapoints via Second-Split Forgetting Pratyush Maini Saurabh Garg Zachary Chase Lipton J. Zico Kolter 62 34 0 26 Oct 2022
The Curious Case of Benign Memorization Sotiris Anagnostidis Gregor Bachmann Lorenzo Noci Thomas Hofmann AAML 95 10 0 25 Oct 2022
Finding Memo: Extractive Memorization in Constrained Sequence Generation Tasks Vikas Raunak Arul Menezes 67 13 0 24 Oct 2022