Exposing the Implicit Energy Networks behind Masked Language Models via Metropolis-Hastings

4 June 2021

Papers citing "Exposing the Implicit Energy Networks behind Masked Language Models via Metropolis-Hastings"

A distributional simplicity bias in the learning dynamics of transformers Riccardo Rende Federica Gerace Alessandro Laio Sebastian Goldt 81 8 0 17 Feb 2025
Reconstructing training data from document understanding models Jérémie Dentan Arnaud Paran A. Shabou AAML SyDa 54 1 0 05 Jun 2024
Principled Gradient-based Markov Chain Monte Carlo for Text Generation Li Du Afra Amini Lucas Torroba Hennigen Xinyan Velocity Yu Jason Eisner Holden Lee Ryan Cotterell BDL 26 1 0 29 Dec 2023
Discrete Distribution Networks Lei Yang 31 1 0 29 Dec 2023
Controlled Text Generation for Black-box Language Models via Score-based Progressive Editor Sangwon Yu Changmin Lee Hojin Lee Sungroh Yoon 29 0 0 13 Nov 2023
Amortizing intractable inference in large language models Edward J. Hu Moksh Jain Eric Elmoznino Younesse Kaddar Guillaume Lajoie Yoshua Bengio Nikolay Malkin AIFin BDL 32 49 0 06 Oct 2023
MIMEx: Intrinsic Rewards from Masked Input Modeling Toru Lin Allan Jabri OffRL 31 6 0 15 May 2023
Planning with Sequence Models through Iterative Energy Minimization Hongyi Chen Yilun Du Yiye Chen J. Tenenbaum Patricio A. Vela 32 6 0 28 Mar 2023
Inconsistencies in Masked Language Models Tom Young Yunan Chen Yang You 22 2 0 30 Dec 2022
Converge to the Truth: Factual Error Correction via Iterative Constrained Editing Jiangjie Chen Rui Xu Wenyuan Zeng Changzhi Sun Lei Li Yanghua Xiao KELM 46 9 0 22 Nov 2022
Emergent Linguistic Structures in Neural Networks are Fragile Emanuele La Malfa Matthew Wicker Marta Kwiatkowska 22 1 0 31 Oct 2022
Multi-segment preserving sampling for deep manifold sampler Daniel Berenberg Jae Hyeon Lee S. Kelow Ji Won Park Andrew Watkins Vladimir Gligorijević Richard Bonneau Stephen Ra Kyunghyun Cho MedIm 27 5 0 09 May 2022
Mix and Match: Learning-free Controllable Text Generation using Energy Language Models Fatemehsadat Mireshghallah Kartik Goyal Taylor Berg-Kirkpatrick 38 78 0 24 Mar 2022
Quantifying Privacy Risks of Masked Language Models Using Membership Inference Attacks Fatemehsadat Mireshghallah Kartik Goyal Archit Uniyal Taylor Berg-Kirkpatrick Reza Shokri MIALM 32 152 0 08 Mar 2022
Probing BERT's priors with serial reproduction chains Takateru Yamakoshi Thomas Griffiths Robert D. Hawkins 29 12 0 24 Feb 2022
COLD Decoding: Energy-based Constrained Text Generation with Langevin Dynamics Lianhui Qin Sean Welleck Daniel Khashabi Yejin Choi AI4CE 58 144 0 23 Feb 2022
Unleashing Transformers: Parallel Token Prediction with Discrete Absorbing Diffusion for Fast High-Resolution Image Generation from Vector-Quantized Codes Sam Bond-Taylor P. Hessey Hiroshi Sasaki T. Breckon Chris G. Willcocks DiffM 27 71 0 24 Nov 2021
Deep Generative Modelling: A Comparative Review of VAEs, GANs, Normalizing Flows, Energy-Based and Autoregressive Models Sam Bond-Taylor Adam Leach Yang Long Chris G. Willcocks VLM TPM 45 485 0 08 Mar 2021