26
52

Towards Understanding and Improving GFlowNet Training

Abstract

Generative flow networks (GFlowNets) are a family of algorithms that learn a generative policy to sample discrete objects xx with non-negative reward R(x)R(x). Learning objectives guarantee the GFlowNet samples xx from the target distribution p(x)R(x)p^*(x) \propto R(x) when loss is globally minimized over all states or trajectories, but it is unclear how well they perform with practical limits on training resources. We introduce an efficient evaluation strategy to compare the learned sampling distribution to the target reward distribution. As flows can be underdetermined given training data, we clarify the importance of learned flows to generalization and matching p(x)p^*(x) in practice. We investigate how to learn better flows, and propose (i) prioritized replay training of high-reward xx, (ii) relative edge flow policy parametrization, and (iii) a novel guided trajectory balance objective, and show how it can solve a substructure credit assignment problem. We substantially improve sample efficiency on biochemical design tasks.

View on arXiv
Comments on this paper

We use cookies and other tracking technologies to improve your browsing experience on our website, to show you personalized content and targeted ads, to analyze our website traffic, and to understand where our visitors are coming from. See our policy.