Estimating a Directed Tree for Extremes

Abstract

The Extremal River Problem has emerged as a flagship problem for causal discovery in extreme values of a network. The task is to recover a river network from only extreme flow measured at a set $V$ of stations, without any information on the stations' locations. We present QTree, a new simple and efficient algorithm to solve the Extremal River Problem that performs very well compared to existing methods on hydrology data and in simulations. QTree returns a root-directed tree and achieves almost perfect recovery on the Upper Danube network data, the existing benchmark data set, as well as on new data from the Lower Colorado River network in Texas. It can handle missing data, has an automated parameter tuning procedure, and runs in time $O(n |V|^2)$, where $n$ is the number of observations and $|V|$ the number of nodes in the graph. Furthermore, we prove that the QTree estimator is consistent under a Bayesian network model for extreme values with noise. We also assess the small sample behaviour of QTree through simulations and detail the strengths and possible limitations of QTree.
