arXiv:1711.02036

Maximum Entropy Distributions: Bit Complexity and Stability

6 November 2017
D. Straszak
Nisheeth K. Vishnoi
Abstract

Maximum entropy distributions with discrete support in $m$ dimensions arise in machine learning, statistics, information theory, and theoretical computer science. While structural and computational properties of max-entropy distributions have been extensively studied, basic questions such as: Do max-entropy distributions over a large support (e.g., $2^m$) with a specified marginal vector have succinct descriptions (polynomial-size in the input description)? and: Are entropy maximizing distributions "stable" under the perturbation of the marginal vector? have resisted a rigorous resolution. Here we show that these questions are related and resolve both of them. Our main result shows a $\mathrm{poly}(m, \log 1/\varepsilon)$ bound on the bit complexity of $\varepsilon$-optimal dual solutions to the maximum entropy convex program -- for very general support sets and with no restriction on the marginal vector. Applications of this result include polynomial time algorithms to compute max-entropy distributions over several new and old polytopes for any marginal vector in a unified manner, and a polynomial time algorithm to compute the Brascamp-Lieb constant in the rank-1 case. The proof of this result allows us to show that changing the marginal vector by $\delta$ changes the max-entropy distribution in the total variation distance roughly by a factor of $\mathrm{poly}(m, \log 1/\delta)\sqrt{\delta}$ -- even when the size of the support set is exponential. Together, our results put max-entropy distributions on a mathematically sound footing -- these distributions are robust and computationally feasible models for data.
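To make the objects in the abstract concrete, here is a minimal sketch of the dual of the maximum entropy program on a toy instance. It assumes the standard dual form $\min_y \log \sum_{\alpha} \exp\langle y, \alpha - \theta\rangle$, whose minimizer $y$ describes the distribution $p_\alpha \propto \exp\langle y, \alpha\rangle$ with only $m$ numbers. The function names (`gibbs`, `solve_dual`, `total_variation`), the step size, and the iteration count are illustrative assumptions, and the brute-force enumeration of the support is only feasible for tiny $m$; this is not the algorithm from the paper.

```python
import itertools
import numpy as np

def gibbs(A, y):
    """Distribution p_a proportional to exp(<y, a>) over the rows a of A."""
    logits = A @ y
    logits -= logits.max()          # shift for numerical stability
    p = np.exp(logits)
    return p / p.sum()

def solve_dual(A, theta, steps=5000, lr=0.5):
    """Plain gradient descent on f(y) = log sum_a exp(<y, a - theta>).

    The gradient of f at y is E_p[a] - theta, with p = gibbs(A, y),
    so a stationary point matches the prescribed marginal vector.
    """
    y = np.zeros(A.shape[1])
    for _ in range(steps):
        y -= lr * (A.T @ gibbs(A, y) - theta)
    return y

def total_variation(p, q):
    """Total variation distance between two distributions on the same support."""
    return 0.5 * np.abs(p - q).sum()

# Toy instance (assumption): the support is the full hypercube {0,1}^3.
A = np.array(list(itertools.product([0, 1], repeat=3)), dtype=float)
theta = np.array([0.2, 0.5, 0.7])

y = solve_dual(A, theta)            # succinct description: m = 3 numbers
p = gibbs(A, y)                     # distribution over all 2^3 = 8 points

# Perturb the marginal vector slightly and compare the two distributions.
theta_shifted = theta + np.array([0.01, 0.0, 0.0])
p_shifted = gibbs(A, solve_dual(A, theta_shifted))
tv = total_variation(p, p_shifted)

print("dual solution y:", y)
print("recovered marginals:", A.T @ p)
print("TV distance after perturbation:", tv)
```

On the full hypercube the max-entropy distribution is a product distribution, so the dual solution can be checked against the closed form $y_i = \log\frac{\theta_i}{1-\theta_i}$. For the structured, exponentially large supports the abstract refers to, enumeration is not an option and a different way of evaluating the sum over the support would be needed.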
