We revisit the Online Convex Optimization problem with adversarial constraints (COCO) where, in each round, a learner is presented with a convex cost function and a convex constraint function, both of which may be chosen adversarially. The learner selects actions from a convex decision set in an online fashion, with the goal of minimizing both regret and the cumulative constraint violation (CCV) over a horizon of $T$ rounds. The best-known policy for this problem achieves $O(\sqrt{T})$ regret and $\tilde{O}(\sqrt{T})$ CCV. In this paper, we present a surprising improvement that achieves a significantly smaller CCV by trading it off with regret. Specifically, for any bounded convex cost and constraint functions, we propose an online policy that achieves $\tilde{O}(\sqrt{dT} + T^{\beta})$ regret and $\tilde{O}(d T^{1-\beta})$ CCV, where $d$ is the dimension of the decision set and $\beta$ is a tunable parameter. We achieve this result by first considering the special case of the $\textsf{Expert}$ problem, where the decision set is a probability simplex and the cost and constraint functions are linear. Leveraging a new adaptive small-loss regret bound, we propose an efficient policy for the $\textsf{Expert}$ problem that attains $O(\sqrt{T \log N} + T^{\beta})$ regret and $\tilde{O}(T^{1-\beta} \log N)$ CCV, where $N$ is the number of experts. The original problem is then reduced to the $\textsf{Expert}$ problem via a covering argument. Finally, with an additional smoothness assumption, we propose an efficient gradient-based policy attaining $O(\sqrt{T} + T^{\beta})$ regret and $\tilde{O}(T^{1-\beta})$ CCV.
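As a rough illustration of the stated trade-off (taking the reconstructed bounds above at face value, with $\beta$ the tunable parameter of the general policy), choosing $\beta = \tfrac{1}{2}$ recovers the familiar $\sqrt{T}$-type scaling, while a larger $\beta$ pushes the CCV polynomially below $\sqrt{T}$ at the price of super-$\sqrt{T}$ regret:
\[
\beta = \tfrac{1}{2}: \quad \text{Regret} = \tilde{O}(\sqrt{dT}), \qquad \text{CCV} = \tilde{O}(d\sqrt{T}),
\]
\[
\beta = \tfrac{2}{3}: \quad \text{Regret} = \tilde{O}(\sqrt{dT} + T^{2/3}), \qquad \text{CCV} = \tilde{O}(d\,T^{1/3}).
\]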
@article{sinha2025_2505.06709,
  title   = {Beyond $\tilde{O}(\sqrt{T})$ Constraint Violation for Online Convex Optimization with Adversarial Constraints},
  author  = {Abhishek Sinha and Rahul Vaze},
  journal = {arXiv preprint arXiv:2505.06709},
  year    = {2025}
}