Rademacher Complexity for Distributionally Robust Learning

Authors

  • Zhengyu Zhou Wuhan University
  • Weiwei Liu Wuhan University

DOI:

https://doi.org/10.1609/aaai.v40i34.40147

Abstract

The goal of distributionally robust learning is to learn models capable of performing well against distributional shifts, such as latent heterogeneous subpopulations, unknown covariate shifts, or unmodeled temporal effects. Recently, Duchi and Namkoong (2021) have proven an upper bound for the excess risk of distributionally robust learning through the lens of covering number argument. However, there are situations where the covering argument fails. This motivates us to study the generalization bound through the lens of Rademacher complexity. More specifically, we consider the Cressie-Read divergence, f sub k of t is proportional to t to the k minus one. Our theoretical results indicate that the excess risk is of the order big O sub P of n to the negative one over two k star, where k star equals k over k minus one. The decay rate of the excess risk increases with increasing k. As illustrative examples, we consider three learning settings: 1) linear classifier; 2) Gaussian reproducing kernel Hilbert space; 3) one-hidden-layer networks. The empirical results validate our theoretical findings.

Published

2026-03-14

How to Cite

Zhou, Z., & Liu, W. (2026). Rademacher Complexity for Distributionally Robust Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 40(34), 29098–29106. https://doi.org/10.1609/aaai.v40i34.40147

Issue

Section

AAAI Technical Track on Machine Learning XI