Scaling-Up Robust Gradient Descent Techniques

Matthew J. Holland

doi:10.1609/aaai.v35i9.16940

Authors

Matthew J. Holland, Osaka University

DOI:

https://doi.org/10.1609/aaai.v35i9.16940

Keywords:

Learning Theory

Abstract

We study a scalable alternative to robust gradient descent (RGD) techniques that can be used when losses and/or gradients can be heavy-tailed, though this will be unknown to the learner. The core technique is simple: instead of trying to robustly aggregate gradients at each step, which is costly and leads to sub-optimal dimension dependence in risk bounds, we choose a candidate which does not diverge too far from the majority of cheap stochastic sub-processes run over partitioned data. This lets us retain the formal strength of RGD methods at a fraction of the cost.
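The selection idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the squared-error loss, the single-pass SGD sub-process, and the specific "smallest majority ball" selection rule are all assumptions chosen for illustration, and the function names `sgd_least_squares`, `select_candidate`, and `divide_and_select` are hypothetical.

```python
import math
import random


def sgd_least_squares(xs, ys, step=0.01, seed=0):
    """Run one pass of plain SGD for squared error over one data partition."""
    rng = random.Random(seed)
    w = [0.0] * len(xs[0])
    order = list(range(len(ys)))
    rng.shuffle(order)
    for i in order:
        residual = sum(wj * xj for wj, xj in zip(w, xs[i])) - ys[i]
        w = [wj - step * residual * xj for wj, xj in zip(w, xs[i])]
    return w


def select_candidate(candidates):
    """Return the candidate whose smallest ball holding a strict majority
    of all candidates (itself included) has the smallest radius.

    Assumed selection rule, used here only to illustrate "not diverging
    too far from the majority".
    """
    k = len(candidates)
    best, best_radius = None, math.inf
    for c in candidates:
        dists = sorted(math.dist(c, other) for other in candidates)
        # dists[0] is 0 (distance to itself), so k // 2 + 1 candidates
        # lie within this radius, which is always more than half of k.
        radius = dists[k // 2]
        if radius < best_radius:
            best, best_radius = c, radius
    return best


def divide_and_select(xs, ys, k=5, step=0.01, seed=0):
    """Split the data into k disjoint subsets, run SGD on each, pick one output."""
    rng = random.Random(seed)
    order = list(range(len(ys)))
    rng.shuffle(order)
    parts = [order[j::k] for j in range(k)]  # disjoint index sets
    candidates = [
        sgd_least_squares([xs[i] for i in p], [ys[i] for i in p], step, seed + j + 1)
        for j, p in enumerate(parts)
    ]
    return select_candidate(candidates)
```

In this sketch each sub-process touches only its own partition, and the selection step compares the k candidates pairwise, so its cost depends on k and the dimension rather than on the sample size. No per-step robust aggregation of gradients is performed.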

Published

2021-05-18

How to Cite

Holland, M. J. (2021). Scaling-Up Robust Gradient Descent Techniques. Proceedings of the AAAI Conference on Artificial Intelligence, 35(9), 7694–7701. https://doi.org/10.1609/aaai.v35i9.16940

Issue

Vol. 35 No. 9: AAAI-21 Technical Tracks 9

Section

AAAI Technical Track on Machine Learning II
