Difficulty-Aware Learning Curve Extrapolation
DOI:
https://doi.org/10.1609/aaai.v40i27.39467
Abstract
Learning Curve Extrapolation (LCE) is a critical technique for accelerating automated machine learning by terminating unpromising training runs early. Recent state-of-the-art methods have improved predictive accuracy by incorporating contextual information, such as neural network architecture. However, these approaches, whether context-agnostic or architecture-aware, still operate under the implicit assumption of a uniform task landscape. They overlook a pivotal, complementary factor: the intrinsic difficulty of the learning task itself. This oversight leads to significant performance degradation, especially for tasks whose learning dynamics diverge from the model's priors. In this work, we argue that task difficulty is a crucial yet neglected dimension for robust LCE. We introduce Difficulty-Aware Learning Curve Extrapolation (DA-LCE), which explicitly conditions its predictions on task complexity. Our core contributions are threefold: (1) We propose a transparent, rule-based method to quantify task difficulty from early learning curve dynamics, eliminating the need for external meta-features. (2) We design a novel data generation pipeline using conditional diffusion models to create high-fidelity, difficulty-conditioned synthetic training data. (3) We introduce a Transformer-based predictor that leverages difficulty information to achieve superior accuracy across diverse benchmarks. Extensive experiments demonstrate that our approach significantly outperforms both difficulty-agnostic and architecture-aware baselines, with task difficulty emerging as a powerful conditioning signal whose impact matches or exceeds that of model architecture.
Published
2026-03-14
How to Cite
Li, M., & Zhao, P. (2026). Difficulty-Aware Learning Curve Extrapolation. Proceedings of the AAAI Conference on Artificial Intelligence, 40(27), 23021-23029. https://doi.org/10.1609/aaai.v40i27.39467
Section
AAAI Technical Track on Machine Learning IV