Unified Structural Factors for Transfer Learning Generalization with PAC-Bayesian Guarantees
DOI:
https://doi.org/10.1609/aaai.v40i25.39270
Abstract
Understanding when a pre-trained model generalizes well to a new task remains a key challenge in transfer learning. Classical theories bound target risk using divergences such as total variation, MMD, or Wasserstein distance, yet tasks with similar divergences often show very different transfer performance. We propose a structural framework that explains transferability through two factors: the Feature Overlap Rate (FOR), measuring how much of the target representation lies in the source-induced subspace, and the Effective Task Complexity (ETC), quantifying the entropy of latent subtasks. We derive a PAC-Bayesian bound in which target risk depends on FOR and ETC, and show that larger models attenuate their negative effects. Experiments on six GLUE transfer pairs estimate FOR and ETC from encoder representations and compare them to classical divergences. Results show that FOR and ETC together explain over 80% of transfer risk variance, while divergences fail to do so. Our findings provide a geometry-aware perspective for diagnosing and guiding transfer learning.
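The abstract's two factors can be illustrated with a minimal sketch. The paper's exact estimators are not given here, so the following is one plausible reading of the definitions: FOR as the fraction of target-feature energy captured by the top-k principal subspace of the source features, and ETC as the Shannon entropy of latent-subtask (cluster) assignments. The rank k, the synthetic data, and both helper names are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the two structural factors from the abstract.
# FOR: energy of target features inside the source-induced subspace.
# ETC: entropy of latent-subtask labels. Definitions here are assumed.
import numpy as np

def feature_overlap_rate(src_feats, tgt_feats, k=8):
    """Fraction of target representation energy in the top-k source subspace."""
    src_centered = src_feats - src_feats.mean(axis=0)
    # Top-k right singular vectors span the source-induced subspace.
    _, _, vt = np.linalg.svd(src_centered, full_matrices=False)
    basis = vt[:k].T                       # (d, k), orthonormal columns
    tgt_centered = tgt_feats - tgt_feats.mean(axis=0)
    projected = tgt_centered @ basis       # coordinates in the subspace
    return (projected ** 2).sum() / (tgt_centered ** 2).sum()

def effective_task_complexity(cluster_labels):
    """Shannon entropy (in nats) of latent-subtask assignments."""
    _, counts = np.unique(cluster_labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log(p)).sum())

# Toy data: source features live in a rank-8 subspace; the target task
# uses (almost) the same subspace, so FOR should be close to 1.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 32))
src = rng.normal(size=(200, 8)) @ W
tgt = rng.normal(size=(100, 8)) @ W + 0.05 * rng.normal(size=(100, 32))
for_val = feature_overlap_rate(src, tgt, k=8)
etc_val = effective_task_complexity(rng.integers(0, 4, size=100))
print(for_val)  # near 1 when source and target subspaces align
print(etc_val)  # at most log(4) nats for four subtasks
```

Under this reading, a high FOR with a low ETC would predict easy transfer, matching the abstract's claim that the two factors jointly explain most of the transfer-risk variance.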
Published
2026-03-14
How to Cite
Gao, Z. (2026). Unified Structural Factors for Transfer Learning Generalization with PAC-Bayesian Guarantees. Proceedings of the AAAI Conference on Artificial Intelligence, 40(25), 21252–21259. https://doi.org/10.1609/aaai.v40i25.39270
Issue
Section
AAAI Technical Track on Machine Learning II