Unified Structural Factors for Transfer Learning Generalization with PAC-Bayesian Guarantees
DOI:
https://doi.org/10.1609/aaai.v40i25.39270
Abstract
Understanding when a pre-trained model generalizes well to a new task remains a key challenge in transfer learning. Classical theories bound target risk using divergences such as total variation, MMD, or Wasserstein distance, yet tasks with similar divergences often show very different transfer performance. We propose a structural framework that explains transferability through two factors: the Feature Overlap Rate (FOR), measuring how much of the target representation lies in the source-induced subspace, and the Effective Task Complexity (ETC), quantifying the entropy of latent subtasks. We derive a PAC-Bayesian bound in which target risk depends on FOR and ETC, and show that larger models attenuate their negative effects. Experiments on six GLUE transfer pairs estimate FOR and ETC from encoder representations and compare them to classical divergences. Results show that FOR and ETC together explain over 80% of transfer risk variance, while divergences fail to do so. Our findings provide a geometry-aware perspective for diagnosing and guiding transfer learning.
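The abstract's two factors can be illustrated with a minimal sketch. The paper's exact estimators are not given here, so the following is one plausible reading of the definitions: FOR as the fraction of target-feature energy captured by the top-k principal subspace of the source features, and ETC as the Shannon entropy of latent-subtask (cluster) assignments. The rank k, the synthetic data, and both helper names are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the two structural factors from the abstract.
# FOR: energy of target features inside the source-induced subspace.
# ETC: entropy of latent-subtask labels. Definitions here are assumed.
import numpy as np

def feature_overlap_rate(src_feats, tgt_feats, k=8):
    """Fraction of target representation energy in the top-k source subspace."""
    src_centered = src_feats - src_feats.mean(axis=0)
    # Top-k right singular vectors span the source-induced subspace.
    _, _, vt = np.linalg.svd(src_centered, full_matrices=False)
    basis = vt[:k].T                       # (d, k), orthonormal columns
    tgt_centered = tgt_feats - tgt_feats.mean(axis=0)
    projected = tgt_centered @ basis       # coordinates in the subspace
    return (projected ** 2).sum() / (tgt_centered ** 2).sum()

def effective_task_complexity(cluster_labels):
    """Shannon entropy (in nats) of latent-subtask assignments."""
    _, counts = np.unique(cluster_labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log(p)).sum())

# Toy data: source features live in a rank-8 subspace; the target task
# uses (almost) the same subspace, so FOR should be close to 1.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 32))
src = rng.normal(size=(200, 8)) @ W
tgt = rng.normal(size=(100, 8)) @ W + 0.05 * rng.normal(size=(100, 32))
for_val = feature_overlap_rate(src, tgt, k=8)
etc_val = effective_task_complexity(rng.integers(0, 4, size=100))
print(for_val)  # near 1 when source and target subspaces align
print(etc_val)  # at most log(4) nats for four subtasks
```

Under this reading, a high FOR with a low ETC would predict easy transfer, matching the abstract's claim that the two factors jointly explain most of the transfer-risk variance.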
Published
2026-03-14
How to Cite
Gao, Z. (2026). Unified Structural Factors for Transfer Learning Generalization with PAC-Bayesian Guarantees. Proceedings of the AAAI Conference on Artificial Intelligence, 40(25), 21252–21259. https://doi.org/10.1609/aaai.v40i25.39270
Issue
Section
AAAI Technical Track on Machine Learning II