Leveraging First and Zeroth-Order Gradient to Address Imbalanced Black-Box Prompt Tuning via Minimax Optimization

Authors

  • Haozhen Zhang, School of Artificial Intelligence, Jilin University, Changchun, Jilin, China
  • Zhaogeng Liu, School of Artificial Intelligence, Jilin University, Changchun, Jilin, China
  • Bin Gu, School of Artificial Intelligence, Jilin University, Changchun, Jilin, China
  • Yi Chang, School of Artificial Intelligence, Jilin University, Changchun, Jilin, China; International Center of Future Science, Jilin University, Changchun, Jilin, China; Engineering Research Center of Knowledge-Driven Human-Machine Intelligence, MOE, Changchun, Jilin, China

DOI:

https://doi.org/10.1609/aaai.v39i21.34397

Abstract

Black-box prompt tuning has become a prevalent parameter-efficient paradigm that leverages the capabilities of large language models (LLMs) for customized applications in specific downstream tasks. In practical scenarios, downstream tasks frequently involve heavily imbalanced data distributions. Such imbalances tend to impair performance, causing severe performance collapse on minority classes. Conducting effective imbalanced black-box prompt tuning to mitigate the adverse effects of imbalanced data distributions on prompt performance remains a significant challenge. In this paper, we propose black-box prompt tuning with first- and zeroth-order gradients (BPT-FZG) to handle imbalanced data. Specifically, BPT-FZG introduces AUC maximization as the objective for prompt tuning and equivalently formulates it as a nonconvex-concave saddle point problem, avoiding the construction of sample pairs from opposite classes. Concretely, BPT-FZG optimizes the latent representation of the continuous prompt in a low-dimensional subspace with the AUC loss, alternately leveraging first- and zeroth-order gradients to update the parameters. Furthermore, we establish a theoretical convergence guarantee for BPT-FZG under common assumptions, showing that our method can find a stationary point of the objective function. Our experiments on RoBERTa-large, GPT2-XL, and Llama3 show that BPT-FZG achieves improvements on various imbalanced datasets, demonstrating the effectiveness of our method.
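The abstract's recipe — a pairwise-free minimax AUC surrogate, zeroth-order (query-only) updates for the low-dimensional latent prompt, and first-order updates for the auxiliary primal/dual variables — can be illustrated with a small sketch. This is not the paper's algorithm: it is a toy instance, assuming a linear stand-in scorer in place of the frozen LLM, the square-loss min-max AUC surrogate of Ying et al. (2016), a fixed random projection for the subspace, and made-up step sizes; all names (`zo_grad`, `auc_minimax`, `score`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def zo_grad(f, z, mu=1e-3, q=50):
    """Two-point Gaussian-smoothing estimate of grad f(z) using only
    function (loss) queries -- no backpropagation through f."""
    g = np.zeros_like(z)
    for _ in range(q):
        u = rng.standard_normal(z.shape)
        g += (f(z + mu * u) - f(z - mu * u)) / (2.0 * mu) * u
    return g / q

def auc_minimax(s, y, a, b, alpha, p):
    """Pairwise-free min-max AUC surrogate (Ying et al., 2016):
    minimized over (s, a, b), maximized over alpha; p = positive ratio."""
    pos, neg = y == 1, y == 0
    loss = (1 - p) * np.mean((s[pos] - a) ** 2)
    loss += p * np.mean((s[neg] - b) ** 2)
    loss += 2 * (1 + alpha) * (p * np.mean(s[neg]) - (1 - p) * np.mean(s[pos]))
    loss -= p * (1 - p) * alpha ** 2
    return loss

# Toy imbalanced task: roughly 10% positives, scored by a linear "black box".
n, D, dz = 200, 20, 5
X = rng.standard_normal((n, D))
w_true = rng.standard_normal(D)
y = (X @ w_true > 1.25 * np.linalg.norm(w_true)).astype(int)
p = y.mean()

A = rng.standard_normal((D, dz)) / np.sqrt(dz)  # fixed random subspace projection
score = lambda z: X @ (A @ z)                   # queried, never differentiated

z = np.zeros(dz)               # latent prompt in the low-dimensional subspace
a, b, alpha = 0.0, 0.0, 0.0    # AUC auxiliary (primal) and dual variables

for _ in range(100):
    # zeroth-order descent step on the latent prompt (black-box queries only)
    z -= 0.05 * zo_grad(lambda v: auc_minimax(score(v), y, a, b, alpha, p), z)
    # exact first-order steps: descent on (a, b), ascent on alpha
    s = score(z)
    a -= 0.1 * 2 * (1 - p) * (a - s[y == 1].mean())
    b -= 0.1 * 2 * p * (b - s[y == 0].mean())
    alpha += 0.1 * 2 * (p * s[y == 0].mean() - (1 - p) * s[y == 1].mean()
                        - p * (1 - p) * alpha)

def auc(s, y):  # pairwise AUC, ties count as 0.5
    sp, sn = s[y == 1][:, None], s[y == 0][None, :]
    return float(((sp > sn) + 0.5 * (sp == sn)).mean())

auc_final = auc(score(z), y)
```

The saddle-point form matters because the surrogate decomposes over single samples, so no positive/negative pairs are ever materialized; the only expensive operations are the `2q` loss queries per zeroth-order step on the prompt, while `(a, b, alpha)` are cheap scalar updates.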

Published

2025-04-11

How to Cite

Zhang, H., Liu, Z., Gu, B., & Chang, Y. (2025). Leveraging First and Zeroth-Order Gradient to Address Imbalanced Black-Box Prompt Tuning via Minimax Optimization. Proceedings of the AAAI Conference on Artificial Intelligence, 39(21), 22407–22415. https://doi.org/10.1609/aaai.v39i21.34397

Section

AAAI Technical Track on Machine Learning VII