GrayKD: Distilling Better Knowledge from Black-box LLM via Multi-rationale Injection

Authors

  • Hyeongsoo Lim, Chung-Ang University
  • Hyung Yong Kim, 42dot
  • Jin Young Kim, Chung-Ang University
  • Min Ho Jang, Chung-Ang University
  • Eun Seo Seo, Chung-Ang University
  • Youshin Lim, 42dot
  • Shukjae Choi, 42dot
  • Jihwan Park, 42dot
  • Yunkyu Lim, 42dot
  • Hanbin Lee, 42dot
  • Byeong-Yeol Kim, 42dot
  • Ji Won Yoon, Chung-Ang University

DOI:

https://doi.org/10.1609/aaai.v40i38.40470

Abstract

Knowledge distillation (KD) is a promising compression technique for reducing the computational burden of large language models (LLMs). Depending on access to the teacher model’s internal parameters, KD is typically categorized into white-box and black-box KD. While white-box KD benefits from full access to intrinsic knowledge such as softmax distributions, black-box KD adopts a black-box LLM (e.g., GPT-4) as the teacher, which provides only text-level outputs via API calls. This limited supervision makes black-box KD generally less effective than its white-box counterpart. To bridge the gap between white-box and black-box KD, we propose GrayKD, a novel framework that can effectively distill text-level knowledge from a black-box LLM in a single-stage manner. In particular, rationales generated by the black-box LLM are injected into the student via a lightweight cross-attention module (teacher mode), enabling the model to approximate the black-box teacher’s output distribution without access to internal parameters. The student is then trained with the softmax-level knowledge provided by the teacher mode (student mode). Since both the teacher and student modes share the same backbone, the proposed teacher mode remains highly parameter-efficient, requiring only a small number of additional parameters for rationale injection. Experimental results on instruction-following tasks demonstrate that GrayKD achieves substantial performance improvements over existing KD methods.
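The abstract describes injecting teacher-generated rationales into the student through a lightweight cross-attention module, with the rationale context combined into the shared backbone's hidden states. As a rough illustration only (not the paper's implementation), the sketch below shows single-head cross-attention in plain Python: student hidden states act as queries over rationale token embeddings, and the attended context is added residually. All function names, the single-head simplification, and the residual combination are assumptions for illustration.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def cross_attention(queries, keys, values):
    """Single-head cross-attention sketch.

    queries: student hidden states (list of d-dim vectors)
    keys/values: embeddings of the rationale tokens from the black-box teacher
    Returns each query plus its attended rationale context (residual add).
    """
    d = len(keys[0])
    out = []
    for q in queries:
        # scaled dot-product scores against every rationale token
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # weighted sum of rationale value vectors
        ctx = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        # residual combination: hidden state enriched with rationale context
        out.append([qi + ci for qi, ci in zip(q, ctx)])
    return out
```

In this toy form, only the (hypothetical) projection-free attention weights are new per teacher mode; in the paper's framework the attention module is the small set of extra parameters added on top of the shared backbone.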

Published

2026-03-14

How to Cite

Lim, H., Kim, H. Y., Kim, J. Y., Jang, M. H., Seo, E. S., Lim, Y., … Yoon, J. W. (2026). GrayKD: Distilling Better Knowledge from Black-box LLM via Multi-rationale Injection. Proceedings of the AAAI Conference on Artificial Intelligence, 40(38), 31997–32005. https://doi.org/10.1609/aaai.v40i38.40470

Section

AAAI Technical Track on Natural Language Processing III