Distilling Structured Rationale from Large Language Models to Small Language Models for Abstractive Summarization

Authors

  • Linyong Wang Northwestern Polytechnical University, Xi’an
  • Lianwei Wu Northwestern Polytechnical University, Xi’an
  • Shaoqi Song Northwestern Polytechnical University, Xi’an
  • Yaxiong Wang Hefei University of Technology, Hefei
  • Cuiyun Gao Harbin Institute of Technology, Shenzhen
  • Kang Wang Northwestern Polytechnical University, Xi’an

DOI:

https://doi.org/10.1609/aaai.v39i24.34727

Abstract

Large Language Models (LLMs) have permeated various Natural Language Processing (NLP) tasks. For summarization tasks, LLMs can generate well-structured rationales consisting of Essential Aspects (EA), Associated Sentences (AS), and Triple Entity Relations (TER). These rationales can guide smaller models (≤1B parameters) to produce better summaries. However, the high deployment costs of LLMs (≥70B parameters), such as substantial storage space and high computing requirements, limit their utilization in resource-constrained environments. Furthermore, effectively distilling these structured rationales from LLMs into Small Language Models (SLMs) remains a challenge. To address this, we propose the LLM-based Structured Rationale-guided Multi-view Weak-gated Fusion framework (LSR-MWF). The framework initially employs LLMs to mine structured rationales from a document, considering multiple viewpoints such as EA, AS, and TER. Then, it develops a multi-step summary generation evaluation strategy to select high-quality structured rationales. Subsequently, it aligns the SLM with these rationales using additional modules organized in a hierarchical structure. Finally, the framework integrates the features output by these modules with the original abstractive model through a weak-gated mechanism. Experimental results on the two publicly available CNN/DailyMail and XSum datasets show that our method improves the performance of the abstractive model, outperforming baselines by 11.2% and 5.8%, respectively. In addition, our method improves the interpretability of summary generation from the viewpoints of EA, AS, and TER.
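As a rough illustration of the weak-gated fusion step described above, the following sketch fuses rationale-module features (e.g. from an EA, AS, or TER module) into the abstractive model's hidden states through a down-scaled sigmoid gate. The function name, gate form, and the scaling factor `alpha` are assumptions for illustration only; the paper's exact formulation may differ.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def weak_gated_fusion(h_base, h_rationale, W_g, b_g, alpha=0.1):
    """Fuse rationale-module features into the base abstractive model's
    hidden states via a weak (down-scaled) gate.

    h_base:      (seq_len, d) hidden states of the abstractive model
    h_rationale: (seq_len, d) features from one rationale module (EA/AS/TER)
    W_g, b_g:    gate parameters of shapes (2*d, d) and (d,)
    alpha:       small scalar keeping the rationale contribution weak
    """
    # Gate conditioned on both feature streams (hypothetical form)
    g = sigmoid(np.concatenate([h_base, h_rationale], axis=-1) @ W_g + b_g)
    # Weakly gated residual update of the base features
    return h_base + alpha * g * h_rationale

# Toy usage with random parameters
rng = np.random.default_rng(0)
seq_len, d = 4, 8
h_b = rng.standard_normal((seq_len, d))
h_r = rng.standard_normal((seq_len, d))
W_g = rng.standard_normal((2 * d, d))
b_g = np.zeros(d)
fused = weak_gated_fusion(h_b, h_r, W_g, b_g)
print(fused.shape)  # (4, 8)
```

With `alpha = 0` the fusion reduces to the base model's features unchanged, which is the sense in which the gate is "weak": the rationale modules perturb, rather than replace, the original abstractive representations.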

Published

2025-04-11

How to Cite

Wang, L., Wu, L., Song, S., Wang, Y., Gao, C., & Wang, K. (2025). Distilling Structured Rationale from Large Language Models to Small Language Models for Abstractive Summarization. Proceedings of the AAAI Conference on Artificial Intelligence, 39(24), 25389-25397. https://doi.org/10.1609/aaai.v39i24.34727

Issue

Section

AAAI Technical Track on Natural Language Processing III