Outlier Matters: Efficient Long-to-Short Reasoning via Outlier-Guided Model Merging

Authors

  • Qiyuan Zhu, The Hong Kong University of Science and Technology
  • Dezhi Li, The Hong Kong University of Science and Technology
  • Lujun Li, The Hong Kong University of Science and Technology
  • Xiaoyu Qin, Tsinghua University
  • Wei Li, University of Birmingham
  • Hao Gu, The Hong Kong University of Science and Technology
  • Hua Xu, The Hong Kong University of Science and Technology
  • Sirui Han, The Hong Kong University of Science and Technology
  • Yike Guo, The Hong Kong University of Science and Technology

DOI:

https://doi.org/10.1609/aaai.v40i41.40828

Abstract

Large Reasoning Language Models (LRMs) have recently shown remarkable performance on complex reasoning tasks, but their long reasoning chains incur substantial computational overhead. To address this challenge, we propose Outlier-aware Reasoning Conciseness Adaptive Merge (ORCA), a novel plug-and-play model merging framework that leverages outlier activation patterns to fuse base models with reasoning models. ORCA introduces three key innovations: (1) adaptive alignment, which reduces conflicts between disparate activation patterns during merging; (2) outlier-guided allocation, which assigns merging coefficients proportional to each layer's reasoning importance as indicated by its outlier concentration; and (3) dynamic probe-based adjustment, which adapts merging coefficients during inference based on input-specific activation characteristics. These strategies allow ORCA to integrate seamlessly into existing merging pipelines while producing unified models that maintain reasoning accuracy with significantly shorter responses. Comprehensive evaluation across six benchmarks using Qwen and LLaMA models shows that ORCA reduces average response length by 55% while improving accuracy by 2.4% to 5.7% over existing methods.
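
The abstract gives no formulas, so the following is only an illustrative sketch of what outlier-guided coefficient allocation could look like, not the authors' implementation. It assumes that an "outlier" is an activation whose magnitude exceeds k times the layer's mean magnitude, that coefficients are scaled linearly into a fixed range, and that merging is per-layer linear interpolation between base and reasoning weights. The function names, the threshold k, and the range [lo, hi] are all hypothetical.

```python
import numpy as np


def outlier_ratio(activations, k=3.0):
    """Fraction of entries with magnitude above k times the mean magnitude.

    Assumption: this thresholding rule is a stand-in; the paper's own
    outlier criterion is not stated in the abstract.
    """
    mag = np.abs(np.asarray(activations, dtype=float))
    return float(np.mean(mag > k * mag.mean()))


def layer_coefficients(ratios, lo=0.2, hi=0.8):
    """Map per-layer outlier ratios linearly onto [lo, hi].

    Layers with more outliers get a larger coefficient (hypothetical choice:
    a larger coefficient keeps more of the reasoning model).
    """
    r = np.asarray(ratios, dtype=float)
    span = r.max() - r.min()
    if span == 0:
        # All layers look alike: fall back to the midpoint of the range.
        return np.full_like(r, (lo + hi) / 2)
    return lo + (hi - lo) * (r - r.min()) / span


def merge_layers(base, reasoning, coeffs):
    """Per-layer linear interpolation: (1 - c) * base + c * reasoning."""
    return [(1 - c) * b + c * r for b, r, c in zip(base, reasoning, coeffs)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy calibration activations for three layers; the last has a heavy tail.
    acts = [rng.normal(size=1000), rng.normal(size=1000), rng.standard_cauchy(1000)]
    coeffs = layer_coefficients([outlier_ratio(a) for a in acts])
    base = [np.zeros(4) for _ in acts]
    reasoning = [np.ones(4) for _ in acts]
    for layer, c in zip(merge_layers(base, reasoning, coeffs), coeffs):
        print(round(float(c), 3), layer)
```

In this sketch the coefficients are computed once from calibration activations. The input-dependent adjustment described in point (3) would instead recompute or rescale them at inference time, which is not modeled here.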

Published

2026-03-14

How to Cite

Zhu, Q., Li, D., Li, L., Qin, X., Li, W., Gu, H., … Guo, Y. (2026). Outlier Matters: Efficient Long-to-Short Reasoning via Outlier-Guided Model Merging. Proceedings of the AAAI Conference on Artificial Intelligence, 40(41), 35213–35221. https://doi.org/10.1609/aaai.v40i41.40828

Section

AAAI Technical Track on Natural Language Processing VI