Trade-offs in Large Reasoning Models: An Empirical Analysis of Deliberative and Adaptive Reasoning over Foundational Capabilities

Authors

  • Weixiang Zhao, Harbin Institute of Technology
  • Xingyu Sui, Harbin Institute of Technology
  • Jiahe Guo, Harbin Institute of Technology
  • Yulin Hu, Harbin Institute of Technology
  • Yang Deng, Singapore Management University
  • Yanyan Zhao, Harbin Institute of Technology
  • Xuda Zhi, SERES
  • Yongbo Huang, SERES
  • Hao He, SERES
  • Wanxiang Che, Harbin Institute of Technology
  • Ting Liu, Harbin Institute of Technology
  • Bing Qin, Harbin Institute of Technology

DOI:

https://doi.org/10.1609/aaai.v40i41.40802

Abstract

Recent advancements in Large Reasoning Models (LRMs), such as OpenAI's o1/o3 and DeepSeek-R1, have demonstrated remarkable performance in specialized reasoning tasks through human-like deliberative thinking and long chain-of-thought reasoning. However, our systematic evaluation across various model families (DeepSeek, Qwen, and LLaMA) and scales (7B to 32B) reveals that acquiring these deliberative reasoning capabilities significantly reduces the foundational capabilities of LRMs, including notable declines in helpfulness and harmlessness, alongside substantially increased inference costs. Importantly, we demonstrate that adaptive reasoning (employing modes like Zero-Thinking, Less-Thinking, and Summary-Thinking) can effectively alleviate these drawbacks. Our empirical insights underline the critical need for developing more versatile LRMs capable of dynamically allocating inference-time compute according to specific task characteristics.

Published

2026-03-14

How to Cite

Zhao, W., Sui, X., Guo, J., Hu, Y., Deng, Y., Zhao, Y., … Qin, B. (2026). Trade-offs in Large Reasoning Models: An Empirical Analysis of Deliberative and Adaptive Reasoning over Foundational Capabilities. Proceedings of the AAAI Conference on Artificial Intelligence, 40(41), 34976–34984. https://doi.org/10.1609/aaai.v40i41.40802

Section

AAAI Technical Track on Natural Language Processing VI