TIV: Thought Injection via Vectors for Efficient Reasoning in Large Reasoning Models
DOI:
https://doi.org/10.1609/aaai.v40i36.40264
Abstract
Large Reasoning Models (LRMs) have recently demonstrated impressive performance across a range of reasoning tasks by generating intermediate thoughts. However, these models can suffer from overthinking: generating excessive tokens that contribute little to final accuracy while increasing inference cost. To mitigate this, we propose TIV (Thought Injection via Vectors), an innovative framework that compresses token-level reasoning into compact vectors without sacrificing performance. Rather than generating explicit thoughts, TIV injects learnable vectors into the post-attention hidden states of the final token across Transformer layers, enabling implicit and lightweight reasoning. We further introduce a two-stage reinforcement learning strategy: the first stage calibrates the model's reasoning distribution, and the second distills it into a vector-based policy optimized for both accuracy and brevity. Experiments on three reasoning benchmarks show that TIV preserves over 99% of the original accuracy while reducing output length by more than 65% on average, reaching up to 80% in some cases. Moreover, TIV consistently achieves superior trade-offs between accuracy and efficiency compared to existing methods, distinguishing itself as a state-of-the-art (SOTA) approach for efficient reasoning in LRMs.
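The injection step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the vectors are combined with the hidden states, so the additive update, the `scale` parameter, and the function names `inject_thought_vector` and `inject_across_layers` are all assumptions made for illustration, not the authors' implementation. Hidden states are modeled as plain nested lists of shape `[seq_len][d_model]`, and each layer is treated independently rather than feeding into the next as in a real Transformer.

```python
from typing import List

Matrix = List[List[float]]  # hidden states, shape [seq_len][d_model]


def inject_thought_vector(
    post_attn_hidden: Matrix,
    thought_vector: List[float],
    scale: float = 1.0,  # hypothetical knob, not taken from the paper
) -> Matrix:
    """Return a copy of the hidden states with the vector added at the final token only.

    Assumption: injection is additive. The abstract only says the vectors
    are "injected" into the post-attention hidden state of the final token.
    """
    if not post_attn_hidden:
        raise ValueError("hidden states must contain at least one token")
    if len(thought_vector) != len(post_attn_hidden[-1]):
        raise ValueError("thought vector must match the hidden size")
    # Copy every row so the caller's hidden states are left unchanged.
    out = [row[:] for row in post_attn_hidden]
    # Only the last token position receives the injected vector.
    out[-1] = [h + scale * v for h, v in zip(out[-1], thought_vector)]
    return out


def inject_across_layers(
    per_layer_hidden: List[Matrix],
    per_layer_vectors: List[List[float]],
) -> List[Matrix]:
    """Apply one vector per layer (assumed: a separate learnable vector for each layer)."""
    if len(per_layer_hidden) != len(per_layer_vectors):
        raise ValueError("need exactly one vector per layer")
    return [
        inject_thought_vector(hidden, vector)
        for hidden, vector in zip(per_layer_hidden, per_layer_vectors)
    ]
```

In this sketch the earlier token positions are untouched, which reflects the abstract's statement that only the final token's hidden state is modified; in a trained system the vectors would be learned parameters rather than fixed lists.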
Published
2026-03-14
How to Cite
Cao, Y., Shi, W., Xu, W.-J., Shen, Y., Cui, Y., Guo, H., … Xu, J. (2026). TIV: Thought Injection via Vectors for Efficient Reasoning in Large Reasoning Models. Proceedings of the AAAI Conference on Artificial Intelligence, 40(36), 30148–30155. https://doi.org/10.1609/aaai.v40i36.40264
Section
AAAI Technical Track on Natural Language Processing I