Mixture-of-Trees: Learning to Select and Weigh Reasoning Paths for Efficient LLM Inference
DOI:
https://doi.org/10.1609/aaai.v40i40.40677
Abstract
We introduce Mixture-of-Trees (MoT), a novel framework that integrates sparse expert activation with structured tree-based reasoning for efficient LLM inference. MoT employs a learned gating mechanism to selectively activate only the most relevant expert reasoning trees for each problem, where experts use models of varying capacities based on task complexity. The framework features three key innovations: (1) sparse expert activation through unified gating networks, (2) specialized expert trees that leverage domain-specific expertise while optimizing the quality-efficiency trade-off, and (3) collaborative debate mechanisms for conflicting solutions. Additionally, MoT includes a shared baseline tree with early stopping: activated experts perform lightweight validation and terminate early when confidence is high. Experiments across five benchmarks (GSM8K, MATH, AIME 2024, MMLU, HotpotQA) show that MoT achieves 2-7 percentage point accuracy improvements while reducing LLM calls by 37-40% compared to existing multi-path methods.
Published
2026-03-14
How to Cite
Wei, Y., Huang, Z., Lu, S., Qian, J., Qin, D., Lin, T. J., … He, L. (2026). Mixture-of-Trees: Learning to Select and Weigh Reasoning Paths for Efficient LLM Inference. Proceedings of the AAAI Conference on Artificial Intelligence, 40(40), 33854–33862. https://doi.org/10.1609/aaai.v40i40.40677
Issue
Section
AAAI Technical Track on Natural Language Processing V