Hypothesis-Driven Reasoning for Large Language Models

Authors

  • Aakash Kumar Agarwal Indian Institute of Technology, Bombay, India
  • Moyuru Yamada Fujitsu Research of India, Bangalore, India

DOI:

https://doi.org/10.1609/aaai.v40i3.37146

Abstract

This paper tackles a fundamental failure of Large Language Models (LLMs): they cannot solve new tasks when prompted with a sufficient, yet overly complex, set of multi-modal episodes. This failure stems from the model's inability to distill underlying patterns from the noisy experiences. We propose Hypothesis-Driven Reasoning (HDR), a framework that enhances LLM reasoning by building an explicit semantic memory—a set of hypotheses induced from the multi-modal episodes. HDR employs a two-stage pipeline: it first extracts potential factors from the episodes and then iteratively refines hypotheses through a generate-verify loop over those factors. We first empirically demonstrate this failure and the potential of semantic memory, showing that oracle hypotheses can boost accuracy from 35.3% to 92.0% on a novel task we designed. We then evaluate HDR, achieving near-oracle performance and significantly outperforming baselines, especially on smaller models. This paper validates a shift from unstructured in-context recall to explicit knowledge abstraction for robust reasoning.
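The two-stage pipeline described above can be sketched in a toy symbolic setting. This is a minimal illustration, not the authors' implementation: the LLM-driven factor extraction and generate-verify loop are replaced by simple enumeration and filtering, and the episode format, attribute names, and all function names are assumptions made for illustration.

```python
# Hypothetical sketch of HDR's two-stage pipeline (illustrative, not the paper's code).
# Stage 1: extract candidate factors from episodes.
# Stage 2: generate candidate hypotheses and keep only those verified
# against every episode.

from itertools import combinations

# Toy "episodes": multi-modal observations flattened to (attributes, label) pairs.
EPISODES = [
    ({"shape": "circle", "color": "red"}, True),
    ({"shape": "circle", "color": "blue"}, True),
    ({"shape": "square", "color": "red"}, False),
]

def extract_factors(episodes):
    """Stage 1: collect candidate (attribute, value) factors from all episodes."""
    factors = set()
    for attrs, _ in episodes:
        factors.update(attrs.items())
    return factors

def generate_hypotheses(factors, max_size=2):
    """Propose conjunctions of factors as candidate hypotheses."""
    for size in range(1, max_size + 1):
        yield from combinations(sorted(factors), size)

def verify(hypothesis, episodes):
    """A hypothesis survives iff it predicts every episode's label."""
    return all(
        all(attrs.get(k) == v for k, v in hypothesis) == label
        for attrs, label in episodes
    )

def hdr(episodes):
    """Run both stages: extract factors, then generate and verify hypotheses."""
    factors = extract_factors(episodes)
    return [h for h in generate_hypotheses(factors) if verify(h, episodes)]

print(hdr(EPISODES))  # → [(('shape', 'circle'),)]
```

Here the surviving hypothesis ("the label is true iff the shape is a circle") plays the role of the explicit semantic memory: once induced, it can be handed to the model in place of the raw, noisy episodes.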

Published

2026-03-14

How to Cite

Agarwal, A. K., & Yamada, M. (2026). Hypothesis-Driven Reasoning for Large Language Models. Proceedings of the AAAI Conference on Artificial Intelligence, 40(3), 1686–1693. https://doi.org/10.1609/aaai.v40i3.37146

Section

AAAI Technical Track on Cognitive Modeling & Cognitive Systems