The Semantic Architect: How FEAML Bridges Structured Data and LLMs for Multi-Label Tasks

Authors

  • Wanfu Gao, Jilin University
  • Zebin He, Jilin University
  • Jun Gao, Jilin University

DOI:

https://doi.org/10.1609/aaai.v40i25.39265

Abstract

Existing feature engineering methods based on large language models (LLMs) have not yet been applied to multi-label learning tasks. They lack the ability to model complex label dependencies and are not adapted to the characteristics of multi-label tasks. To address these issues, we propose Feature Engineering Automation for Multi-Label Learning (FEAML), an automated feature engineering method for multi-label classification that leverages the code generation capabilities of LLMs. Metadata and label co-occurrence matrices guide the LLM to understand the relationships between data features and task objectives, and it generates high-quality features on that basis. The newly generated features are evaluated by model accuracy to assess their effectiveness, while Pearson correlation coefficients are used to detect redundancy. FEAML then feeds the evaluation results back to the LLM to drive continuous optimization of code generation in subsequent iterations. By integrating LLMs with a feedback mechanism, FEAML realizes an efficient, interpretable and self-improving feature engineering paradigm. Empirical results on various multi-label datasets demonstrate that FEAML outperforms other feature engineering methods.
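
The two statistical ingredients named in the abstract, the label co-occurrence matrix and the Pearson redundancy check, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`label_cooccurrence`, `pearson`, `is_redundant`), the 0.95 redundancy threshold, and the handling of constant columns are assumptions chosen for illustration, and the LLM prompting and accuracy-based evaluation steps are omitted.

```python
import math


def label_cooccurrence(Y):
    """Count how often each pair of labels is active in the same sample.

    Y is a list of binary label rows. Entry [i][j] of the result is the number
    of samples where labels i and j are both 1; the diagonal holds label counts.
    """
    n_labels = len(Y[0])
    C = [[0] * n_labels for _ in range(n_labels)]
    for row in Y:
        active = [j for j, v in enumerate(row) if v]
        for i in active:
            for j in active:
                C[i][j] += 1
    return C


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # assumption: a constant column is treated as uncorrelated
    return cov / (sx * sy)


def is_redundant(candidate, existing_features, threshold=0.95):
    """Flag a candidate whose |r| with any existing feature reaches the threshold.

    The threshold value is an illustrative assumption, not taken from the paper.
    """
    return any(abs(pearson(candidate, f)) >= threshold for f in existing_features)


# Toy data: 4 samples, 3 labels.
Y = [[1, 1, 0], [1, 0, 0], [0, 1, 1], [1, 1, 0]]
cooc = label_cooccurrence(Y)

# A candidate that is an exact multiple of an existing feature is redundant.
existing = [[1, 2, 3, 4]]
scaled_copy = [2, 4, 6, 8]
unrelated = [1, 0, 0, 1]
```

In a pipeline of the kind the abstract describes, a matrix like `cooc` would be serialized into the prompt alongside the dataset metadata, and `is_redundant` would run on each generated feature before its accuracy is measured.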

Published

2026-03-14

How to Cite

Gao, W., He, Z., & Gao, J. (2026). The Semantic Architect: How FEAML Bridges Structured Data and LLMs for Multi-Label Tasks. Proceedings of the AAAI Conference on Artificial Intelligence, 40(25), 21207–21215. https://doi.org/10.1609/aaai.v40i25.39265

Section

AAAI Technical Track on Machine Learning II