The Semantic Architect: How FEAML Bridges Structured Data and LLMs for Multi-Label Tasks

Wanfu Gao; Zebin He; Jun Gao

doi:10.1609/aaai.v40i25.39265

Authors

Wanfu Gao, Jilin University
Zebin He, Jilin University
Jun Gao, Jilin University

DOI:

https://doi.org/10.1609/aaai.v40i25.39265

Abstract

Existing feature engineering methods based on large language models (LLMs) have not yet been applied to multi-label learning tasks. They lack the ability to model complex label dependencies and are not specifically adapted to the characteristics of multi-label tasks. To address these issues, we propose Feature Engineering Automation for Multi-Label Learning (FEAML), an automated feature engineering method for multi-label classification that leverages the code generation capabilities of LLMs. Metadata and label co-occurrence matrices guide the LLM to understand the relationships between data features and task objectives, and it generates high-quality features on that basis. The newly generated features are evaluated in terms of model accuracy to assess their effectiveness, while Pearson correlation coefficients are used to detect redundancy. FEAML then feeds the evaluation results back to the LLM so that code generation improves over subsequent iterations. By integrating LLMs with a feedback mechanism, FEAML realizes an efficient, interpretable and self-improving feature engineering paradigm. Empirical results on various multi-label datasets demonstrate that FEAML outperforms other feature engineering methods.
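
Two of the ingredients named in the abstract, the label co-occurrence matrix and the Pearson-based redundancy check, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `label_cooccurrence` and `is_redundant`, the 0.9 correlation threshold, and the toy data are all assumptions made for illustration, and the abstract does not state how FEAML computes or thresholds either quantity.

```python
import numpy as np

def label_cooccurrence(Y):
    """Count how often each pair of labels is active in the same sample.

    Y is a binary (n_samples, n_labels) matrix. Hypothetical helper.
    """
    Y = np.asarray(Y, dtype=int)
    # Entry (i, j) is the number of samples that carry both label i and label j;
    # the diagonal holds the per-label counts.
    return Y.T @ Y

def is_redundant(candidate, X, threshold=0.9):
    """Return True if the candidate feature is strongly correlated with
    any existing column of X. The 0.9 threshold is an assumed value."""
    candidate = np.asarray(candidate, dtype=float)
    X = np.asarray(X, dtype=float)
    for j in range(X.shape[1]):
        # Pearson correlation coefficient between the candidate and column j
        r = np.corrcoef(candidate, X[:, j])[0, 1]
        if abs(r) > threshold:
            return True
    return False

# Toy multi-label targets: 4 samples, 3 labels
Y = np.array([[1, 0, 1],
              [1, 1, 0],
              [0, 1, 0],
              [1, 0, 1]])
C = label_cooccurrence(Y)

# Toy feature matrix: 4 samples, 2 existing features
X = np.array([[1.0, 5.0],
              [2.0, 3.0],
              [3.0, 8.0],
              [4.0, 1.0]])
scaled_copy = 2.0 * X[:, 0] + 1.0             # linear in column 0, so |r| = 1
fresh = np.array([1.0, -1.0, -1.0, 1.0])      # weakly correlated with both columns

print(C)
print(is_redundant(scaled_copy, X), is_redundant(fresh, X))
```

In a loop of the kind the abstract describes, a generated feature that passes such a redundancy check would then be scored by model accuracy, and both outcomes would be reported back to the LLM for the next iteration.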