SciDataMAS: LLM-Driven MAS for Scientific Data Management (Student Abstract)
DOI:
https://doi.org/10.1609/aaai.v40i48.42275
Abstract
The management and annotation of complex, multi-modal scientific data remains a major obstacle for AI-driven research due to the poor reusability and scalability of current solutions. We propose SciDataMAS, a novel LLM-powered multi-agent system (MAS) that automates scientific data management through a structured data lake with provenance-based organization and an adaptive metadata taxonomy. The system uses specialized workflows for automated dataset creation, data insertion, and retrieval. Experiments show the system's proficiency, with modern LLMs such as GPT-5 successfully generating rich metadata schemas and filling them with high accuracy. This work provides a foundational step towards fully automated, reusable, and scalable scientific data organization, which may lead the scientific community to generate and accumulate well-annotated, AI-ready datasets.
Published
2026-03-14
How to Cite
Sachuk, A., Chukanov, V., & Pchitskaya, E. (2026). SciDataMAS: LLM-Driven MAS for Scientific Data Management (Student Abstract). Proceedings of the AAAI Conference on Artificial Intelligence, 40(48), 41375–41376. https://doi.org/10.1609/aaai.v40i48.42275
Section
AAAI Student Abstract and Poster Program