Automating Data Governance with Generative AI

Authors

  • Linus W. Dietz, King's College London
  • Arif Wider, Hochschule für Technik und Wirtschaft Berlin
  • Simon Harrer, innoQ Deutschland GmbH; Entropy Data GmbH

DOI:

https://doi.org/10.1609/aies.v8i1.36587

Abstract

The exchange of data within and between organizations is governed by company policies and data protection laws. As policies and data flows change over time, maintaining compliance in data exchange poses a complex challenge. In federated data architectures, validating data access requests is both critical and labor-intensive. To formalize this task and enable automatic compliance checks, rule-based constraint languages can be used. However, access constraints often come from legal texts, and translating them into formal data contracts is tedious, repetitive, and prone to error. This can lead to inconsistencies and delays in staying compliant with evolving regulations. To address this, we developed Governance AI, a tool based on a large language model (LLM) that evaluates data access requests by considering relevant policies, the type of data, and the request's context. To test our approach at scale, we built an access request generator and a testing framework for computational data governance. In our evaluation of 110 access requests from two business domains, e-commerce and life insurance, we found that LLM-generated test cases were highly realistic and effective for comprehensive testing. Governance AI demonstrated a stricter approach than human experts, issuing a higher number of warnings and consistently flagging all critical cases where experts raised data sharing concerns. While the tool generated 3.6 times more warnings than human experts, further review showed that 80% of these were accurate. Our findings contribute to the automation of data governance by critically assessing the potential of generative AI in evaluating data access requests regarding legislation and internal policies.
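
As a rough illustration of the rule-based compliance checks the abstract mentions, the following is a minimal Python sketch, not the authors' Governance AI tool or its constraint language. All names here (AccessRequest, Rule, evaluate, the "pii" classification, and the example purposes) are hypothetical and chosen only for illustration. It shows how a request's data type and stated purpose might be matched against a policy rule to produce warnings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    # Hypothetical request fields; the paper's actual schema is not given in the abstract.
    requester_domain: str
    data_classification: str  # e.g. "public", "internal", "pii"
    purpose: str


@dataclass(frozen=True)
class Rule:
    # Hypothetical policy rule: data of a given classification may only be
    # accessed for one of the listed purposes.
    name: str
    classification: str
    allowed_purposes: frozenset


def evaluate(request, rules):
    """Return the names of all rules the request violates (empty list = no warnings)."""
    warnings = []
    for rule in rules:
        applies = rule.classification == request.data_classification
        if applies and request.purpose not in rule.allowed_purposes:
            warnings.append(rule.name)
    return warnings


# Illustrative policy and requests (invented for this sketch).
RULES = [
    Rule(
        name="pii-purpose-limitation",
        classification="pii",
        allowed_purposes=frozenset({"order-fulfilment", "fraud-detection"}),
    )
]

ok = AccessRequest("logistics", "pii", "order-fulfilment")
flagged = AccessRequest("marketing", "pii", "ad-targeting")

print(evaluate(ok, RULES))       # []
print(evaluate(flagged, RULES))  # ['pii-purpose-limitation']
```

Hand-written rules of this kind are exactly what the abstract describes as tedious to derive from legal texts, which is the gap an LLM-based evaluator is meant to address.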

Published

2025-10-15

How to Cite

Dietz, L. W., Wider, A., & Harrer, S. (2025). Automating Data Governance with Generative AI. Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 8(1), 760–771. https://doi.org/10.1609/aies.v8i1.36587