Policy-as-Code for AI Data Platforms: Enforcing Privacy, Lineage, and Access Control End-to-End
Abstract
AI data platforms increasingly blend batch and streaming pipelines, lakehouse storage, feature stores, and model serving into a single production surface. This convergence increases governance complexity: privacy constraints must survive transformations, lineage must remain provable across heterogeneous tools, and access control must be consistent from raw ingestion through feature creation to model inference. This paper proposes an end-to-end Policy-as-Code (PaC) reference architecture for AI data platforms that unifies (i) privacy policy enforcement, (ii) lineage capture and validation, and (iii) authorization and access control. The approach treats governance policies as versioned, testable, and deployable code artifacts that are compiled into platform-specific enforcement points while preserving a single source of truth. We define a policy taxonomy aligned to the data and ML lifecycles, specify where enforcement is placed across the data plane and control plane, and introduce verification hooks that prevent “policy drift” between the policies validated in CI pipelines and those enforced at runtime. The proposed design supports regulated domains and high-assurance requirements by integrating privacy budgeting, provenance-aware controls, and modern authorization languages to achieve consistent governance at scale.
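The idea of compiling one policy source into several enforcement points, with a verification hook that detects drift, can be illustrated with a minimal sketch. This is not the paper's implementation: the `Policy` record, the two compile targets, and the fingerprint-based drift check are all hypothetical names and design choices introduced here for illustration only.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical single-source-of-truth policy record (illustrative only)."""
    dataset: str
    column: str
    allowed_roles: tuple
    version: str


def fingerprint(policy: Policy) -> str:
    """Return a stable SHA-256 hash of the policy's canonical JSON form."""
    canonical = json.dumps(asdict(policy), sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def compile_for_warehouse(policy: Policy) -> dict:
    """Assumed data-plane target: a column-masking rule for a warehouse."""
    return {
        "target": "warehouse",
        "mask_column": f"{policy.dataset}.{policy.column}",
        "unmasked_roles": sorted(policy.allowed_roles),
        "policy_fingerprint": fingerprint(policy),
    }


def compile_for_serving(policy: Policy) -> dict:
    """Assumed serving target: a feature allow-list for an inference gateway."""
    return {
        "target": "serving",
        "feature": policy.column,
        "roles": sorted(policy.allowed_roles),
        "policy_fingerprint": fingerprint(policy),
    }


def has_drifted(source: Policy, deployed: dict) -> bool:
    """Verification hook: True if the deployed artifact no longer matches the source."""
    return deployed.get("policy_fingerprint") != fingerprint(source)


if __name__ == "__main__":
    policy = Policy("customers", "email", ("analyst", "dpo"), "1.0.0")
    deployed = compile_for_warehouse(policy)
    print(has_drifted(policy, deployed))

    # A later policy revision that has not been redeployed is flagged as drift.
    revised = Policy("customers", "email", ("dpo",), "1.1.0")
    print(has_drifted(revised, deployed))
```

Embedding the source fingerprint in every compiled artifact is one simple way to let a CI job or runtime probe compare what is deployed against what is versioned; a real platform would likely also sign the artifacts and record the comparison in lineage metadata.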
How to Cite This Article
Gunda Vamshi Krishna (2024). Policy-as-Code for AI Data Platforms: Enforcing Privacy, Lineage, and Access Control End-to-End. Global Multidisciplinary Perspectives Journal (GMPJ), 1(6), 228-232. DOI: https://doi.org/10.54660/GMPJ.2024.1.6.228-232