Description
@sandlbn and I would like to propose a new SIG based on the work we've been doing on ML model lifecycle provenance and transparency. We welcome feedback on our proposal, as well as interested participants!
Creation of a new Special Interest Group (SIG) at Sandbox stage
End-to-End (E2E) Model Lifecycle Provenance
Proposed focus, intent, goals, and/or deliverables
Focus / Mission
Model signing uses digital signatures as an effective way to detect tampering with ML models after publication to model hubs, but it does not track model transformations at each lifecycle stage before or after the model is published. To capture ML model provenance across its entire lifecycle, we must address several challenges:
- Heterogeneous pipelines: Each model lifecycle stage runs an ML pipeline using different software dependencies, ML frameworks, execution environments and target platforms. This requires a way to capture heterogeneous information about pipeline configuration, inputs and outputs.
- Cross-pipeline tracking: Since each lifecycle stage involves different stakeholders, end-to-end provenance must link each individual stage across pipelines in a way that allows for pipeline order and operational metadata to be cryptographically validated.
- Pipeline integration: Collecting information about pipeline operations and systems requires tight integration with the pipelines, increasing the cost of adoption.
This SIG addresses these challenges by developing a pipeline-agnostic framework for attesting and validating end-to-end model lifecycle provenance. The work in this SIG intends to build upon the Atlas framework for ML lifecycle provenance and transparency and its implementation in the Atlas CLI, which currently supports OMS-compliant C2PA metadata and standard Intel TDX hardware attestation.
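
To make the cross-pipeline tracking challenge concrete, the following is a minimal sketch of how per-stage records could be hash-linked so that their order can be checked. It is not the Atlas format or the Atlas CLI API: `StageRecord`, `append_stage`, and `chain_is_linked` are hypothetical names, the digests are placeholder strings, and digital signatures over each record are omitted for brevity.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class StageRecord:
    """Hypothetical provenance record for one lifecycle stage."""
    stage: str
    input_digests: Tuple[str, ...]
    output_digest: str
    prev_record_digest: Optional[str]  # None for the first stage


def record_digest(rec: StageRecord) -> str:
    # Canonical JSON (sorted keys) so the digest is reproducible.
    payload = json.dumps(asdict(rec), sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_stage(chain: List[StageRecord], stage: str,
                 input_digests: Sequence[str],
                 output_digest: str) -> List[StageRecord]:
    # Each new record commits to the digest of the previous record.
    prev = record_digest(chain[-1]) if chain else None
    rec = StageRecord(stage, tuple(input_digests), output_digest, prev)
    return chain + [rec]


def chain_is_linked(chain: Sequence[StageRecord]) -> bool:
    # Recompute every link; a removed, reordered, or edited record
    # breaks the link stored in the record that follows it.
    for i, rec in enumerate(chain):
        expected = record_digest(chain[i - 1]) if i > 0 else None
        if rec.prev_record_digest != expected:
            return False
    return True
```

In this sketch, dropping or editing an intermediate record changes the digest that the next record committed to, so the break is detectable. A real design would additionally sign each record so that stakeholders in different pipelines can be authenticated.
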
Goals
- Enable pipeline-agnostic attestation of any model transformation that occurs throughout a model’s lifecycle.
- Facilitate model producer and consumer validation of a model’s attested lineage in order to detect unintended/malicious changes to the expected stages of the lifecycle (e.g., pipelines operating out of order, or being omitted), from initial data processing through model deployment and further refinement.
- Explore customizable pipeline metadata collection that incrementally addresses various security use cases (e.g., SLSA Build Provenance, additional pipeline run-time logging, fine-grained pipeline software stack, and/or pipeline compute environment attestation).
- Evaluate different trusted execution environments (TEE) for hardware-based hardening and attestation of model pipelines at run-time.
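
As an illustration of the lineage validation goal, here is a minimal sketch of a check that compares the attested stage sequence against an expected one and reports omitted, unexpected, or out-of-order stages. The function name `check_stage_order` and the stage names are hypothetical, and the sketch assumes each stage appears at most once; it does not represent an in-toto policy or any existing Atlas interface.

```python
from typing import List, Sequence


def check_stage_order(expected: Sequence[str],
                      attested: Sequence[str]) -> List[str]:
    """Return a list of problems; an empty list means the lineage matches."""
    problems: List[str] = []

    # Stages the policy requires but no attestation covers.
    for stage in expected:
        if stage not in attested:
            problems.append(f"missing stage: {stage}")

    # Stages that were attested but are not part of the policy.
    for stage in attested:
        if stage not in expected:
            problems.append(f"unexpected stage: {stage}")

    # Compare relative order of the stages common to both sequences.
    attested_common = [s for s in attested if s in expected]
    expected_common = [s for s in expected if s in attested]
    if attested_common != expected_common:
        problems.append("stages out of order")

    return problems
```

For example, with an expected lifecycle of `["data-prep", "training", "deployment"]`, an attested lineage of `["data-prep", "deployment"]` would be reported as missing the `training` stage.
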
Deliverables
- A pipeline-agnostic data format specification for minimum required metadata for model lifecycle provenance.
- An OMS-compliant API for collecting customizable pipeline metadata in a standardized attestation format, including optional vendor-agnostic TEE hardware enablement.
- A set of in-toto compliant templates for common end-to-end model lifecycle integrity validation policies.
- An OMS-compliant API for consuming and validating end-to-end lifecycle attestations based on a given policy.
- Talk at 2026 OSS conference (e.g., Open Source Summit, Open Source SecurityCon)
- Stretch: Prototype implementations of end-to-end ML lifecycle provenance APIs for common pipeline frameworks (e.g., KubeFlow)
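
For readers unfamiliar with the attestation formats referenced above, the sketch below builds an in-toto Statement (v1) that wraps a SLSA Provenance v1 predicate for a model artifact. It is only an assumption about what an attestation for one pipeline run might look like, not the data format this SIG will specify; the helper name `make_statement` and all example URIs and parameters are hypothetical, and only a subset of the predicate fields is shown.

```python
import hashlib
import json
from typing import Any, Dict


def make_statement(model_name: str, model_bytes: bytes, build_type: str,
                   builder_id: str,
                   external_parameters: Dict[str, Any]) -> Dict[str, Any]:
    """Build an unsigned in-toto Statement carrying SLSA provenance."""
    digest = hashlib.sha256(model_bytes).hexdigest()
    return {
        "_type": "https://in-toto.io/Statement/v1",
        # The artifact this attestation is about, identified by digest.
        "subject": [{"name": model_name, "digest": {"sha256": digest}}],
        "predicateType": "https://slsa.dev/provenance/v1",
        "predicate": {
            "buildDefinition": {
                "buildType": build_type,  # hypothetical URI
                "externalParameters": external_parameters,
            },
            "runDetails": {"builder": {"id": builder_id}},  # hypothetical URI
        },
    }


statement = make_statement(
    "model.bin",
    b"example model weights",
    "https://example.com/hypothetical-training-pipeline/v1",
    "https://example.com/hypothetical-builder",
    {"epochs": 3},
)
print(json.dumps(statement, indent=2))
```

The statement is unsigned here; in practice it would be wrapped in a signed envelope before being stored or shared.
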
Success Metrics
- Library and tools enable full support of SLSA Build track v1.0 and support for at least two additional levels of incremental pipeline provenance collection
- Library and tools support GPU-based model artifact hashing
- Stretch goal: Libraries provide vendor-agnostic support for at least 2 TEE hardware configurations
- Stretch goal: Prototypes available for 2 common pipeline frameworks
2026 Roadmap
| Quarter | Milestone |
|---|---|
| Q1 2026 | Release v0.1 API specification of E2E model lifecycle attestation and validation, including provenance metadata format specification, and key use cases |
| Q2 2026 | Release v1.0 library and tools for E2E lifecycle provenance, expanding upon Atlas CLI v0.2, with support for: v0.1 API spec; all levels of the SLSA Build track v1.0; optional pipeline run-time logging; attestation storage in Rekor; provenance validation based on select in-toto compliant policies. Release v0.1 prototype implementation for KubeFlow pipeline integration |
| Q3 2026 | Deliver talk at industry conference; Release v1.0 API specification of E2E model lifecycle attestation and validation for configurable provenance |
| Q4 2026 | Release v1.1 library and tools for E2E lifecycle provenance with support for: v1.0 API spec; optional finer-grained pipeline software stack attestation; integration with in-toto compliant policy engine ingesting custom policies; configurable, vendor-agnostic TEE hardware attestation validation. Release v0.2 prototype implementation for second pipeline framework (TBD) |
Future Directions
- Provenance attestation beyond pipeline operations (e.g., via SLSA Source track)
- Model lifecycle metadata collection of higher-order model attributes (e.g., dataset provenance, model cards)
List SIG Lead(s)
The SIG must have a minimum of 1 Lead
- Marcela Melara, Intel, marcelamelara
- Marcin Spoczynski, Intel, sandlbn
List of interested individuals
The SIG must have a minimum of 3 members with 2 different organizational affiliations.
- Mihai Maruseac, Google, mihaimaruseac
- Abdullah Garcia, JP Morgan, abdullahgarcia
- TBD
Governing Body
SIGs may report to an existing OpenSSF Working Group or directly to the TAC as their governing body. The SIG commits to providing the governing body quarterly updates on progress.
- AI/ML Security WG
SIG References
| Reference | URL |
|---|---|
| Repo | https://github.com/IntelLabs/atlas-cli |
| Security.md | https://github.com/IntelLabs/atlas-cli/blob/main/SECURITY.md |
| code-of-conduct.md | https://github.com/ossf/ai-ml-security/blob/main/code-of-conduct.md |
| Demos | Planned for Q2/Q3 of 2026 |
| Paper | https://arxiv.org/pdf/2502.19567 |
| Open Source Summit NA '25 Talk | https://www.youtube.com/watch?v=FNpkbOOghe4 |