Merged
1 change: 1 addition & 0 deletions CHANGES/984.feature
@@ -0,0 +1 @@
Added attestations field to package upload that will create a PEP 740 Provenance object for that content.
1 change: 1 addition & 0 deletions MANIFEST.in
@@ -9,4 +9,5 @@ include functest_requirements.txt
include test_requirements.txt
include unittest_requirements.txt
include pulp_python/app/webserver_snippets/*
include pulp_python/tests/functional/assets/*
exclude releasing.md
91 changes: 91 additions & 0 deletions docs/user/guides/attestation.md
@@ -0,0 +1,91 @@
# Attestation Hosting (PEP 740)

Pulp Python supports uploading attestations as specified in [PEP 740](https://peps.python.org/pep-0740/).
Pulp stores attestations as Provenance content, which can be added to, synced into, and removed from
Python repositories. Provenance objects are advertised through the Simple API and served by an
[Integrity API that matches PyPI's implementation](https://docs.pypi.org/api/integrity/).

## Uploading Attestations

Attestations can be uploaded together with their package as a JSON list in the `attestations` field.

```bash
# Wrap a single attestation file in a JSON list
att=$(jq '[.]' twine-6.2.0.tar.gz.publish.attestation)
# To combine multiple attestation files, use --slurp instead: jq --slurp '.' att1 att2 ...
http POST $PULP_API/pulp/api/v3/content/python/packages/ \
  repository="$PYTHON_REPO_HREF" \
  relative_path=twine-6.2.0.tar.gz \
  artifact=$PACKAGE_ARTIFACT_PRN \
  attestations:="$att"
```
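
If `jq` is not available, the same JSON list can be assembled in Python. This is a minimal sketch using only the standard library; the function names `combine_attestations` and `load_attestation_files` are illustrative and not part of Pulp or `pypi-attestations`.

```python
import json
from pathlib import Path


def combine_attestations(documents):
    """Parse each attestation JSON document and return them as one JSON list string."""
    return json.dumps([json.loads(doc) for doc in documents])


def load_attestation_files(paths):
    """Read attestation files from disk and combine them into one JSON list string."""
    return combine_attestations(Path(p).read_text() for p in paths)
```

The resulting string can be passed as the value of the `attestations` field, in the same way as `$att` above.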

The uploaded attestations are stored in a newly created Provenance object, which the task report
lists alongside the package content.

```json
// Task output abbreviated
{
  "pulp_href": "/pulp/api/v3/tasks/019af033-c8e8-7a02-a583-0fac5e39e54b/",
  "state": "completed",
  "name": "pulpcore.app.tasks.base.general_create",
  "created_resources": [
    "/pulp/api/v3/content/python/provenance/019aeb59-34bb-7ae4-ab95-4f8a62199be9/",
    "/pulp/api/v3/content/python/packages/019aeb59-34b1-7c73-a746-aea2cc3fbd85/"
  ],
  "result": {
    "prn": "prn:python.pythonpackagecontent:019aeb59-34b1-7c73-a746-aea2cc3fbd85",
    "name": "twine",
    "sha256": "418ebf08ccda9a8caaebe414433b0ba5e25eb5e4a927667122fbe8f829f985d8",
    "version": "6.2.0",
    "artifact": "/pulp/api/v3/artifacts/019aeb59-33c3-7877-9787-22c34eb6c15b/",
    "filename": "twine-6.2.0.tar.gz",
    "pulp_href": "/pulp/api/v3/content/python/packages/019aeb59-34b1-7c73-a746-aea2cc3fbd85/",
    // PRN of newly created Provenance object
    "provenance": "prn:python.packageprovenance:019aeb59-34bb-7ae4-ab95-4f8a62199be9"
  }
}
```

You can also use twine to upload your packages. Twine picks up attestations from files ending in
`.attestation` and attaches them to the matching distribution file during the upload. Pulp then adds
the new package and provenance object to the repository backing the distribution.

```bash
pulp python distribution create --name foo --base-path foo --repository foo
pypi-attestations sign dist/twine-6.2.0.tar.gz dist/twine-6.2.0-py3-none-any.whl
twine upload --repository-url $PULP_API/pypi/foo/simple/ --attestations dist/*
```

## Interacting with Provenance Content

Provenance content can be directly uploaded to Pulp through its content endpoint.

```bash
http POST $PULP_API/pulp/api/v3/content/python/provenance/ --form \
  file@twine.provenance \
  package="$PACKAGE_PRN" \
  repository="$REPO_PRN"
```

Provenance objects are artifactless content: their data is stored in a JSON field, and they are
unique by their sha256 digest. Within a repository, a provenance object is unique by its associated
package, i.e. a package can have only one provenance in the repository at a time. Provenance objects
can't be modified after upload because content is immutable, but a new one can be uploaded to replace
the existing one. Since provenance objects are content, they can be added to, removed from, and synced
into repositories. To sync provenance objects from an upstream repository, set the `provenance` field
on the remote.

```bash
http PATCH $PULP_API/$FOO_REMOTE_HREF provenance=true
pulp python repository sync --repository foo --remote foo
```
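
The one-provenance-per-package rule described above can be pictured as a mapping keyed by package. The sketch below is only a mental model in plain Python, not Pulp's actual repository code; `add_provenance` and its arguments are hypothetical names.

```python
def add_provenance(repo_provenances, package_prn, provenance_sha256):
    """Return (updated mapping, replaced digest or None) after adding a provenance.

    repo_provenances maps a package PRN to the sha256 digest of its provenance.
    The input mapping is left untouched, mirroring immutable repository versions.
    """
    updated = dict(repo_provenances)
    replaced = updated.get(package_prn)
    updated[package_prn] = provenance_sha256
    return updated, replaced
```

Adding a second provenance for the same package replaces the first one in the new mapping, while the older mapping still points at the original digest.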

## Downloading Provenance Objects

A package's provenance objects are advertised on its Simple page and can be downloaded from the
Integrity API. The attestations can then be verified using tools like `sigstore` or `pypi-attestations`.

```bash
http $PULP_API/pypi/foo/simple/twine/ "Accept:application/vnd.pypi.simple.v1+json" | jq -r ".files[].provenance"

http $PULP_API/pypi/foo/integrity/twine/6.2.0/twine-6.2.0.tar.gz/
```
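
The `jq` filter above can also be expressed in Python once the Simple JSON response has been parsed. This is a small sketch, assuming each entry in `files` carries a `filename` and an optional `provenance` URL as described in PEP 740; the `sample_page` payload and its host name are made up for illustration.

```python
def provenance_urls(simple_page):
    """Map each filename to its provenance URL, skipping files that have none."""
    return {
        entry["filename"]: entry["provenance"]
        for entry in simple_page.get("files", [])
        if entry.get("provenance")
    }


# Hypothetical, abbreviated Simple API response for illustration only
sample_page = {
    "name": "twine",
    "files": [
        {
            "filename": "twine-6.2.0.tar.gz",
            "provenance": "https://pulp.example/pypi/foo/integrity/twine/6.2.0/twine-6.2.0.tar.gz/",
        },
        {"filename": "twine-6.2.0-py3-none-any.whl", "provenance": None},
    ],
}
```

Calling `provenance_urls(sample_page)` yields only the sdist entry, because the wheel in this sample has no provenance.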
1 change: 1 addition & 0 deletions docs/user/learn/tech-preview.md
@@ -10,3 +10,4 @@ The following features are currently being released as part of a tech preview
 - Create pull-through caches of remote sources.
 - Pulp Domain Support
 - RBAC support
+- PEP 740 attestations upload and provenance syncing/serving.
2 changes: 1 addition & 1 deletion pulp_python/app/models.py
@@ -21,8 +21,8 @@
 )
 from pulpcore.plugin.responses import ArtifactResponse

-from pypi_attestations import Provenance
 from pathlib import PurePath
+from .provenance import Provenance
 from .utils import (
     artifact_to_python_content_data,
     canonicalize_name,
71 changes: 71 additions & 0 deletions pulp_python/app/provenance.py
@@ -0,0 +1,71 @@
from typing import Annotated, Literal, Union, get_args

from pydantic import BaseModel, ConfigDict, Field
from pydantic.alias_generators import to_snake
from pypi_attestations import (
    Attestation,
    Distribution,
    Publisher,
)


class _PermissivePolicy:
    """A permissive verification policy that always succeeds."""

    def verify(self, cert):
        """Succeed regardless of the publisher's identity."""
        pass


class AnyPublisher(BaseModel):
    """A fallback publisher for any kind not matching other publisher types."""

    model_config = ConfigDict(alias_generator=to_snake, extra="allow")

    kind: str

    def _as_policy(self):
        """Return a permissive policy that always succeeds."""
        return _PermissivePolicy()


# Get the underlying Union type of the original Publisher
# Publisher is Annotated[Union[...], Field(discriminator="kind")]
_OriginalPublisherTypes = get_args(Publisher.__origin__)
# Add AnyPublisher to the list of original publisher types
_ExtendedPublisherTypes = (*_OriginalPublisherTypes, AnyPublisher)
_ExtendedPublisherUnion = Union[_ExtendedPublisherTypes]
# Create a new type that falls back to AnyPublisher
ExtendedPublisher = Annotated[_ExtendedPublisherUnion, Field(union_mode="left_to_right")]


class AttestationBundle(BaseModel):
    """
    AttestationBundle object as defined in PEP 740.

    PyPI only accepts attestations from Trusted Publishers (GitHub, GitLab, Google), but we will
    accept them from any user.
    """

    publisher: ExtendedPublisher
    attestations: list[Attestation]


class Provenance(BaseModel):
    """Provenance object as defined in PEP 740."""

    version: Literal[1] = 1
    attestation_bundles: list[AttestationBundle]


def verify_provenance(filename, sha256, provenance, offline=False):
    """Verify the provenance object is valid for the package."""
    dist = Distribution(name=filename, digest=sha256)
    for bundle in provenance.attestation_bundles:
        publisher = bundle.publisher
        policy = publisher._as_policy()
        for attestation in bundle.attestations:
            sig_bundle = attestation.to_bundle()
            checkpoint = sig_bundle.log_entry._inner.inclusion_proof.checkpoint
            # Attestations signed against the Sigstore staging instance need staging verification
            staging = "sigstage.dev" in checkpoint.envelope
            attestation.verify(policy, dist, staging=staging, offline=offline)
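
The `ExtendedPublisher` union in `provenance.py` relies on its members being tried from left to right, so an unrecognised `kind` falls through to `AnyPublisher`. The following is a plain-Python sketch of that first-match-wins idea only. It does not use pydantic or `pypi_attestations`, and `GitHubPublisher` here is a simplified stand-in rather than the real model.

```python
from dataclasses import dataclass, field


@dataclass
class GitHubPublisher:
    """Simplified stand-in for a known publisher type (an assumption for this sketch)."""

    repository: str
    kind: str = "GitHub"

    @classmethod
    def parse(cls, data):
        if data.get("kind") != "GitHub" or "repository" not in data:
            raise ValueError("not a GitHub publisher")
        return cls(repository=data["repository"])


@dataclass
class AnyPublisher:
    """Fallback that accepts any string kind and keeps the remaining fields."""

    kind: str
    extra: dict = field(default_factory=dict)

    @classmethod
    def parse(cls, data):
        if not isinstance(data.get("kind"), str):
            raise ValueError("kind is required")
        extra = {k: v for k, v in data.items() if k != "kind"}
        return cls(kind=data["kind"], extra=extra)


def parse_publisher(data, candidates=(GitHubPublisher, AnyPublisher)):
    """Try each candidate in order and return the first one that accepts the data."""
    for candidate in candidates:
        try:
            return candidate.parse(data)
        except ValueError:
            continue
    raise ValueError("no publisher type accepted the data")
```

Because the fallback is listed last, known publishers keep their specific type and only unknown kinds end up as `AnyPublisher`.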
15 changes: 15 additions & 0 deletions pulp_python/app/pypi/serializers.py
@@ -2,6 +2,8 @@
 from gettext import gettext as _

 from rest_framework import serializers
+from pydantic import TypeAdapter, ValidationError
+from pulp_python.app.provenance import Attestation
 from pulp_python.app.utils import DIST_EXTENSIONS, SUPPORTED_METADATA_VERSIONS
 from pulpcore.plugin.models import Artifact
 from pulpcore.plugin.util import get_domain
@@ -70,6 +72,11 @@ class PackageUploadSerializer(serializers.Serializer):
         required=False,
         choices=SUPPORTED_METADATA_VERSIONS,
     )
+    attestations = serializers.JSONField(
+        required=False,
+        help_text=_("A JSON list containing attestations for the package."),
+        write_only=True,
+    )

     def validate(self, data):
         """Validates the request."""
@@ -98,6 +105,14 @@ def validate(self, data):
                 }
             )

+        if attestations := data.get("attestations"):
+            try:
+                attestations = TypeAdapter(list[Attestation]).validate_python(attestations)
+            except ValidationError as e:
+                raise serializers.ValidationError(
+                    {"attestations": _("The uploaded attestations are not valid: {}").format(e)}
+                )
+
         sha256 = data.get("sha256_digest")
         digests = {"sha256": sha256} if sha256 else None
         artifact = Artifact.init_and_validate(file, expected_digests=digests)
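
The serializer delegates the real validation to the `Attestation` model from `pypi_attestations`. As a rough illustration of what "a JSON list of attestations" means structurally, here is a standard-library pre-check. It is a sketch only: `check_attestations_shape` is a hypothetical helper, and it looks solely at the top-level keys of a PEP 740 attestation object without verifying any signature.

```python
import json

# Top-level keys of an attestation object as described in PEP 740
REQUIRED_KEYS = {"version", "verification_material", "envelope"}


def check_attestations_shape(raw):
    """Return the parsed list if it looks like a list of attestation objects."""
    data = json.loads(raw) if isinstance(raw, str) else raw
    if not isinstance(data, list):
        raise ValueError("attestations must be a JSON list")
    for index, item in enumerate(data):
        if not isinstance(item, dict):
            raise ValueError(f"attestation {index} is not an object")
        missing = REQUIRED_KEYS - item.keys()
        if missing:
            raise ValueError(f"attestation {index} is missing {sorted(missing)}")
    return data
```

A check like this can catch a client that sends a single object instead of a list before the request reaches Pulp.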
26 changes: 18 additions & 8 deletions pulp_python/app/pypi/views.py
@@ -181,53 +181,63 @@ def upload(self, request, path):
         serializer = PackageUploadSerializer(data=request.data)
         serializer.is_valid(raise_exception=True)
         artifact, filename = serializer.validated_data["content"]
+        attestations = serializer.validated_data.get("attestations", None)
         repo_content = self.get_content(self.get_repository_version(self.distribution))
         if repo_content.filter(filename=filename).exists():
             return HttpResponseBadRequest(reason=f"Package {filename} already exists in index")

         if settings.PYTHON_GROUP_UPLOADS:
-            return self.upload_package_group(repo, artifact, filename, request.session)
+            return self.upload_package_group(
+                repo, artifact, filename, attestations, request.session
+            )

         result = dispatch(
             tasks.upload,
             exclusive_resources=[artifact, repo],
             kwargs={
                 "artifact_sha256": artifact.sha256,
                 "filename": filename,
+                "attestations": attestations,
                 "repository_pk": str(repo.pk),
             },
         )
         return OperationPostponedResponse(result, request)

-    def upload_package_group(self, repo, artifact, filename, session):
+    def upload_package_group(self, repo, artifact, filename, attestations, session):
         """Steps 4 & 5, spawns tasks to add packages to index."""
         start_time = datetime.now(tz=timezone.utc) + timedelta(seconds=5)
         task = "updated"
         if not session.get("start"):
-            task = self.create_group_upload_task(session, repo, artifact, filename, start_time)
+            task = self.create_group_upload_task(
+                session, repo, artifact, filename, attestations, start_time
+            )
         else:
             sq = Session.objects.select_for_update(nowait=True).filter(pk=session.session_key)
             try:
                 with transaction.atomic():
                     sq.first()
                     current_start = datetime.fromisoformat(session["start"])
                     if current_start >= datetime.now(tz=timezone.utc):
-                        session["artifacts"].append((str(artifact.sha256), filename))
+                        session["artifacts"].append((str(artifact.sha256), filename, attestations))
                         session["start"] = str(start_time)
                         session.modified = False
                         session.save()
                     else:
                         raise DatabaseError
             except DatabaseError:
                 session.cycle_key()
-                task = self.create_group_upload_task(session, repo, artifact, filename, start_time)
+                task = self.create_group_upload_task(
+                    session, repo, artifact, filename, attestations, start_time
+                )
         data = {"session": session.session_key, "task": task, "task_start_time": start_time}
         return Response(data=data)

-    def create_group_upload_task(self, cur_session, repository, artifact, filename, start_time):
+    def create_group_upload_task(
+        self, cur_session, repository, artifact, filename, attestations, start_time
+    ):
         """Creates the actual task that adds the packages to the index."""
         cur_session["start"] = str(start_time)
-        cur_session["artifacts"] = [(str(artifact.sha256), filename)]
+        cur_session["artifacts"] = [(str(artifact.sha256), filename, attestations)]
         cur_session.modified = False
         cur_session.save()
         task = dispatch(
@@ -536,7 +546,7 @@ def retrieve(self, request, path, package, version, filename):
             name__normalize=package, version=version, filename=filename
         ).first()
         if package_content:
-            provenance = PackageProvenance.objects.filter(package=package_content).first()
+            provenance = self.get_provenances(repo_ver).filter(package=package_content).first()
             if provenance:
                 return Response(data=provenance.provenance)
         return HttpResponseNotFound(f"{package} {version} {filename} provenance does not exist.")