Commit f77550f

Add attestation upload support
fixes: #984
1 parent 4753a42 commit f77550f

13 files changed (+485, -32 lines)

CHANGES/984.feature

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+Added attestations field to package upload that will create a PEP 740 Provenance object for that content.

MANIFEST.in

Lines changed: 1 addition & 0 deletions
@@ -9,4 +9,5 @@ include functest_requirements.txt
 include test_requirements.txt
 include unittest_requirements.txt
 include pulp_python/app/webserver_snippets/*
+include pulp_python/tests/functional/assets/*
 exclude releasing.md

docs/user/guides/attestation.md

Lines changed: 91 additions & 0 deletions
@@ -0,0 +1,91 @@
+# Attestation Hosting (PEP 740)
+
+Pulp Python supports uploading attestations as originally specified in [PEP 740](https://peps.python.org/pep-0740/).
+Attestations are stored in Pulp as Provenance Content that can be added to, synced into, or removed from
+Python repositories. The provenance objects are listed through the Simple API and served by the
+[Integrity API matching PyPI's implementation](https://docs.pypi.org/api/integrity/).
+
+## Uploading Attestations
+
+Attestations can be uploaded to Pulp together with their package as a JSON list under the `attestations` field.
+
+```bash
+att=$(jq '[.]' twine-6.2.0.tar.gz.publish.attestation)
+# multiple attestation files can be combined with --slurp: jq --slurp '.' att1 att2 ...
+http POST $PULP_API/pulp/api/v3/content/python/packages/ \
+  repository="$PYTHON_REPO_HREF" \
+  relative_path=twine-6.2.0.tar.gz \
+  artifact=$PACKAGE_ARTIFACT_PRN \
+  attestations:="$att"
+```
+
+The uploaded attestations can be found in the created Provenance object attached to the content in
+the task report.
+
+```json
+// Task output abbreviated
+{
+  "pulp_href": "/pulp/api/v3/tasks/019af033-c8e8-7a02-a583-0fac5e39e54b/",
+  "state": "completed",
+  "name": "pulpcore.app.tasks.base.general_create",
+  "created_resources": [
+    "/pulp/api/v3/content/python/provenance/019aeb59-34bb-7ae4-ab95-4f8a62199be9/",
+    "/pulp/api/v3/content/python/packages/019aeb59-34b1-7c73-a746-aea2cc3fbd85/"
+  ],
+  "result": {
+    "prn": "prn:python.pythonpackagecontent:019aeb59-34b1-7c73-a746-aea2cc3fbd85",
+    "name": "twine",
+    "sha256": "418ebf08ccda9a8caaebe414433b0ba5e25eb5e4a927667122fbe8f829f985d8",
+    "version": "6.2.0",
+    "artifact": "/pulp/api/v3/artifacts/019aeb59-33c3-7877-9787-22c34eb6c15b/",
+    "filename": "twine-6.2.0.tar.gz",
+    "pulp_href": "/pulp/api/v3/content/python/packages/019aeb59-34b1-7c73-a746-aea2cc3fbd85/",
+    // PRN of newly created Provenance object
+    "provenance": "prn:python.packageprovenance:019aeb59-34bb-7ae4-ab95-4f8a62199be9"
+  }
+}
+```
+
+You can also use twine to upload your packages. Twine will find the attestations in files ending with
+`.attestation` and attach them to the matching distribution file during the upload. Pulp will then add
+the new package and provenance object to the backing repository of the distribution.
+
+```bash
+pulp python distribution create --name foo --base-path foo --repository foo
+pypi-attestations sign dist/twine-6.2.0.tar.gz dist/twine-6.2.0-py3-none-any.whl
+twine upload --repository-url $PULP_API/pypi/foo/simple/ --attestations dist/*
+```
+
+## Interacting with Provenance Content
+
+Provenance content can be directly uploaded to Pulp through its content endpoint.
+
+```bash
+http POST $PULP_API/pulp/api/v3/content/python/provenance/ --form \
+  file@twine.provenance \
+  package="$PACKAGE_PRN" \
+  repository="$REPO_PRN"
+```
+
+Provenance objects are artifactless content; their data is stored in a JSON field and they are unique by
+their sha256 digest. In a repository, a provenance object is unique by its associated package, i.e.
+a package can only have one provenance in the repository at a time. Provenance objects can't be
+modified after upload as content is immutable, but a new one can be uploaded to replace the existing
+one. Since provenance objects are content, they can be added, removed, and synced into repositories.
+To sync provenance objects from an upstream repository, set the `provenance` field on the remote.
+
+```bash
+http PATCH $PULP_API/$FOO_REMOTE_HREF provenance=true
+pulp python repository sync --repository foo --remote foo
+```
+
+## Downloading Provenance Objects
+
+A package's provenance objects are exposed through its Simple page and downloaded from the Integrity
+API. The attestations can then be verified using tools like `sigstore` or `pypi-attestations`.
+
+```bash
+http $PULP_API/pypi/foo/simple/twine/ "Accept:application/vnd.pypi.simple.v1+json" | jq -r ".files[].provenance"
+
+http $PULP_API/pypi/foo/integrity/twine/6.2.0/twine-6.2.0.tar.gz/
+```
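
The `jq --slurp` step described in the guide can also be done in Python. The following is a minimal sketch, assuming each `.attestation` file holds exactly one JSON attestation object; the helper name `load_attestations` is hypothetical and is not part of Pulp, twine, or `pypi-attestations`.

```python
import json
from pathlib import Path


def load_attestations(paths):
    """Combine attestation files into one JSON list string.

    Assumption: every file contains a single JSON object (one attestation).
    The returned string is what the guide passes as the `attestations` field.
    """
    attestations = []
    for path in paths:
        with Path(path).open(encoding="utf-8") as handle:
            attestations.append(json.load(handle))
    return json.dumps(attestations)
```

The returned string could then be sent as the `attestations` field of the upload request, in place of the `$att` shell variable.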

docs/user/learn/tech-preview.md

Lines changed: 1 addition & 0 deletions
@@ -10,3 +10,4 @@ The following features are currently being released as part of a tech preview
 - Create pull-through caches of remote sources.
 - Pulp Domain Support
 - RBAC support
+- PEP 740 attestations upload and provenance syncing/serving.

pulp_python/app/models.py

Lines changed: 1 addition & 1 deletion
@@ -21,8 +21,8 @@
 )
 from pulpcore.plugin.responses import ArtifactResponse
 
-from pypi_attestations import Provenance
 from pathlib import PurePath
+from .provenance import Provenance
 from .utils import (
     artifact_to_python_content_data,
     canonicalize_name,

pulp_python/app/provenance.py

Lines changed: 71 additions & 0 deletions
@@ -0,0 +1,71 @@
+from typing import Annotated, Literal, Union, get_args
+
+from pydantic import BaseModel, ConfigDict, Field
+from pydantic.alias_generators import to_snake
+from pypi_attestations import (
+    Attestation,
+    Distribution,
+    Publisher,
+)
+
+
+class _PermissivePolicy:
+    """A permissive verification policy that always succeeds."""
+
+    def verify(self, cert):
+        """Succeed regardless of the publisher's identity."""
+        pass
+
+
+class AnyPublisher(BaseModel):
+    """A fallback publisher for any kind not matching other publisher types."""
+
+    model_config = ConfigDict(alias_generator=to_snake, extra="allow")
+
+    kind: str
+
+    def _as_policy(self):
+        """Return a permissive policy that always succeeds."""
+        return _PermissivePolicy()
+
+
+# Get the underlying Union type of the original Publisher
+# Publisher is Annotated[Union[...], Field(discriminator="kind")]
+_OriginalPublisherTypes = get_args(Publisher.__origin__)
+# Add AnyPublisher to the list of original publisher types
+_ExtendedPublisherTypes = (*_OriginalPublisherTypes, AnyPublisher)
+_ExtendedPublisherUnion = Union[_ExtendedPublisherTypes]
+# Create a new type that falls back to AnyPublisher
+ExtendedPublisher = Annotated[_ExtendedPublisherUnion, Field(union_mode="left_to_right")]
+
+
+class AttestationBundle(BaseModel):
+    """
+    AttestationBundle object as defined in PEP 740.
+
+    PyPI only accepts attestations from Trusted Publishers (GitHub, GitLab, Google), but we will
+    accept from any user.
+    """
+
+    publisher: ExtendedPublisher
+    attestations: list[Attestation]
+
+
+class Provenance(BaseModel):
+    """Provenance object as defined in PEP 740."""
+
+    version: Literal[1] = 1
+    attestation_bundles: list[AttestationBundle]
+
+
+def verify_provenance(filename, sha256, provenance, offline=False):
+    """Verify the provenance object is valid for the package."""
+    dist = Distribution(name=filename, digest=sha256)
+    for bundle in provenance.attestation_bundles:
+        publisher = bundle.publisher
+        policy = publisher._as_policy()
+        for attestation in bundle.attestations:
+            sig_bundle = attestation.to_bundle()
+            checkpoint = sig_bundle.log_entry._inner.inclusion_proof.checkpoint
+            staging = "sigstage.dev" in checkpoint.envelope
+            attestation.verify(policy, dist, staging=staging, offline=offline)
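
The `ExtendedPublisher` type relies on trying the known publisher types in order and only then falling back to `AnyPublisher`. The sketch below imitates that left-to-right matching with the standard library only, so it can be read without pydantic installed. It is not the module's implementation: `GitHubPublisher` here is a hypothetical stand-in, and `parse_publisher` and `PUBLISHER_TYPES` are illustrative names.

```python
from dataclasses import dataclass, field


@dataclass
class GitHubPublisher:
    """Hypothetical stand-in for one of the known publisher types."""

    repository: str
    workflow: str

    @classmethod
    def parse(cls, data):
        if data.get("kind") != "GitHub":
            raise ValueError("not a GitHub publisher")
        return cls(repository=data["repository"], workflow=data["workflow"])


@dataclass
class AnyPublisher:
    """Catch-all: accepts any string `kind` and keeps the other keys as extras."""

    kind: str
    extra: dict = field(default_factory=dict)

    @classmethod
    def parse(cls, data):
        if not isinstance(data.get("kind"), str):
            raise ValueError("publisher needs a string 'kind'")
        extra = {key: value for key, value in data.items() if key != "kind"}
        return cls(kind=data["kind"], extra=extra)


# Order matters: specific types first, the catch-all last.
PUBLISHER_TYPES = (GitHubPublisher, AnyPublisher)


def parse_publisher(data):
    """Return the first publisher type that accepts `data`, trying left to right."""
    for publisher_type in PUBLISHER_TYPES:
        try:
            return publisher_type.parse(data)
        except (KeyError, ValueError):
            continue
    raise ValueError("no publisher type matched")
```

Because the catch-all sits last, a publisher with an unknown `kind` is still accepted, which is the behavior the `AttestationBundle` docstring describes.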

pulp_python/app/pypi/serializers.py

Lines changed: 15 additions & 0 deletions
@@ -2,6 +2,8 @@
 from gettext import gettext as _
 
 from rest_framework import serializers
+from pydantic import TypeAdapter, ValidationError
+from pulp_python.app.provenance import Attestation
 from pulp_python.app.utils import DIST_EXTENSIONS, SUPPORTED_METADATA_VERSIONS
 from pulpcore.plugin.models import Artifact
 from pulpcore.plugin.util import get_domain
@@ -70,6 +72,11 @@ class PackageUploadSerializer(serializers.Serializer):
         required=False,
         choices=SUPPORTED_METADATA_VERSIONS,
     )
+    attestations = serializers.JSONField(
+        required=False,
+        help_text=_("A JSON list containing attestations for the package."),
+        write_only=True,
+    )
 
     def validate(self, data):
         """Validates the request."""
@@ -98,6 +105,14 @@ def validate(self, data):
                 }
             )
 
+        if attestations := data.get("attestations"):
+            try:
+                attestations = TypeAdapter(list[Attestation]).validate_python(attestations)
+            except ValidationError as e:
+                raise serializers.ValidationError(
+                    {"attestations": _("The uploaded attestations are not valid: {}".format(e))}
+                )
+
         sha256 = data.get("sha256_digest")
         digests = {"sha256": sha256} if sha256 else None
         artifact = Artifact.init_and_validate(file, expected_digests=digests)
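
The serializer rejects any `attestations` value that does not parse as a list of attestation objects. As a rough illustration of the kind of input that passes or fails, here is a standard-library sketch that only checks the three top-level keys PEP 740 defines for an attestation object (`version`, `verification_material`, `envelope`). It is far weaker than the pydantic validation in the diff, and `check_attestations` is a hypothetical helper, not part of Pulp.

```python
REQUIRED_KEYS = {"version", "verification_material", "envelope"}


def check_attestations(value):
    """Return a list of error strings; an empty list means the shape looks plausible.

    Assumption: only top-level structure is checked, no signatures or nested fields.
    """
    if not isinstance(value, list):
        return ["attestations must be a JSON list"]
    errors = []
    for index, item in enumerate(value):
        if not isinstance(item, dict):
            errors.append(f"item {index} is not an object")
            continue
        missing = REQUIRED_KEYS - item.keys()
        if missing:
            errors.append(f"item {index} is missing: {', '.join(sorted(missing))}")
    return errors
```

A check like this could be run client-side before uploading, so that obviously malformed input is caught before the request is sent.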

pulp_python/app/pypi/views.py

Lines changed: 18 additions & 8 deletions
@@ -181,53 +181,63 @@ def upload(self, request, path):
         serializer = PackageUploadSerializer(data=request.data)
         serializer.is_valid(raise_exception=True)
         artifact, filename = serializer.validated_data["content"]
+        attestations = serializer.validated_data.get("attestations", None)
         repo_content = self.get_content(self.get_repository_version(self.distribution))
         if repo_content.filter(filename=filename).exists():
             return HttpResponseBadRequest(reason=f"Package {filename} already exists in index")
 
         if settings.PYTHON_GROUP_UPLOADS:
-            return self.upload_package_group(repo, artifact, filename, request.session)
+            return self.upload_package_group(
+                repo, artifact, filename, attestations, request.session
+            )
 
         result = dispatch(
             tasks.upload,
             exclusive_resources=[artifact, repo],
             kwargs={
                 "artifact_sha256": artifact.sha256,
                 "filename": filename,
+                "attestations": attestations,
                 "repository_pk": str(repo.pk),
             },
         )
         return OperationPostponedResponse(result, request)
 
-    def upload_package_group(self, repo, artifact, filename, session):
+    def upload_package_group(self, repo, artifact, filename, attestations, session):
         """Steps 4 & 5, spawns tasks to add packages to index."""
         start_time = datetime.now(tz=timezone.utc) + timedelta(seconds=5)
         task = "updated"
         if not session.get("start"):
-            task = self.create_group_upload_task(session, repo, artifact, filename, start_time)
+            task = self.create_group_upload_task(
+                session, repo, artifact, filename, attestations, start_time
+            )
         else:
             sq = Session.objects.select_for_update(nowait=True).filter(pk=session.session_key)
             try:
                 with transaction.atomic():
                     sq.first()
                     current_start = datetime.fromisoformat(session["start"])
                     if current_start >= datetime.now(tz=timezone.utc):
-                        session["artifacts"].append((str(artifact.sha256), filename))
+                        session["artifacts"].append((str(artifact.sha256), filename, attestations))
                         session["start"] = str(start_time)
                         session.modified = False
                         session.save()
                     else:
                         raise DatabaseError
             except DatabaseError:
                 session.cycle_key()
-                task = self.create_group_upload_task(session, repo, artifact, filename, start_time)
+                task = self.create_group_upload_task(
+                    session, repo, artifact, filename, attestations, start_time
+                )
         data = {"session": session.session_key, "task": task, "task_start_time": start_time}
         return Response(data=data)
 
-    def create_group_upload_task(self, cur_session, repository, artifact, filename, start_time):
+    def create_group_upload_task(
+        self, cur_session, repository, artifact, filename, attestations, start_time
+    ):
         """Creates the actual task that adds the packages to the index."""
         cur_session["start"] = str(start_time)
-        cur_session["artifacts"] = [(str(artifact.sha256), filename)]
+        cur_session["artifacts"] = [(str(artifact.sha256), filename, attestations)]
         cur_session.modified = False
         cur_session.save()
         task = dispatch(
@@ -536,7 +546,7 @@ def retrieve(self, request, path, package, version, filename):
             name__normalize=package, version=version, filename=filename
         ).first()
         if package_content:
-            provenance = PackageProvenance.objects.filter(package=package_content).first()
+            provenance = self.get_provenances(repo_ver).filter(package=package_content).first()
             if provenance:
                 return Response(data=provenance.provenance)
         return HttpResponseNotFound(f"{package} {version} {filename} provenance does not exist.")
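
With grouped uploads, each session entry grows from a `(sha256, filename)` pair to a `(sha256, filename, attestations)` triple. The sketch below shows what such entries look like after a JSON round trip, assuming the session backend serializes with JSON (Django's default session serializer does). The function names are hypothetical, and how the real grouped-upload task unpacks the entries is not shown in this diff.

```python
import json


def roundtrip_session_artifacts(artifacts):
    """Simulate a JSON-backed session store: tuples come back as lists."""
    return json.loads(json.dumps(artifacts))


def iter_uploads(artifacts):
    """Yield (sha256, filename, attestations), treating a missing value as an empty list."""
    for sha256, filename, attestations in artifacts:
        yield sha256, filename, attestations or []
```

One consequence is that the stored attestations need to be plain JSON data (lists and dicts), and that an upload without attestations is stored as `None` in the third position.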
