Skip to content

Harden RepositoryVersion against memoization issues#7311

Merged
dralley merged 1 commit intopulp:mainfrom
dralley:check-versions
Feb 17, 2026
Merged

Harden RepositoryVersion against memoization issues#7311
dralley merged 1 commit intopulp:mainfrom
dralley:check-versions

Conversation

@dralley
Copy link
Contributor

@dralley dralley commented Feb 11, 2026

📜 Checklist

See: Pull Request Walkthrough

@dralley
Copy link
Contributor Author

dralley commented Feb 11, 2026

Working on getting some additional tests + making the output of the data repair command more reasonable. Also fixing failures.

Also adding tests for add / remove / contents >65535 packages re: #5375

Question: should I leave a copy of the old data repair command until 3.115?

@dralley dralley force-pushed the check-versions branch 10 times, most recently from 012e5cd to 36bd060 Compare February 16, 2026 03:35
@dralley dralley force-pushed the check-versions branch 3 times, most recently from cfb77c2 to 7e70dd7 Compare February 16, 2026 04:03
@dralley dralley force-pushed the check-versions branch 2 times, most recently from 1309c12 to 4741e69 Compare February 17, 2026 04:37
content_artifact = ContentArtifact.objects.create(content=content, relative_path="test/path")

# Attempting to add a queryset of ContentArtifact (not Content) should raise AssertionError
with pytest.raises(AssertionError):
Copy link
Member

@mdellweg mdellweg Feb 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type analysis would have caught that too, i think. But we don't have that. So 👍.

Comment on lines +533 to +538
with repository.new_version() as version2:
# Remove content that doesn't exist (no-op)
version2.remove_content(Content.objects.filter(pk=9999999))
# Add content that already exists (no-op)
version2.add_content(Content.objects.filter(pk=content_pks[0]))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we also have as similar test, where the single content is actually removed and then added again so still no new version is generated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, test_version_identical_to_previous

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, that tests "adding and then removing" the same content, so I will extend to also cover "removing and then adding" the same content

@dralley dralley force-pushed the check-versions branch 2 times, most recently from 325dee4 to e088712 Compare February 17, 2026 20:29
@dralley dralley marked this pull request as ready for review February 17, 2026 20:29
"""
with transaction.atomic():
# relatively inexpensive sanity check for memoization
assert len(self.content_ids) == self._content_relationships().count()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worthwhile?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So basically this is checking that at the end of squash(), that the 'cached' content_ids field of the repoversion, has the same number of elements as recomputing - so this would be like a last-ditch chance to catch coding-errors in "someone's" new-version-creation code, yeah?

Even if it's "just" a count() of the recompute-query - is this actually "inexpensive"? (thinking out loud here) It would only happen once at new-version-squash-time, so that may not be a big issue.

Copy link
Contributor Author

@dralley dralley Feb 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thinking here is that it's at least not any more expensive than the operations that already have to happen at the beginning of the repository version create (basically this same query) and at the end, including the annotates below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And probably much less so

closes: 7272

Assisted By: claude (tests only)
Copy link
Contributor

@ggainey ggainey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good - and thanks for the extensive test-suite additions!

@dralley dralley enabled auto-merge (rebase) February 17, 2026 22:00
@dralley dralley merged commit a6c1a8e into pulp:main Feb 17, 2026
13 checks passed
@dralley dralley deleted the check-versions branch February 17, 2026 22:18
@patchback
Copy link

patchback bot commented Feb 17, 2026

Backport to 3.73: 💔 cherry-picking failed — conflicts found

❌ Failed to cleanly apply a6c1a8e on top of patchback/backports/3.73/a6c1a8ec470ccbc18bbb0512dfa6aafa4429e6fa/pr-7311

Backporting merged PR #7311 into main

  1. Ensure you have a local repo clone of your fork. Unless you cloned it
    from the upstream, this would be your origin remote.
  2. Make sure you have an upstream repo added as a remote too. In these
    instructions you'll refer to it by the name upstream. If you don't
    have it, here's how you can add it:
    $ git remote add upstream https://github.com/pulp/pulpcore.git
  3. Ensure you have the latest copy of upstream and prepare a branch
    that will hold the backported code:
    $ git fetch upstream
    $ git checkout -b patchback/backports/3.73/a6c1a8ec470ccbc18bbb0512dfa6aafa4429e6fa/pr-7311 upstream/3.73
  4. Now, cherry-pick PR Harden RepositoryVersion against memoization issues #7311 contents into that branch:
    $ git cherry-pick -x a6c1a8ec470ccbc18bbb0512dfa6aafa4429e6fa
    If it'll yell at you with something like fatal: Commit a6c1a8ec470ccbc18bbb0512dfa6aafa4429e6fa is a merge but no -m option was given., add -m 1 as follows instead:
    $ git cherry-pick -m1 -x a6c1a8ec470ccbc18bbb0512dfa6aafa4429e6fa
  5. At this point, you'll probably encounter some merge conflicts. You must
    resolve them in to preserve the patch from PR Harden RepositoryVersion against memoization issues #7311 as close to the
    original as possible.
  6. Push this branch to your fork on GitHub:
    $ git push origin patchback/backports/3.73/a6c1a8ec470ccbc18bbb0512dfa6aafa4429e6fa/pr-7311
  7. Create a PR, ensure that the CI is green. If it's not — update it so that
    the tests and any other checks pass. This is it!
    Now relax and wait for the maintainers to process your pull request
    when they have some cycles to do reviews. Don't worry — they'll tell you if
    any improvements are necessary when the time comes!

🤖 @patchback
I'm built with octomachinery and
my source is open — https://github.com/sanitizers/patchback-github-app.

@patchback
Copy link

patchback bot commented Feb 17, 2026

Backport to 3.103: 💚 backport PR created

✅ Backport PR branch: patchback/backports/3.103/a6c1a8ec470ccbc18bbb0512dfa6aafa4429e6fa/pr-7311

Backported as #7328

🤖 @patchback
I'm built with octomachinery and
my source is open — https://github.com/sanitizers/patchback-github-app.

@patchback
Copy link

patchback bot commented Feb 17, 2026

Backport to 3.85: 💔 cherry-picking failed — conflicts found

❌ Failed to cleanly apply a6c1a8e on top of patchback/backports/3.85/a6c1a8ec470ccbc18bbb0512dfa6aafa4429e6fa/pr-7311

Backporting merged PR #7311 into main

  1. Ensure you have a local repo clone of your fork. Unless you cloned it
    from the upstream, this would be your origin remote.
  2. Make sure you have an upstream repo added as a remote too. In these
    instructions you'll refer to it by the name upstream. If you don't
    have it, here's how you can add it:
    $ git remote add upstream https://github.com/pulp/pulpcore.git
  3. Ensure you have the latest copy of upstream and prepare a branch
    that will hold the backported code:
    $ git fetch upstream
    $ git checkout -b patchback/backports/3.85/a6c1a8ec470ccbc18bbb0512dfa6aafa4429e6fa/pr-7311 upstream/3.85
  4. Now, cherry-pick PR Harden RepositoryVersion against memoization issues #7311 contents into that branch:
    $ git cherry-pick -x a6c1a8ec470ccbc18bbb0512dfa6aafa4429e6fa
    If it'll yell at you with something like fatal: Commit a6c1a8ec470ccbc18bbb0512dfa6aafa4429e6fa is a merge but no -m option was given., add -m 1 as follows instead:
    $ git cherry-pick -m1 -x a6c1a8ec470ccbc18bbb0512dfa6aafa4429e6fa
  5. At this point, you'll probably encounter some merge conflicts. You must
    resolve them in to preserve the patch from PR Harden RepositoryVersion against memoization issues #7311 as close to the
    original as possible.
  6. Push this branch to your fork on GitHub:
    $ git push origin patchback/backports/3.85/a6c1a8ec470ccbc18bbb0512dfa6aafa4429e6fa/pr-7311
  7. Create a PR, ensure that the CI is green. If it's not — update it so that
    the tests and any other checks pass. This is it!
    Now relax and wait for the maintainers to process your pull request
    when they have some cycles to do reviews. Don't worry — they'll tell you if
    any improvements are necessary when the time comes!

🤖 @patchback
I'm built with octomachinery and
my source is open — https://github.com/sanitizers/patchback-github-app.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants