@mvo5
Collaborator

@mvo5 mvo5 commented Dec 17, 2025

We recently had two issues where creating our buildroot was
broken in certain situations because the local rpm would
not be compatible with the requirements of the package
that got installed in the buildroot:
https://issues.redhat.com/browse/RHEL-128741
#413

The issue here is that the local rpm is something we do
not control but we need a way to "bootstrap" our buildroot
(once we have a buildroot we use that for everything else).

This commit enables the "bootstrap" container feature of
the images library by default to avoid this dependency.
This means that we "podman pull" a minimal container (e.g.
ubi for rhel) that contains python3 and rpm to then use it
to install our real buildroot.

Note that this enables it all the time even if the host
distribution and target distribution match. The reason
is rebuildability, i.e. the same manifest should always
produce the same result, and in the general case we do
not know if e.g. a manifest that was part of
`image-builder build --with-manifest` is used somewhere
else again.

For restricted environments where pulling a container is a
problem, or for situations where it's known that the host
rpm is fine, we provide `--without-bootstrap-container`
to disable this function.
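
The default-on behavior with an opt-out flag described above could look roughly like the sketch below. It is only an illustration: it uses the Go standard library `flag` package as a stand-in for whatever CLI framework the project actually uses, and the helper name `useBootstrapContainer` is hypothetical. Only the flag name `--without-bootstrap-container` comes from this PR.

```go
package main

import (
	"flag"
	"fmt"
)

// useBootstrapContainer is a hypothetical helper: it parses the given
// arguments and reports whether the bootstrap container should be used.
// The feature is on unless --without-bootstrap-container is passed.
func useBootstrapContainer(args []string) (bool, error) {
	fs := flag.NewFlagSet("image-builder", flag.ContinueOnError)
	without := fs.Bool("without-bootstrap-container", false,
		"do not pull a bootstrap container; rely on the host rpm instead")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	// Negative flag: absent means "bootstrap enabled".
	return !*without, nil
}

func main() {
	for _, args := range [][]string{{}, {"--without-bootstrap-container"}} {
		use, err := useBootstrapContainer(args)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("args=%v bootstrap=%v\n", args, use)
	}
}
```

Keeping the stored value positive ("use bootstrap") and only the flag negative avoids double negations further down in the code.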

mvo5 added 2 commits December 17, 2025 10:28
When we generate the osbuild manifest we need to take a bunch
of commandline options into account. These are collected and
passed via `manifestOptions` in the CLI. There is a (small)
overlap with options that are then "converted" to options that
need to be passed to `manifestgen` via `manifestgen.Options`.

Previously there was manual code for this but it's slightly
nicer to just embed the `manifestgen.Options` into the
more general `manifestOptions` of the CLI. This avoids some
boilerplate code and might serve as a useful pattern in other
places. So this commit does that now (even though the wins
are not that big).
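
As a rough illustration of the embedding pattern described in this commit message, here is a minimal, self-contained sketch. The types below only stand in for `manifestgen.Options` and the CLI's `manifestOptions`; every field name and the `generate` function are hypothetical and not taken from the actual code.

```go
package main

import "fmt"

// Options stands in for manifestgen.Options; the fields are hypothetical.
type Options struct {
	UseBootstrapContainer bool
	RpmDownloader         string
}

// manifestOptions stands in for the CLI-level options struct.
// Embedding Options promotes its fields, so no manual copy step is needed
// when the library-level options have to be handed on.
type manifestOptions struct {
	Options // embedded library-level options

	OutputDir string // hypothetical CLI-only field
}

// generate stands in for a library function that only needs Options.
func generate(opts *Options) string {
	return fmt.Sprintf("bootstrap=%v downloader=%s",
		opts.UseBootstrapContainer, opts.RpmDownloader)
}

func main() {
	mo := manifestOptions{OutputDir: "/tmp/out"}
	mo.UseBootstrapContainer = true // promoted field, set directly
	mo.RpmDownloader = "librepo"

	// Pass the embedded struct through without a conversion function.
	fmt.Println(generate(&mo.Options))
}
```

The trade-off is that the CLI struct now exposes every field of the embedded type, which is fine as long as the overlap is intentional.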
@mvo5 mvo5 requested a review from a team as a code owner December 17, 2025 10:36
@mvo5 mvo5 requested review from bcl, lzap and supakeen and removed request for a team December 17, 2025 10:36
Member

@supakeen supakeen left a comment

This cannot be enabled by default in locations where we do not have network access such as when running in Koji or during RPM builds. We need a plan for this so I'm requesting changes to block this PR until we have one :)

We will need to update the koji-image-builder plugin to pass the `--without-bootstrap-container` argument as we don't have network access there. This will then need to be coordinated with Fedora and CentOS release engineering to be deployed before we can publish a version of image-builder with this behavior.

This is a chicken-and-egg problem; we cannot unconditionally set this argument until it exists and I'd really prefer to not do any version inspection in koji-image-builder so I hope someone has a cool idea around this :)


As an aside, if we're doing this then our bootstrap containers must be under the control of the distribution we're building with them. Many of them currently are not, which leads to 3rd party containers with only indirect relationships being used. That irks me a bit.

We also open ourselves up to not having direct control over these bootstrap containers, so if anything we use from them disappears (we've previously had to request inclusion of certain packages in them), builds break. To me this is a more likely scenario, since it has happened a few times already, than the current case where RPM is broken in all distributions except RHEL.

While it is probably generally an OK idea to always bootstrap as it squashes a class of bugs, it also means we won't find certain bugs.

Overall our approach has worked OK over the years and this change is probably (?) mostly motivated by two (quite recent) incidents, both related to the same RHEL change (PQC) where I personally might not see a pattern yet.

@Conan-Kudo

Conan-Kudo commented Dec 18, 2025

The way we do it in kiwi is we use kiwi itself to create the bootstrap environment. Then use the bootstrap environment to create the "real" one. I'd recommend that for osbuild too.

@supakeen
Member

> The way we do it in kiwi is we use kiwi itself to create the bootstrap environment. Then use the bootstrap environment to create the "real" one. I'd recommend that for osbuild too.

The bootstrap here comes before the buildroot; it's a bit of a chicken-and-egg. This PR would (by default) pull down a container to get an environment in which to build the buildroot, so no host executables ever get touched by osbuild, as opposed to using the host's `rpm` and `cp` to set up the buildroot.

For Kiwi do you use any host executable to set up any part of the bootstrap/buildroot (excluding usage of boxbuild/stackbuild)? If not, how do you get around the egg?

@Conan-Kudo

Well, one big difference we have from osbuild is that we don't use rpm directly in the ordinary course of events. We use DNF, and we can choose to have repository content signature checking switched off, and as long as DNF correctly handles that state, we're good.

Your situation is complicated because you don't use DNF for both download and install stages. I still think you should retire the direct usage of rpm and instead use DNF to do it. DNF already turns off signature checking for "local" RPMs by default, I think. So this issue just wouldn't come up for you unless you went down the path of creating a local rpm repository and used that repository as an input (but that would be fixed by just having pkg_gpgcheck=0 for that repository too).

@supakeen
Member

supakeen commented Dec 19, 2025

> Well, one big difference we have from osbuild is that we don't use rpm directly in the ordinary course of events. We use DNF, and we can choose to have repository content signature checking switched off, and as long as DNF correctly handles that state, we're good.
>
> Your situation is complicated because you don't use DNF for both download and install stages. I still think you should retire the direct usage of rpm and instead use DNF to do it. DNF already turns off signature checking for "local" RPMs by default, I think. So this issue just wouldn't come up for you unless you went down the path of creating a local rpm repository and used that repository as an input (but that would be fixed by just having pkg_gpgcheck=0 for that repository too).

I'm (personally) mostly waiting for the entire transaction serialization/deserialization in DNF to become non-experimental and gain Python bindings; that would allow for the exact same workflow as we have currently while also giving us the guarantees we need to have and using DNF on both sides :)

Turning off signature checking also solves this problem, but we'd prefer to verify signatures by default when they are understood by the underlying crypto backend. That is what's currently going wrong on some Fedora hosts: neither DNF nor RPM understands the crypto backend saying it doesn't support a certain algorithm, and both instead treat it as a failed signature.

For RHEL 9 this was solved by adding a DNF plugin that uses pqrpm, instead of rpm, to verify the signatures. For RHEL 10, RPM (and thus DNF) understands the signatures directly.

But this is a side-track to the bootstrap-by-default story.
