Simulate numa nodes by angelcerveraroldan · Pull Request #4428 · coreos/coreos-assembler

angelcerveraroldan · 2026-02-03T17:05:57Z

Add an option for external tests to run on a machine with multiple NUMA nodes.

openshift-ci · 2026-02-03T17:06:01Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

gemini-code-assist

Code Review

This pull request introduces the capability to simulate NUMA nodes in QEMU for external tests. The changes involve adding a NumaNodes option and plumbing it through various layers of the test harness and platform configuration. The core logic for generating QEMU arguments for NUMA is in mantle/platform/qemu.go.

I've identified a couple of issues in the implementation within mantle/platform/qemu.go that could lead to incorrect resource allocation for the simulated NUMA nodes. My review includes suggestions to fix these bugs.

mantle/platform/qemu.go

mantle/kola/tests/misc/numa.go

mantle/kola/register/register.go

ravanelli

Great job, it is looking good. Let's squash the commits and follow the suggestion to simplify it by using an on/off flag with 2 NUMA nodes.

mantle/kola/harness.go

mantle/platform/machine/qemu/cluster.go

mantle/platform/qemu.go

ravanelli · 2026-02-11T14:58:23Z

mantle/platform/qemu.go

-		kvm = false
-	}
-	machineArg += "," + accel
+func platformQemuArgs(arch, machineArg string, kvm bool) ([]string, error) {


You split this function and also created baseNumaQemuArgs, which shares a lot of the same code. Let's try to merge the duplication.

mantle/platform/qemu.go

angelcerveraroldan · 2026-02-12T14:55:11Z

/test all

New boolean flag "numaNodes" will simulate two NUMA nodes. They will divide the memory and cpus between them.

ravanelli

A few comments

ravanelli · 2026-02-16T17:49:58Z

mantle/platform/qemu.go

+	node0Cpus := cpus / 2
+
+	ret = append(ret, "-object", fmt.Sprintf("memory-backend-memfd,id=%s,size=%dM,share=on", node0MemoryDevice, node0Memory))
+	ret = append(ret, "-object", fmt.Sprintf("memory-backend-memfd,id=%s,size=%dM,share=on", node1MemoryDevice, memoryMiB-node0Memory))


Here, you are using node0Memory for one node and memoryMiB - node0Memory for the second. Why is that? It seems this could result in one NUMA node having more memory than the other. If so, the statement "They will split the machine's memory/CPUs evenly between them" won't be true.

It is for cases where the memory is not exactly divisible by two. The memory of the two nodes, added together, must equal exactly the total memory of the machine, or there is an error.

If the memory is not divisible by 2, one node will have slightly more memory than the other (one MiB more).
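
The split described above can be sketched as follows. This is a minimal illustration, not the PR's code: the helper name splitMemory is hypothetical, and it assumes node 0 gets memoryMiB / 2 (rounded down) while node 1 gets the remainder, which matches the memoryMiB-node0Memory expression in the diff.

```go
package main

import "fmt"

// splitMemory divides memoryMiB between two NUMA nodes so that the two
// parts always sum to the total. Node 1 receives the remainder, so it
// gets one extra MiB when memoryMiB is odd.
// (Hypothetical helper; the name is not taken from the PR.)
func splitMemory(memoryMiB int) (node0, node1 int) {
	node0 = memoryMiB / 2
	node1 = memoryMiB - node0
	return node0, node1
}

func main() {
	for _, total := range []int{4096, 1025} {
		n0, n1 := splitMemory(total)
		fmt.Printf("total=%d node0=%d node1=%d\n", total, n0, n1)
	}
	// Prints:
	// total=4096 node0=2048 node1=2048
	// total=1025 node0=512 node1=513
}
```

Computing the second share by subtraction, rather than dividing twice, is what keeps the sum equal to the machine's total memory.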

ravanelli · 2026-02-16T17:50:59Z

mantle/platform/qemu.go

+
+	ret = append(ret, "-object", fmt.Sprintf("memory-backend-memfd,id=%s,size=%dM,share=on", node0MemoryDevice, node0Memory))
+	ret = append(ret, "-object", fmt.Sprintf("memory-backend-memfd,id=%s,size=%dM,share=on", node1MemoryDevice, memoryMiB-node0Memory))
+	ret = append(ret, "-numa", fmt.Sprintf("node,memdev=%s,cpus=%d-%d,nodeid=0", node0MemoryDevice, 0, node0Cpus-1))


What does the 0 do here? Why does the second node not need it?

I'm guessing you used this doc? https://www.qemu.org/docs/master/system/invocation.html#cmdoption-smp
Can you add the link for it, so we can understand what this code is doing?
Also, why did you not use the socket option? -numa cpu,node-id=0,socket-id=0 -numa cpu,node-id=1,socket-id=1

Those are the docs I used, yes. Should I add the link as a comment in the codebase, or just to this PR?

There was no big reason why I didn't use the socket flag. Would you prefer it be used when defining the nodes?

The 0 is just the starting CPU for node0. The reason it isn't there for the other node is that the two nodes are given different CPUs. Right now, if your machine has 4 CPUs, node0 would be given cpus=0-1 and node1 would be given cpus=2-3.

Maybe it would simplify the code if we hard-coded it to 2 CPUs, one for each node, or even simpler if we gave node 0 one CPU and node 1 only memory.
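
The CPU assignment explained above could look roughly like this. It is only a sketch under assumptions: the function name numaCPUArgs and the memory backend ids mem0/mem1 are hypothetical, and the node 1 argument is inferred from the explanation (the diff only shows the node 0 line).

```go
package main

import "fmt"

// numaCPUArgs returns the "-numa" arguments for two nodes. Node 0 gets
// CPUs 0..cpus/2-1 and node 1 gets cpus/2..cpus-1. memdev0 and memdev1
// are the ids of memory backends defined elsewhere with "-object".
// (Hypothetical helper; names are not taken from the PR.)
func numaCPUArgs(cpus int, memdev0, memdev1 string) ([]string, error) {
	if cpus < 2 {
		return nil, fmt.Errorf("must have at least 2 cpus to simulate NUMA nodes, got %d", cpus)
	}
	node0Cpus := cpus / 2
	return []string{
		"-numa", fmt.Sprintf("node,memdev=%s,cpus=%d-%d,nodeid=0", memdev0, 0, node0Cpus-1),
		"-numa", fmt.Sprintf("node,memdev=%s,cpus=%d-%d,nodeid=1", memdev1, node0Cpus, cpus-1),
	}, nil
}

func main() {
	args, err := numaCPUArgs(4, "mem0", "mem1")
	if err != nil {
		panic(err)
	}
	for i := 0; i < len(args); i += 2 {
		fmt.Println(args[i], args[i+1])
	}
	// Prints:
	// -numa node,memdev=mem0,cpus=0-1,nodeid=0
	// -numa node,memdev=mem1,cpus=2-3,nodeid=1
}
```

With an odd CPU count, node 1 ends up with one more CPU than node 0, mirroring how the memory remainder is handled.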

ravanelli · 2026-02-16T17:56:44Z

mantle/platform/qemu.go

+		return nil, fmt.Errorf("Must have at least 2 cpus to simulate NUMA nodes")
+	}
+
+	ret, err := platformQemuArgs(arch, "")


In this case the output ends up with a trailing comma for no reason:
machineArg = accelArg + "," + machineArg
Something like, machineArg=accel=tcg,
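
One way to avoid the stray comma is to join only the non-empty parts. This is a sketch of the idea, not the fix adopted in the PR; joinMachineArgs is a hypothetical helper and the option values are purely illustrative.

```go
package main

import (
	"fmt"
	"strings"
)

// joinMachineArgs joins the non-empty parts with commas, so an empty
// machineArg no longer yields a trailing comma such as "accel=tcg,".
// (Hypothetical helper; the name is not taken from the PR.)
func joinMachineArgs(parts ...string) string {
	nonEmpty := make([]string, 0, len(parts))
	for _, p := range parts {
		if p != "" {
			nonEmpty = append(nonEmpty, p)
		}
	}
	return strings.Join(nonEmpty, ",")
}

func main() {
	fmt.Println(joinMachineArgs("accel=tcg", ""))        // accel=tcg
	fmt.Println(joinMachineArgs("accel=tcg", "usb=off")) // accel=tcg,usb=off
}
```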

openshift-ci bot added the do-not-merge/work-in-progress label Feb 3, 2026

gemini-code-assist bot reviewed Feb 3, 2026

mantle/platform/qemu.go (outdated)

mantle/platform/qemu.go (outdated)

travier reviewed Feb 4, 2026

mantle/kola/tests/misc/numa.go (outdated)

travier reviewed Feb 4, 2026

mantle/kola/register/register.go (outdated)

ravanelli reviewed Feb 11, 2026

angelcerveraroldan force-pushed the numa-nodes-option branch from 52601f4 to 774b30c Compare February 12, 2026 13:26

angelcerveraroldan force-pushed the numa-nodes-option branch from 774b30c to 5590322 Compare February 12, 2026 15:32

Allow for NUMA node simulation

aafd091


angelcerveraroldan force-pushed the numa-nodes-option branch from 5590322 to aafd091 Compare February 12, 2026 15:46

ravanelli reviewed Feb 16, 2026


angelcerveraroldan removed the do-not-merge/work-in-progress label Feb 17, 2026

angelcerveraroldan marked this pull request as ready for review February 17, 2026 11:04

3 participants
