
Conversation

@evidolob

Replace WMI calls with PowerShell commands. This should improve error messages and remove the need to execute this code with escalated privileges. The workflow stays the same; most of the changes are in the 'MemorySettings' memory field names and types, which are now set in bytes, and 'ProcessorSettings' uses a different name for the processor count field.
Also this PR adds a new sample CLI, 'managevm', which allows starting, stopping, and deleting a VM.
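For illustration, the byte-based memory fields described above reduce to plain cmdlet arguments. A minimal sketch, assuming a hypothetical helper not taken from this PR (`-ProcessorCount` and `-MemoryStartupBytes` are real Set-VM parameters):

```go
package main

import (
	"fmt"
	"strconv"
)

// buildSetVMArgs assembles the argument list for the Hyper-V Set-VM cmdlet.
// Memory is passed in bytes, matching the reworked MemorySettings semantics,
// and the processor count maps to Set-VM's -ProcessorCount parameter.
func buildSetVMArgs(name string, processorCount uint32, memoryBytes uint64) []string {
	return []string{
		"Hyper-V\\Set-VM",
		"-Name", name,
		"-ProcessorCount", strconv.FormatUint(uint64(processorCount), 10),
		"-MemoryStartupBytes", strconv.FormatUint(memoryBytes, 10),
	}
}

func main() {
	// 4 vCPUs and 2 GiB of startup memory, expressed in bytes.
	fmt.Println(buildSetVMArgs("test-vm", 4, 2*1024*1024*1024))
}
```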

@openshift-ci
Contributor

openshift-ci bot commented Oct 22, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: evidolob
Once this PR has been reviewed and has the lgtm label, please assign l0rd for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@lstocchi
Contributor

I didn't dig into the code, but at first sight the question I have is: can you split it into smaller commits? Reviewing the first one would be hard otherwise.

@evidolob
Author

@lstocchi I'm not sure; this PR basically rewrites the whole project. I could split the first commit into several smaller ones, but it would be very hard to keep the project buildable.

@baude
Member

baude commented Oct 23, 2025

@n1hility could you please spend at least a minute or two reviewing this PR, because it is fundamentally a switch in how we originally wrote this.

err = wmiext.WaitJob(service, job)
} else {
err = &wmiext.JobError{ErrorCode: int(ret)}
func getKvpSctipt(vmName string, op string, key string, value string) []string {
Contributor

@lstocchi lstocchi Oct 24, 2025

I tried with podman and the ignition fails. You can replicate it by init/starting a new machine.
I did some changes to make it work, but this requires more investigation; I just wanted to track where the issue was.

Summarizing what I tried: I escaped value to be sure it did not break the quotes in the script, removed comments, and changed the argument of the -Command flag in Execute, as this should expect a string.

Another thing I noticed is that removing a machine was really slow. Maybe it was my fault, but it's worth mentioning to give it a deeper look.

There is a typo in the func name, it should be getKvpScript
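The escaping mentioned above can be done by doubling embedded single quotes, which is how PowerShell escapes them inside single-quoted literals. A minimal sketch (`psQuote` is a hypothetical helper name, not from this PR):

```go
package main

import (
	"fmt"
	"strings"
)

// psQuote wraps s in PowerShell single quotes, doubling any embedded single
// quote so the value cannot terminate the literal early and break the script.
func psQuote(s string) string {
	return "'" + strings.ReplaceAll(s, "'", "''") + "'"
}

func main() {
	fmt.Println(psQuote("it's a value")) // prints 'it''s a value'
}
```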

Contributor

It can also be simplified by removing the Write-Host ... calls, as they are never printed, or by updating the code to print them when using --log-level debug

Author

Write-Host is how PowerShell writes text to stdout; I could re-output it from the Go code if debug logging is enabled

// parameter.
func (vmm *VirtualMachineManager) GetSummaryInformation(requestedFields SummaryRequestSet) ([]SummaryInformation, error) {
return vmm.getSummaryInformation("", requestedFields)
func (vmm *VirtualMachineManager) GetSummaryInformation() ([]PowerShellVM, error) {
Contributor

Is this the only alternative we have? I saw that the old SummaryInformation is completely different from PowerShellVM; I'm not sure if someone is using it. The thing that caught my eye is the old allocatedGPU, which is not present now. This could be used for AI stuff.

Author

We could add a GPU to a VM with the Set-VMGpuPartitionAdapter command, but I think it may be a separate PR, as it involves more changes and introduces a new API, since passing a GPU requires modification (partitioning) on the host system.

Contributor

My concern was more whether someone is using this. It's OK to change WMI calls to PowerShell commands, but here we're completely changing the output, so I'm not sure if we break something. I'll investigate a bit; maybe @gbraad has more knowledge.

Contributor

@lstocchi lstocchi left a comment

Have you tried it with podman? It fails for me

NewVMName string

// ProcessorCount specifies the number of virtual processors for the virtual machine.
ProcessorCount uint32
Contributor

Comment on lines 332 to 333
if val, ok := vmData["AutomaticStopAction"].(string); ok {
settings.AutomaticStopAction = AutomaticStopActionType(val)
Contributor

Same as above

Comment on lines +161 to +162
args := []string{"Hyper-V\\Stop-VM", "-Name", vm.Name, "-Force"}
if force {
args = append(args, "-TurnOff")
Contributor

This looks weird. If force is true I expect -Force to be added.
Maybe we could change the variable name here? Or provide 3 different stop actions (stop, force stop, turn off).
Based on the example, it looks like when -TurnOff is used, -Force is not needed -> https://learn.microsoft.com/en-us/powershell/module/hyper-v/stop-vm?view=windowsserver2025-ps#examples

Author

The problem is that plain Stop-VM works like "Shuts down virtual machine TestVM through the guest operating system", which means the guest OS needs the VM integration services installed to allow the stop. If the guest OS doesn't have such integrations and you call Stop-VM without -Force, the command will hang for 5-10 minutes and fail, the VM will be stuck in a transition state (you cannot do anything with it until that timeout is over), and after that the VM will still continue to run.

So the question is: do we always run a guest OS with Hyper-V integrations, so that we can rely on a stop without forcing?
cc @gbraad

Contributor

This is a library, so I expect the caller to decide what to do. It's their choice to try a graceful stop (call Stop()), force it (call StopWithForce()), or turn it off (call TurnOff()).

I expect new OS images specific to Hyper-V to have the addon enabled, tbh. At least for the images used with macadam and podman, I see they have the Shutdown service running. We could also verify whether the service is disabled and directly return an error to the caller (if they call Stop()) to inform them they must use StopWithForce/TurnOff, so they do not wait 5+ minutes?

Something like

func (vm *VirtualMachine) TurnOff() error {
	return vm.stop("-TurnOff")
}

func (vm *VirtualMachine) StopWithForce() error {
	return vm.stop("-Force")
}

func (vm *VirtualMachine) Stop() error {
	return vm.stop("")
}

func (vm *VirtualMachine) stop(flag string) error {
	if !Enabled.equal(uint16(vm.State)) {
		return ErrMachineNotRunning
	}

	args := []string{"Stop-VM", "-Name", vm.Name}
	if flag != "" {
		args = append(args, flag)
	}

	_, stderr, err := powershell.Execute(args...)
	if err != nil {
		return NewPSError(stderr)
	}
	return nil
}

Member

As long as you don't functionally change the library call for folks, I'm cool with this.

@baude
Member

baude commented Oct 30, 2025

This PR makes me super nervous, but let's see it through. Once you have tests passing here, I'd like to ask you to vendor this into github.com/containers/podman, create a draft PR, and ensure that PR passes the entire test suite. This might be tricky, so ask for help if needed. Would you be willing to do this?

@evidolob
Author

evidolob commented Oct 31, 2025

@baude Sure, I'll make a draft PR for podman and ensure that these changes don't break anything in podman.

@evidolob
Author

@lstocchi @baude Do you know how to view the win_test results? When I try to look at the job page it gives me this:
(screenshot)

@lstocchi
Contributor

@lstocchi @baude Do you know how to view the win_test results? When I try to look at the job page it gives me this

Same for me. I also tried to look at successful builds and I don't see any logs.
Logging in on the Cirrus website, I just see a 404 error. I don't have the rights to dig more, so no idea what's happening.

@baude
Member

baude commented Oct 31, 2025

@lstocchi @baude Do you know how to view the win_test results? When I try to look at the job page it gives me this

Same for me. I also tried to look at successful builds and I don't see any logs. Logging in on the Cirrus website, I just see a 404 error. I don't have the rights to dig more, so no idea what's happening.

There might be something going on with the CI; in the meantime, do all tests pass locally?

@evidolob
Author

evidolob commented Nov 3, 2025

@baude Local tests are passing fine

@baude
Member

baude commented Nov 3, 2025

@evidolob we updated something on our side; I reran your tests and they failed.

@evidolob evidolob force-pushed the rework_to_ps branch 2 times, most recently from 597b093 to 76bd5ac Compare December 9, 2025 14:44
Signed-off-by: Yevhen Vydolob <yvydolob@redhat.com>
Signed-off-by: Yevhen Vydolob <yvydolob@redhat.com>
@evidolob evidolob force-pushed the rework_to_ps branch 3 times, most recently from 1ab63e2 to bf5e121 Compare December 10, 2025 07:28
@evidolob evidolob mentioned this pull request Dec 23, 2025
@evidolob
Author

@baude I made a PR with these changes in podman: containers/podman#27819

@evidolob evidolob requested review from baude and lstocchi December 23, 2025 09:22
@l0rd
Member

l0rd commented Dec 26, 2025

I apologize for commenting late, but I am trying to understand how this PR improves the current situation from a Podman perspective.

I assume the main point of this PR is to eliminate the need to run commands in elevated mode. However, after @lstocchi's work, we currently don't require elevated mode for managing Hyper-V machines. While elevated permissions are still necessary for making changes to the Windows HKLM registry entries, they are not needed for VM management. The only requirement now is to be a member of the Hyper-V admins group. How does this PR enhance this situation?

Apart from that, I believe that using PowerShell commands instead of the WMI API is not ideal. Spawning a new process can negatively impact performance, and it is less robust because we rely on the local version of PowerShell, which we cannot predict. I also have concerns about output parsing, which can be affected by internationalization and encoding. And we lose type checking on the handled objects. We should avoid this approach unless there is a critical reason to use it.

@gbraad
Member

gbraad commented Jan 6, 2026

eliminate the need to run commands in elevated mode.

Yes and no. Elevation is a non-issue if you are part of the Hyper-V Admins group, which we enforce during our installation process. See #216: now any error (a permission error, no disk space, no physical memory, or anything else) leads to error 32788. This is TOO generic an error to be helpful. When we used PowerShell in minikube, minishift, and the early versions of CRC, we never used WMI due to the complexity and restricted use. We adopted libhvee to help with the maintenance, but it was impossible to get cleaner error messages from the interface. We have spent a lot of time on this, and the only conclusion is to use the PowerShell cmdlets, as they interact with the hypervisor in a much cleaner (non-COM-abstracted) way. If you search our issue tracker, we were able to address Hyper-V issues easily, until issue 32788. We see a lot more production use with Hyper-V, as that is the ONLY option we offer for running CRC/OpenShift Local on Windows.

Spawning a new process can negatively impact performances

In CRC we resolved this before by having a persistent session available. These are optimizations; normal use will not have many calls happening. It might matter during bring-up and, in this case, during the tests. We adopted libhvee as part of converging implementations, but all those issues you brought up had already been resolved by us (internationalization, session persistence, permission needs), so this is already a lesson learned on our end.


@gbraad
Member

gbraad commented Jan 6, 2026

To me this is also important: reduced complexity and way more clarity!

(screenshot)
