Skip to content

<fix>[compute]: add syncVmDeviceInfo when vm is running#3350

Open
MatheMatrix wants to merge 1 commit into5.5.6from
sync/jin.shen/fix-67275@@2
Open

<fix>[compute]: add syncVmDeviceInfo when vm is running#3350
MatheMatrix wants to merge 1 commit into5.5.6from
sync/jin.shen/fix-67275@@2

Conversation

@MatheMatrix
Copy link
Owner

Resolves: ZSTAC-67275

Change-Id: I746e63786a677676676b6d6265657063666b7677

sync from gitlab !9179

@coderabbitai
Copy link

coderabbitai bot commented Feb 13, 2026

Walkthrough

新增 VmInstanceBase 私有方法 syncVmDevicesAddressInfo(String vmUuid),在 VM 启动(含 InstantStart)、重启完成、迁移完成及相关实例化/恢复流程中调用,通过 CloudBus 向主机服务发送 SyncVmDeviceInfoMsg 以触发设备地址信息同步并记录结果。

Changes

Cohort / File(s) Summary
虚拟机设备地址同步
compute/src/main/java/org/zstack/compute/vm/VmInstanceBase.java
新增私有方法 syncVmDevicesAddressInfo(String vmUuid):查找 VM 并校验 hostUuid,构建 SyncVmDeviceInfoMsg 并通过 CloudBus 发送到主机服务。该方法在 VM 启动主流程、InstantStart 实例化、迁移完成、重启完成及相关恢复/继续流程中被调用,记录成功或警告日志。

Sequence Diagram(s)

sequenceDiagram
    participant VM as VmInstanceBase
    participant Bus as CloudBus
    participant HostSvc as Host Service

    VM->>VM: VM 生命周期事件(start / instantStart / migrate complete / reboot complete)
    VM->>VM: syncVmDevicesAddressInfo(vmUuid)

    rect rgba(100,150,200,0.5)
        VM->>VM: 查询 VM 对象并校验 hostUuid
    end

    alt 有效 hostUuid
        VM->>Bus: 发送 SyncVmDeviceInfoMsg(vmUuid, hostUuid)
        Bus->>HostSvc: 转发 SyncVmDeviceInfoMsg
        HostSvc->>HostSvc: 处理同步请求
        HostSvc-->>Bus: 返回处理结果
        Bus-->>VM: 返回响应
        VM->>VM: 记录成功日志
    else 无 VM 或无 host
        VM->>VM: 记录警告/跳过日志
    end
Loading

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

我是小兔子,轻敲云间鼓,
一条消息往主机路,地址对齐不迷路。
启动迁移与重启,悄悄同步不喧闹,
🐰🔔

🚥 Pre-merge checks | ✅ 2 | ❌ 2
❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Merge Conflict Detection ⚠️ Warning ⚠️ Unable to check for merge conflicts: Invalid branch name format
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed 标题符合[scope]: 格式,长度为55个字符,不超过72个字符限制,准确总结了主要变更内容。
Description check ✅ Passed 描述内容与变更集相关,提及了JIRA工单号ZSTAC-67275,反映了PR的目的和来源。

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch sync/jin.shen/fix-67275@@2
⚔️ Resolve merge conflicts (beta)
  • Auto-commit resolved conflicts to branch sync/jin.shen/fix-67275@@2
  • Create stacked PR with resolved conflicts
  • Post resolved changes as copyable diffs in a comment

No actionable comments were generated in the recent review. 🎉

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ast-grep (0.40.5)
compute/src/main/java/org/zstack/compute/vm/VmInstanceBase.java

Comment @coderabbitai help to get the list of available commands and usage tips.

@MatheMatrix MatheMatrix force-pushed the sync/jin.shen/fix-67275@@2 branch from 5cc4241 to 84f6bfb Compare February 13, 2026 15:24
@ZStack-Robot
Copy link
Collaborator

Comment from ye.zou:

🔴 CRITICAL Code Review Findings

经过深入分析,发现1个阻塞性bug2个高优先级缺陷


❌ CRITICAL: NullPointerException in PciDeviceManager

位置: premium/mevoco/src/main/java/org/zstack/pciDevice/PciDeviceManager.java:2563, 2582

问题: 新增的 syncVmDevicesAddressInfo() 方法通过 SyncVmDeviceInfoMsg 触发同步,但消息处理路径 (KVMHost.java:1541) 传入的 spec 参数为 null:

// KVMHost.java:1541
extEmitter.afterReceiveSyncVmDeviceInfoResponse(
    VmInstanceInventory.valueOf(...), 
    ret, 
    null  // ← spec 是 null
);

PciDeviceManager.afterReceiveVmDeviceInfoResponse 直接解引用 spec:

// Line 2563, 2582
.eq(PciDeviceVO_.vmInstanceUuid, spec.getVmInventory().getUuid())  // ← NPE!

影响: 每次调用新同步方法都会抛出 NPE,PCI/MDEV 地址信息永远不会被持久化。整个功能失效。

修复方案: 在 PciDeviceManager.java 中添加空值保护:

String vmUuid = spec != null ? spec.getVmInventory().getUuid() : vm.getUuid();

(参考 VirtualPciDeviceKvmExtensionPoint.java:46 已有此模式)


⚠️ HIGH: 缺少 VM Migration 后的设备同步

位置: compute/src/main/java/org/zstack/compute/vm/VmInstanceBase.java:7033-7037

问题: VM 迁移后运行在新主机上,PCI 拓扑可能变化,但迁移完成处理器未调用 syncVmDevicesAddressInfo()

修复建议:

extEmitter.afterMigrateVm(vm, vm.getLastHostUuid(), new NoErrorCompletion(completion) {
    @Override
    public void done() {
        syncVmDevicesAddressInfo(self.getUuid());  // ← 添加
        completion.success();
    }
});

⚠️ HIGH: 缺少 VM Resume 后的设备同步

位置: compute/src/main/java/org/zstack/compute/vm/VmInstanceBase.java:8349

问题: Resume 流程将 VM 转为 Running 状态但未同步设备信息。如果暂停/恢复可能改变设备地址(如快照恢复),则存在数据不一致风险。

建议: 评估 pause/resume 是否影响设备拓扑。如是,添加同步调用。


🟡 MEDIUM: 冗余的数据库查询

位置: 新增方法 syncVmDevicesAddressInfo(String vmUuid)

问题: 方法内部重新查询 VmInstanceVO:

VmInstanceVO vmVO = dbf.findByUuid(vmUuid, VmInstanceVO.class);  // 冗余

VmInstanceBase 已持有 self,且 hostUuid 在前面的 changeVmStateInDb 中已设置。

优化建议: 直接使用 self.getHostUuid(),避免数据库往返。


✅ 已验证正确的部分

  • 映射方向正确(host→vm)
  • 消息路由正确(到目标主机)
  • 异步 fire-and-forget 模式合理
  • Python 端 retry 机制设计良好(3次重试,2秒间隔)

综合评估: REQUEST CHANGES

阻塞项: CRITICAL NPE 导致功能完全失效
重要缺陷: 缺少 migration 和 resume 场景的覆盖

合并前必须修复:

  1. ✅ 修复 PciDeviceManager NPE (CRITICAL)
  2. ✅ 添加 migration 后同步 (HIGH)
  3. ⚠️ 评估/记录 resume 场景 (HIGH)

@MatheMatrix MatheMatrix force-pushed the sync/jin.shen/fix-67275@@2 branch from 84f6bfb to f04efb6 Compare February 14, 2026 02:49
add syncVmDeviceInfo when vm is running

Resolves: ZSTAC-67275

Change-Id: I746e63786a677676676b6d6265657063666b7677
@MatheMatrix MatheMatrix force-pushed the sync/jin.shen/fix-67275@@2 branch from f04efb6 to ace397b Compare February 14, 2026 02:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants