Skip to content

ZSTAC-82207 fix stale availablePhysicalCapacity after sync#3348

Open
MatheMatrix wants to merge 1 commit into5.5.6from
sync/yaohua.wu/bugfix/ZSTAC-82207
Open

ZSTAC-82207 fix stale availablePhysicalCapacity after sync#3348
MatheMatrix wants to merge 1 commit into5.5.6from
sync/yaohua.wu/bugfix/ZSTAC-82207

Conversation

@MatheMatrix
Copy link
Owner

Summary

  • Fix: syncPhysicalCapacityInCluster() now updates LocalStorageHostRefVO physical capacity via updatePhysicalCapacityByKvmAgentResponse(), consistent with httpCall() pattern
  • Root cause: stale host-ref values were used by calculateTotalCapacity() to overwrite correct agent-reported PS-level capacity
  • Test: added LocalStorageSyncPhysicalCapacityCase

Impact

  • Only affects LocalStorage availablePhysicalCapacity calculation
  • No side effects on other storage types

Related: ZSTAC-82207

sync from gitlab !9177

syncPhysicalCapacityInCluster() collected fresh physical
capacity from KVM agents but did not update per-host
LocalStorageHostRefVO. When calculateTotalCapacity()
subsequently summed stale host-ref values, it overwrote
the correct PS-level capacity with outdated data.

1. Why is this change necessary?
After snapshot group migration, the source PS shows
decreased availablePhysicalCapacity because the stale
host-ref values are used to overwrite the correct
agent-reported values.

2. How does it address the problem?
Added updatePhysicalCapacityByKvmAgentResponse() call
in syncPhysicalCapacityInCluster() to keep host-ref
records in sync, consistent with httpCall() pattern.

3. Are there any side effects?
None. The redundant PS-level delta update is
immediately overwritten by the absolute-value setter.

# Summary of changes:
- localstorage: update host-ref physical capacity
  during syncPhysicalCapacityInCluster
- test: add LocalStorageSyncPhysicalCapacityCase

Related: ZSTAC-82207
Change-Id: I7175962615c3fb9481a872a161f5e107ae91920c
@coderabbitai
Copy link

coderabbitai bot commented Feb 13, 2026

总体概览

在KVM代理响应的总容量和可用容量都非空时,新增本地存储容量缓存更新调用。同时增加测试用例验证物理容量同步功能,确保主存储和主机级容量表示同时刷新。

变更内容

队列 / 文件 摘要
本地存储容量缓存更新
plugin/localstorage/.../LocalStorageKvmBackend.java
在KVM代理响应处理中新增容量缓存更新逻辑。当响应的总容量和可用容量都非空时,调用LocalStorageCapacityUpdater.updatePhysicalCapacityByKvmAgentResponse()更新缓存,确保容量数据与代理响应保持同步。
物理容量同步测试
test/src/test/groovy/.../LocalStorageSyncPhysicalCapacityCase.groovy
新增集成测试用例,验证本地存储物理容量同步功能。测试流程包括:初始化本地环境、读取初始容量值、模拟容量变化、触发同步、验证主存储级和主机级容量都得到更新,确保修复后期容量计算中主机级容量值保持最新。

代码审查工作量评估

🎯 2 (Simple) | ⏱️ ~12 分钟

诗歌

🐰 容量更新细又微,
缓存同步不再迟,
主机与存储齐步走,
数据从此不发霉,
小兔欢呼测试全通过!


Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error, 2 warnings)
Check name Status Explanation Resolution
Title check ❌ Error PR标题包含JIRA键ZSTAC-82207和清晰的问题描述,但未遵循要求的[scope]: 格式(缺少type和scope) 将标题改为遵循格式,例如'fix[localstorage]: ZSTAC-82207 Fix stale availablePhysicalCapacity after sync'或'fix[storage]: ZSTAC-82207 Fix stale availablePhysicalCapacity'
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Merge Conflict Detection ⚠️ Warning ❌ Merge conflicts detected (86 files):

⚔️ VERSION (content)
⚔️ build/pom.xml (content)
⚔️ compute/src/main/java/org/zstack/compute/allocator/ResourceBindingAllocatorFlow.java (content)
⚔️ compute/src/main/java/org/zstack/compute/host/HostTrackImpl.java (content)
⚔️ compute/src/main/java/org/zstack/compute/vm/VmGlobalConfig.java (content)
⚔️ compute/src/main/java/org/zstack/compute/vm/VmInstanceManagerImpl.java (content)
⚔️ conf/db/upgrade/V5.4.2__schema.sql (content)
⚔️ conf/db/upgrade/V5.4.6__schema.sql (content)
⚔️ conf/db/upgrade/V5.5.6__schema.sql (content)
⚔️ conf/db/upgrade/beforeValidate.sql (content)
⚔️ conf/globalConfig/longJob.xml (content)
⚔️ conf/globalConfig/vm.xml (content)
⚔️ conf/springConfigXml/ConsoleManager.xml (content)
⚔️ conf/zstack.xml (content)
⚔️ console/src/main/java/org/zstack/console/ConsoleManagerImpl.java (content)
⚔️ core/pom.xml (content)
⚔️ core/src/main/java/org/zstack/core/Platform.java (content)
⚔️ core/src/main/java/org/zstack/core/cloudbus/CloudBusImpl3.java (content)
⚔️ core/src/main/java/org/zstack/core/progress/ProgressReportService.java (content)
⚔️ core/src/main/java/org/zstack/core/rest/RESTFacadeImpl.java (content)
⚔️ core/src/main/java/org/zstack/core/thread/DispatchQueueImpl.java (content)
⚔️ core/src/main/java/org/zstack/core/thread/ThreadFacadeImpl.java (content)
⚔️ core/src/main/java/org/zstack/core/workflow/SimpleFlowChain.java (content)
⚔️ header/src/main/java/org/zstack/header/Constants.java (content)
⚔️ header/src/main/java/org/zstack/header/image/ImageConstant.java (content)
⚔️ header/src/main/java/org/zstack/header/longjob/LongJob.java (content)
⚔️ identity/src/main/java/org/zstack/identity/QuotaUtil.java (content)
⚔️ image/src/main/java/org/zstack/image/AddImageLongJob.java (content)
⚔️ longjob/src/main/java/org/zstack/longjob/LongJobApiInterceptor.java (content)
⚔️ longjob/src/main/java/org/zstack/longjob/LongJobGlobalConfig.java (content)
⚔️ longjob/src/main/java/org/zstack/longjob/LongJobManagerImpl.java (content)
⚔️ longjob/src/main/java/org/zstack/longjob/LongJobUtils.java (content)
⚔️ network/src/main/java/org/zstack/network/service/DhcpExtension.java (content)
⚔️ plugin/acl/src/main/java/org/zstack/acl/AccessControlListManagerImpl.java (content)
⚔️ plugin/acl/src/main/java/org/zstack/header/acl/APIAddAccessControlListEntryEvent.java (content)
⚔️ plugin/acl/src/main/java/org/zstack/header/acl/APIAddAccessControlListRedirectRuleMsg.java (content)
⚔️ plugin/acl/src/main/java/org/zstack/header/acl/APIAddAccessControlListRedirectRuleMsgDoc_zh_cn.groovy (content)
⚔️ plugin/acl/src/main/java/org/zstack/header/acl/APIChangeAccessControlListRedirectRuleEvent.java (content)
⚔️ plugin/acl/src/main/java/org/zstack/header/acl/AccessControlListEntryInventory.java (content)
⚔️ plugin/acl/src/main/java/org/zstack/header/acl/AccessControlListEntryInventoryDoc_zh_cn.groovy (content)
⚔️ plugin/acl/src/main/java/org/zstack/header/acl/AccessControlListEntryVO.java (content)
⚔️ plugin/acl/src/main/java/org/zstack/header/acl/AccessControlListEntryVO_.java (content)
⚔️ plugin/applianceVm/src/main/java/org/zstack/appliancevm/ApplianceVmFacadeImpl.java (content)
⚔️ plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageBase.java (content)
⚔️ plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java (content)
⚔️ plugin/flatNetworkProvider/src/main/java/org/zstack/network/service/flat/FlatUserdataBackend.java (content)
⚔️ plugin/loadBalancer/src/main/java/org/zstack/network/service/lb/APIGetLoadBalancerListenerACLEntriesReply.java (content)
⚔️ plugin/loadBalancer/src/main/java/org/zstack/network/service/lb/LoadBalancerBase.java (content)
⚔️ plugin/loadBalancer/src/main/java/org/zstack/network/service/lb/LoadBalancerGlobalProperty.java (content)
⚔️ plugin/loadBalancer/src/main/java/org/zstack/network/service/lb/LoadBalancerManagerImpl.java (content)
⚔️ plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageKvmBackend.java (content)
⚔️ plugin/virtualRouterProvider/src/main/java/org/zstack/network/service/virtualrouter/lb/VirtualRouterLoadBalancerBackend.java (content)
⚔️ plugin/zbs/src/main/java/org/zstack/storage/zbs/ZbsConstants.java (content)
⚔️ plugin/zbs/src/main/java/org/zstack/storage/zbs/ZbsPrimaryStorageMdsBase.java (content)
⚔️ plugin/zbs/src/main/java/org/zstack/storage/zbs/ZbsStorageController.java (content)
⚔️ pom.xml (content)
⚔️ rest/pom.xml (content)
⚔️ rest/src/main/java/org/zstack/rest/RestServer.java (content)
⚔️ sdk/src/main/java/org/zstack/sdk/AccessControlListEntryInventory.java (content)
⚔️ sdk/src/main/java/org/zstack/sdk/AddAccessControlListRedirectRuleAction.java (content)
⚔️ sdk/src/main/java/org/zstack/sdk/ApplicationDevelopmentServiceInventory.java (content)
⚔️ sdk/src/main/java/org/zstack/sdk/CreateSchedulerJobAction.java (content)
⚔️ sdk/src/main/java/org/zstack/sdk/CreateSchedulerJobGroupAction.java (content)
⚔️ sdk/src/main/java/org/zstack/sdk/DeployAppDevelopmentServiceAction.java (content)
⚔️ sdk/src/main/java/org/zstack/sdk/DeployModelEvalServiceAction.java (content)
⚔️ sdk/src/main/java/org/zstack/sdk/DeployModelServiceAction.java (content)
⚔️ sdk/src/main/java/org/zstack/sdk/GetMdevDeviceCandidatesAction.java (content)
⚔️ sdk/src/main/java/org/zstack/sdk/GpuAllocateStatus.java (content)
⚔️ sdk/src/main/java/org/zstack/sdk/GpuDeviceSpecInventory.java (content)
⚔️ sdk/src/main/java/org/zstack/sdk/MatchModelServiceTemplateWithModelAction.java (content)
⚔️ sdk/src/main/java/org/zstack/sdk/MdevDeviceSpecInventory.java (content)
⚔️ sdk/src/main/java/org/zstack/sdk/ModelService.java (content)
⚔️ sdk/src/main/java/org/zstack/sdk/ModelServiceInventory.java (content)
⚔️ sdk/src/main/java/org/zstack/sdk/UpdateMdevDeviceSpecAction.java (content)
⚔️ sdk/src/main/java/org/zstack/sdk/UpdateModelServiceInstanceGroupAction.java (content)
⚔️ sdk/src/main/java/org/zstack/sdk/UpdatePciDeviceSpecAction.java (content)
⚔️ sdk/src/main/java/org/zstack/sdk/VolumeBackupInventory.java (content)
⚔️ sdk/src/main/java/org/zstack/sdk/ZSClient.java (content)
⚔️ search/src/main/java/org/zstack/query/MysqlQueryBuilderImpl3.java (content)
⚔️ search/src/main/java/org/zstack/query/QueryFacadeImpl.java (content)
⚔️ storage/src/main/java/org/zstack/storage/volume/InstantiateVolumeForNewCreatedVmExtension.java (content)
⚔️ test/src/test/groovy/org/zstack/test/integration/console/ConsoleProxyCase.groovy (content)
⚔️ test/src/test/groovy/org/zstack/test/integration/kvm/host/HostConnectedTimeCase.groovy (content)
⚔️ test/src/test/groovy/org/zstack/test/integration/storage/primary/addon/zbs/ZbsPrimaryStorageCase.groovy (content)
⚔️ testlib/src/main/java/org/zstack/testlib/ApiHelper.groovy (content)
⚔️ utils/src/main/java/org/zstack/utils/clouderrorcode/CloudOperationsErrorCode.java (content)

These conflicts must be resolved before merging into 5.5.6.
Resolve conflicts locally and push changes to this branch.
✅ Passed checks (1 passed)
Check name Status Explanation
Description check ✅ Passed PR描述详细说明了修复内容、根本原因、测试用例和影响范围,与变更集的实现紧密相关
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch sync/yaohua.wu/bugfix/ZSTAC-82207
⚔️ Resolve merge conflicts (beta)
  • Auto-commit resolved conflicts to branch sync/yaohua.wu/bugfix/ZSTAC-82207
  • Create stacked PR with resolved conflicts
  • Post resolved changes as copyable diffs in a comment

No actionable comments were generated in the recent review. 🎉

🧹 Recent nitpick comments
test/src/test/groovy/org/zstack/test/integration/storage/primary/local/LocalStorageSyncPhysicalCapacityCase.groovy (1)

95-101: 建议:为保持一致性,host-ref 验证也可放入 retryInSecs 中。

PS 级别的验证使用了 retryInSecs(3),但 host-ref 的验证是直接断言。虽然在当前实现中两者应该同步更新,但如果存在任何异步行为,可能导致测试不稳定。

♻️ 可选的重构建议
-        // 5. Verify host ref is also updated (this was the bug - host ref was stale)
-        long updatedHostRefPhysicalAvail = Q.New(LocalStorageHostRefVO.class)
-                .eq(LocalStorageHostRefVO_.primaryStorageUuid, ps.uuid)
-                .select(LocalStorageHostRefVO_.availablePhysicalCapacity)
-                .findValue() as long
-
-        assert updatedHostRefPhysicalAvail == newAvail :
-                "LocalStorageHostRefVO.availablePhysicalCapacity should be ${newAvail}, but got ${updatedHostRefPhysicalAvail}"
+        // 5. Verify host ref is also updated (this was the bug - host ref was stale)
+        retryInSecs(3) {
+            long updatedHostRefPhysicalAvail = Q.New(LocalStorageHostRefVO.class)
+                    .eq(LocalStorageHostRefVO_.primaryStorageUuid, ps.uuid)
+                    .select(LocalStorageHostRefVO_.availablePhysicalCapacity)
+                    .findValue() as long
+
+            assert updatedHostRefPhysicalAvail == newAvail :
+                    "LocalStorageHostRefVO.availablePhysicalCapacity should be ${newAvail}, but got ${updatedHostRefPhysicalAvail}"
+        }

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ast-grep (0.40.5)
plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageKvmBackend.java

Comment @coderabbitai help to get the list of available commands and usage tips.

@zstack-robot-1
Copy link
Collaborator

Comment from yaohua.wu:

Code Review: APPROVED

Correctness: PASS

  • Fix correctly addresses root cause: syncPhysicalCapacityInCluster() now updates LocalStorageHostRefVO via updatePhysicalCapacityByKvmAgentResponse() before calculateTotalCapacity() reads it
  • Null guard consistent with httpCall() pattern

Impact: Minimal

  • Only affects syncPhysicalCapacityInCluster path on LocalStorage
  • No side effects on other storage types or logical capacity

Suggestion (non-blocking)

  • updatePhysicalCapacityByKvmAgentResponse() also applies redundant PS-level delta update, but this is immediately overwritten by the absolute-value setter in PrimaryStorageBase. Functionally harmless.

Test Coverage: PASS

  • LocalStorageSyncPhysicalCapacityCase verifies both PS-level and host-ref capacity after sync

🤖 Robot Reviewer

@ZStack-Robot
Copy link
Collaborator

Comment from yaohua.wu:

更新 Review — 本次 Review 替代以下之前的 Review

Code Review: APPROVED ✅

问题背景 (ZSTAC-82207)

Local-VM 创建快照组后迁移主存储,原存储 availablePhysicalCapacity 不减反增。Root cause: syncPhysicalCapacityInCluster() 从 KVM agent 获取了正确的物理容量值,但未更新 LocalStorageHostRefVO,导致后续 calculateTotalCapacity() 用过期的 per-host 值覆盖了正确的 PS 级容量。

修复评估

🟢 正确性: PASS

  • LocalStorageKvmBackend.java ~L1184: 新增 updatePhysicalCapacityByKvmAgentResponse() 调用,在累加 ret 的同时同步更新 LocalStorageHostRefVO,确保 calculateTotalCapacity() 读到最新值
  • null-check 模式 (rsp.getTotalCapacity() != null && rsp.getAvailableCapacity() != null) 与 httpCall() (L1222) 完全一致
  • Root cause 定位准确,修复点正确

🟢 架构一致性: PASS

  • 复用已有的 LocalStorageCapacityUpdater.updatePhysicalCapacityByKvmAgentResponse(),与 httpCall() 模式保持一致
  • 改动最小化(+5 行),无架构变更

🟢 影响范围: 可控

  • 仅影响 syncPhysicalCapacityInCluster 路径上的 LocalStorage 容量计算
  • 不影响其他存储类型、不影响逻辑容量 (availableCapacity)

🟢 测试覆盖: PASS

  • LocalStorageSyncPhysicalCapacityCase.groovy 验证了 PS 级和 host-ref 级容量在 sync 后均正确更新
  • 测试覆盖了核心 bug 场景(模拟物理容量变化 → sync → 验证两级容量一致)

细节观察

🟡 Warning (non-blocking): updatePhysicalCapacityByKvmAgentResponse() 内部除了更新 LocalStorageHostRefVO,还会对 PrimaryStorageCapacityVO 做 delta 更新(L72-78)。在 syncPhysicalCapacityInCluster 场景下这个 PS 级 delta 更新是多余的,因为后续 calculateTotalCapacity() 会用绝对值重新设置 PS 级容量。功能上无害(被覆盖写),但存在一次多余的 DB 写操作。考虑到这与 httpCall() 保持一致且影响极小,可接受。

🟢 Suggestion (non-blocking): 测试用例中 PS 级验证用了 retryInSecs(3),但 host-ref 验证(L95-101)是直接断言。由于两者在同一事务链中同步更新,当前不会 flaky,但为保持防御性一致性可考虑也包在 retryInSecs 中。

结论

修复准确、改动最小、与现有 pattern 一致、有充分测试覆盖。建议合并。


🤖 Robot Reviewer

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants