Default MetricStorage NetworkAttachments to ctlplane #827

Open
vkmc wants to merge 5 commits into main from OSPRH-22189/use-dataplane-nw-default-nad

vkmc (Collaborator) commented Jan 5, 2026

Updates the MetricStorage CRD to default NetworkAttachments to ["ctlplane"]. This aligns the field with DataplaneNetwork and removes the need for users to manually override it in the OpenStackControlPlane CR.

Closes: OSPRH-22189
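
For illustration, here is a minimal Go sketch of what a default like this could look like on an API type. The struct is a simplified, hypothetical stand-in and not the operator's actual MetricStorage spec. The marker form is the one quoted in a later commit message in this thread. The applyDefaults helper is also hypothetical: it only mimics, in plain Go, the effect of a CRD default when the field is omitted.

```go
package main

import "fmt"

// MetricStorageSpec is a simplified, hypothetical stand-in for the real spec type.
type MetricStorageSpec struct {
	// NetworkAttachments lists the networks to attach to the pods.
	// The marker below is the form quoted in this PR's commit message.
	// +kubebuilder:default:={"ctlplane"}
	NetworkAttachments []string `json:"networkAttachments,omitempty"`
}

// applyDefaults is an illustrative helper (an assumption, not operator code):
// it fills in the default only when the field was left unset.
func applyDefaults(spec *MetricStorageSpec) {
	if spec.NetworkAttachments == nil {
		spec.NetworkAttachments = []string{"ctlplane"}
	}
}

func main() {
	// Field omitted: the default is applied.
	omitted := MetricStorageSpec{}
	applyDefaults(&omitted)
	fmt.Println(omitted.NetworkAttachments) // prints [ctlplane]

	// Field set explicitly: the user's value is kept.
	explicit := MetricStorageSpec{NetworkAttachments: []string{"internalapi"}}
	applyDefaults(&explicit)
	fmt.Println(explicit.NetworkAttachments) // prints [internalapi]
}
```

The network name "internalapi" is only an example value. The point of the sketch is that a user who sets nothing would end up with ctlplane, which is why a ctlplane NAD has to exist wherever the CR is created.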

openshift-ci bot requested review from jlarriba and olliewalsh on January 5, 2026 12:05
@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/f39d1351ff2349968c081ab8464f9879

✔️ openstack-k8s-operators-content-provider SUCCESS in 57m 55s
telemetry-operator-multinode-cloudkitty FAILURE in 39m 57s
✔️ telemetry-openstack-meta-content-provider-master SUCCESS in 1h 00m 48s
telemetry-operator-multinode-default-telemetry FAILURE in 37m 09s
functional-tests-osp18 FAILURE in 44m 19s

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/5d35876c89884b759063abdca1fee62a

✔️ openstack-k8s-operators-content-provider SUCCESS in 59m 25s
⚠️ telemetry-operator-multinode-cloudkitty SKIPPED Skipped due to failed job telemetry-openstack-meta-content-provider-master
telemetry-openstack-meta-content-provider-master FAILURE in 16m 13s
telemetry-operator-multinode-default-telemetry FAILURE in 38m 08s
⚠️ functional-tests-osp18 SKIPPED Skipped due to failed job telemetry-openstack-meta-content-provider-master

@SeanMooney

recheck

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/8ccff3636cb34d07b2913c8f3e082d69

✔️ openstack-k8s-operators-content-provider SUCCESS in 59m 53s
telemetry-operator-multinode-cloudkitty FAILURE in 44m 16s
✔️ telemetry-openstack-meta-content-provider-master SUCCESS in 1h 05m 50s
telemetry-operator-multinode-default-telemetry FAILURE in 39m 20s
functional-tests-osp18 FAILURE in 46m 24s

vkmc (Collaborator, Author) commented Jan 9, 2026

recheck

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/2a9e299272f34edc9e9a641abf210883

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 18m 09s
telemetry-operator-multinode-cloudkitty FAILURE in 42m 33s
✔️ telemetry-openstack-meta-content-provider-master SUCCESS in 1h 51m 17s
telemetry-operator-multinode-default-telemetry FAILURE in 39m 00s
functional-tests-osp18 FAILURE in 40m 56s

vkmc (Collaborator, Author) commented Jan 13, 2026

CI issues seem unrelated to the change

vkmc (Collaborator, Author) commented Jan 13, 2026

recheck

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/c6ee37a2bfbf41f9bd9e11f1eb695ad9

✔️ openstack-k8s-operators-content-provider SUCCESS in 56m 55s
telemetry-operator-multinode-cloudkitty FAILURE in 43m 07s
✔️ telemetry-openstack-meta-content-provider-master SUCCESS in 1h 05m 07s
telemetry-operator-multinode-default-telemetry FAILURE in 37m 01s
functional-tests-osp18 FAILURE in 45m 48s

openshift-ci bot (Contributor) commented Jan 13, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: vkmc
Once this PR has been reviewed and has the lgtm label, please ask for approval from elfiesmelfie. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

vkmc (Collaborator, Author) commented Jan 13, 2026

Thanks for the review Emma! I missed that

vkmc (Collaborator, Author) commented Jan 15, 2026

Kuttl tests are failing because the NAD (NetworkAttachmentDefinition) required for tests/default is not available.

When inspecting the logs, I noticed that creation of the user-supplied namespace is being skipped for all tests.

I think this is an issue that should be fixed in a follow-up patch.

For this PR, I added the NAD creation as a step in default/tests.

vkmc force-pushed the OSPRH-22189/use-dataplane-nw-default-nad branch from 66204d4 to 78d50f6 on January 15, 2026 17:44
@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/3d70a6fa7e824443bba7f40e0fa9e467

openstack-k8s-operators-content-provider FAILURE in 13m 16s
⚠️ telemetry-operator-multinode-cloudkitty SKIPPED Skipped due to failed job telemetry-openstack-meta-content-provider-master
telemetry-openstack-meta-content-provider-master FAILURE in 13m 00s
⚠️ telemetry-operator-multinode-default-telemetry SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ functional-tests-osp18 SKIPPED Skipped due to failed job telemetry-openstack-meta-content-provider-master

vkmc (Collaborator, Author) commented Jan 16, 2026

recheck

vkmc (Collaborator, Author) commented Jan 16, 2026

/retest

vkmc force-pushed the OSPRH-22189/use-dataplane-nw-default-nad branch from 78d50f6 to 90f720d on January 16, 2026 10:11
vkmc (Collaborator, Author) commented Jan 19, 2026

Testing the status of the Kuttl tests on main in #836 to better understand the output in this PR.

For some reason, after adding the NAD in the default test, Prometheus fails to start due to a storage issue.

Logs in https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/pr-logs/pull/openstack-k8s-operators_telemetry-operator/827/pull-ci-openstack-k8s-operators-telemetry-operator-main-telemetry-operator-build-deploy-kuttl/2013291199812603904/artifacts/telemetry-operator-build-deploy-kuttl/openstack-k8s-operators-gather/artifacts/must-gather/quay-io-openstack-k8s-operators-openstack-must-gather-sha256-2af7f286b6453522975b5de70b41aecb541c915047854f0e78afc578e250b844/namespaces/telemetry-kuttl-tests/events.log

LAST SEEN   TYPE      REASON                            OBJECT                                                                                                         MESSAGE
18m         Warning   FailedScheduling                  pod/prometheus-telemetry-kuttl-metricstorage-0                                                                 0/3 nodes are available: 1 node(s) didn't match pod anti-affinity rules, 2 node(s) didn't find available persistent volumes to bind. preemption: 0/3 nodes are available: 1 No preemption victims found for incoming pod, 2 Preemption is not helpful for scheduling.
15m         Normal    Scheduled                         pod/cloudkitty-db-sync-6h5c7                                                                                   Successfully assigned telemetry-kuttl-tests/cloudkitty-db-sync-6h5c7 to oko-11-hv7zr-master-2
14m         Normal    Scheduled                         pod/minio                                                                                                      Successfully assigned telemetry-kuttl-tests/minio to oko-11-hv7zr-master-1

If we do have a resource issue, this should be happening on main as well.

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/41dcda669a0c46b687ddcb124f3c7716

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 10m 50s
✔️ telemetry-operator-multinode-cloudkitty SUCCESS in 1h 27m 12s
✔️ telemetry-openstack-meta-content-provider-master SUCCESS in 2h 28m 45s
✔️ telemetry-operator-multinode-default-telemetry SUCCESS in 1h 27m 57s
functional-tests-osp18 FAILURE in 2h 01m 51s

- script: |
    oc apply -f deps/loki-operator.yaml
    until oc api-resources | grep -q grafana; do sleep 1; done
- script: oc apply -n telemetry-kuttl-tests -f https://raw.githubusercontent.com/openstack-k8s-operators/infra-operator/main/config/samples/network_v1beta1_netconfig.yaml

Review comment from a Collaborator:

It seems that you are applying the NetConfig configuration to the wrong namespace. I think that our tests are actually running in the telemetry-kuttl-default namespace.

Maybe changing the namespace here fixes the issue you are seeing.

vkmc (Collaborator, Author), Jan 27, 2026

Unfortunately, that didn't do the trick. We set the namespace for kuttl tests in https://github.com/openstack-k8s-operators/telemetry-operator/blob/main/kuttl-test.yaml#L19. The config for individual kuttl tests (https://github.com/openstack-k8s-operators/telemetry-operator/blob/main/test/kuttl/tests/default/kuttl-test.yaml#L5) is being skipped. For all tests, we have these lines:

=== CONT  kuttl/harness/default
    logger.go:42: 15:54:16 | default | Ignoring deps as it does not match file name regexp: ^(\d+)-(?:[^\.]+)(?:\.yaml)?$
    logger.go:42: 15:54:16 | default | Ignoring kuttl-test.yaml as it does not match file name regexp: ^(\d+)-(?:[^\.]+)(?:\.yaml)?$
    logger.go:42: 15:54:16 | default | Ignoring output as it does not match file name regexp: ^(\d+)-(?:[^\.]+)(?:\.yaml)?$
    logger.go:42: 15:54:16 | default | Skipping creation of user-supplied namespace: telemetry-kuttl-tests
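
To make the "Ignoring ..." log lines concrete, here is a small Go sketch that checks file names against the regular expression quoted in the log. It only demonstrates the pattern itself and makes no claim about how kuttl uses it internally. The names stepFileRe and isStepFile, and the sample names 01-deploy.yaml and 02-assert, are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// stepFileRe is the pattern copied from the log output in this thread.
var stepFileRe = regexp.MustCompile(`^(\d+)-(?:[^\.]+)(?:\.yaml)?$`)

// isStepFile reports whether a name matches the pattern, i.e. it starts
// with digits and a dash, followed by a dot-free name and an optional .yaml.
func isStepFile(name string) bool {
	return stepFileRe.MatchString(name)
}

func main() {
	names := []string{
		"deps",            // from the log: ignored
		"kuttl-test.yaml", // from the log: ignored
		"output",          // from the log: ignored
		"01-deploy.yaml",  // hypothetical numbered step file
		"02-assert",       // hypothetical numbered entry without extension
	}
	for _, name := range names {
		fmt.Printf("%-16s matches=%v\n", name, isStepFile(name))
	}
}
```

Under this pattern, deps, kuttl-test.yaml, and output do not match because they lack the leading numeric prefix, while the two hypothetical numbered names do. That is consistent with the per-test kuttl-test.yaml being reported as ignored inside the test directory.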

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/42503d098d974bc88be0b8ee96569d17

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 06m 20s
✔️ telemetry-operator-multinode-cloudkitty SUCCESS in 1h 32m 11s
✔️ telemetry-openstack-meta-content-provider-master SUCCESS in 2h 20m 23s
✔️ telemetry-operator-multinode-default-telemetry SUCCESS in 1h 29m 05s
functional-tests-osp18 FAILURE in 1h 44m 29s

vkmc force-pushed the OSPRH-22189/use-dataplane-nw-default-nad branch from 1910d23 to 52eb95a on January 27, 2026 17:42
@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/3a27506b34424d0f837764ee830aef99

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 47m 11s
✔️ telemetry-operator-multinode-cloudkitty SUCCESS in 1h 28m 11s
✔️ telemetry-openstack-meta-content-provider-master SUCCESS in 2h 09m 18s
✔️ telemetry-operator-multinode-default-telemetry SUCCESS in 1h 28m 31s
functional-tests-osp18 FAILURE in 1h 50m 27s

vkmc added 2 commits February 24, 2026 16:52
Updates the MetricStorage CRD to default NetworkAttachments
to ["ctlplane"]. This aligns the field with DataplaneNetwork and
removes the manual requirement for users to override this in the
OpenStackControlPlane CR.

Closes: OSPRH-22189

vkmc added 3 commits February 24, 2026 16:52

The marker used should be // +kubebuilder:default:={"ctlplane"}
to avoid the generator treating the expression as a string literal.

Create the NAD for 'ctlplane' before creating the Telemetry CR

This is the new default value for the metricStorage NAD, so the metricStorage
will fail to start without this CR available.

This dep is required to define the NAD used by the MetricStorage

vkmc force-pushed the OSPRH-22189/use-dataplane-nw-default-nad branch from 52eb95a to 3e6497e on February 24, 2026 15:52
openshift-ci bot (Contributor) commented Feb 24, 2026

@vkmc: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                                       Commit    Details   Required   Rerun command
ci/prow/telemetry-operator-build-deploy-kuttl   3e6497e   link      true       /test telemetry-operator-build-deploy-kuttl


@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/765431f4effc432fa8f960fabe7c370a

✔️ openstack-k8s-operators-content-provider SUCCESS in 33m 52s
✔️ telemetry-operator-multinode-cloudkitty SUCCESS in 1h 22m 46s
✔️ telemetry-openstack-meta-content-provider-master SUCCESS in 1h 44m 41s
telemetry-operator-multinode-default-telemetry RETRY_LIMIT in 11m 28s
functional-tests-osp18 RETRY_LIMIT in 11m 26s
