@OlivierCazade
Collaborator

Description

Integrate with LokiStack operator to use its status conditions instead of querying the Loki status endpoint. This adds
support for detecting Loki readiness through the operator's status API when available.

Changes:

  • Add github.com/grafana/loki/operator/apis/loki dependency
  • Add LokiStackStatus field to Loki config
  • Check operator status in getLokiStatus before querying status URL
  • Prevent status URL usage when using Loki operator

Dependencies


The associated operator PR

Checklist

If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.

  • Is this PR backed with a JIRA ticket? If so, make sure it is written as a title prefix (in general, PRs affecting the NetObserv/Network Observability product should be backed with a JIRA ticket - especially if they bring user facing changes).
  • Does this PR require product documentation?
    • If so, make sure the JIRA epic is labelled with "documentation" and provides a description relevant for doc writers, such as use cases or scenarios. Any required step to activate or configure the feature should be documented there, such as new CRD knobs.
  • Does this PR require a product release notes entry?
    • If so, fill in "Release Note Text" in the JIRA.
  • Is there anything else the QE team should know before testing? E.g: configuration changes, environment setup, etc.
    • If so, make sure it is described in the JIRA ticket.
  • QE requirements (check 1 from the list):
    • Standard QE validation, with pre-merge tests unless stated otherwise.
    • Regression tests only (e.g. refactoring with no user-facing change).
    • No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).

@OlivierCazade
Collaborator Author

/retest

@codecov

codecov bot commented Dec 2, 2025

Codecov Report

❌ Patch coverage is 0% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 52.91%. Comparing base (a2cf608) to head (d3abf2e).
⚠️ Report is 32 commits behind head on main.

Files with missing lines    Patch %    Lines
pkg/handler/loki.go         0.00%      9 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1122      +/-   ##
==========================================
- Coverage   53.60%   52.91%   -0.70%     
==========================================
  Files         205      209       +4     
  Lines       10500    10960     +460     
  Branches     1296     1391      +95     
==========================================
+ Hits         5629     5799     +170     
- Misses       4357     4611     +254     
- Partials      514      550      +36     
Flag         Coverage Δ
uitests      54.95% <ø> (-0.94%) ⬇️
unittests    47.10% <0.00%> (-0.31%) ⬇️

Files with missing lines    Coverage Δ
pkg/config/loki.go          88.88% <ø> (ø)
pkg/handler/loki.go         38.09% <0.00%> (-1.91%) ⬇️

... and 51 files with indirect coverage changes


Comment on lines 202 to 212
if h.Cfg.Loki.Status != nil {
    for _, conditions := range h.Cfg.Loki.Status.Conditions {
        if conditions.Reason == "ReadyComponents" {
            if conditions.Status == "True" {
                return []byte("ready"), 200, nil
            }
            break
        }
    }
    return []byte("pending"), 400, nil
}
Member

On second thought, I think this should be done on the operator side rather than here. One reason is that by embedding the whole Loki status in the console config, we make the config dependent on every status change, even ones we don't care about (and, if I'm correct, each change triggers a plugin restart).

Having a first-pass done on the operator, to extract what we want, would avoid that.

Member

@memodi memodi left a comment

Hi @OlivierCazade - I used Claude to review this PR together with netobserv/network-observability-operator#2142.

Besides the other comments I added from that review, one architectural point it highlighted was the impact on Pod restarts, which appears reasonable. I'd like to get your thoughts:

  Problem: The operator embeds the entire LokiStackStatus into the configmap. Any status change triggers:
  1. ConfigMap update with new digest
  2. Plugin deployment rollout (all pods restart)

  LokiStack status can change frequently during:
  - Reconciliation loops
  - Component health fluctuations
  - Scaling events
  - Transient failures

  Impact: Users may experience frequent plugin restarts and UI disconnections.

  Recommendation:
  - Monitor this behavior in testing
  - Consider only embedding specific status fields (Ready condition only)
  - Or implement status change filtering to only reconcile on meaningful changes

  Better Alternative:
  // In consoleplugin_objects.go
  type LokiStackCondition struct {
      Type   string `json:"type"`
      Status string `json:"status"`
  }

  type SimplifiedLokiStatus struct {
      Conditions []LokiStackCondition `json:"conditions"`
  }

  // Only copy the Ready condition
  if lokiStack != nil {
      for _, cond := range lokiStack.Status.Conditions {
          if cond.Type == lokiv1.ConditionReady {
              config.Loki.Status = &SimplifiedLokiStatus{
                  Conditions: []LokiStackCondition{{
                      Type:   string(cond.Type),
                      Status: string(cond.Status),
                  }},
              }
              break
          }
      }
      config.Loki.StatusURL = ""
  }
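To illustrate why trimming the status matters, here is a minimal sketch rather than the operator's actual code. It assumes the configmap digest is derived from the serialized config, and every name in it (`Condition`, `ReadyOnly`, `reduceToReady`, `digest`) is hypothetical. The literal `"Ready"` is assumed to match `lokiv1.ConditionReady`. It shows that a digest computed over the full condition list changes whenever a volatile field such as the transition time changes, while a digest over the reduced Ready condition stays stable.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
)

// Condition is a hypothetical stand-in for a LokiStack status condition.
type Condition struct {
	Type               string `json:"type"`
	Status             string `json:"status"`
	Reason             string `json:"reason,omitempty"`
	Message            string `json:"message,omitempty"`
	LastTransitionTime string `json:"lastTransitionTime,omitempty"`
}

// ReadyOnly keeps just the two fields the plugin needs from the Ready condition.
type ReadyOnly struct {
	Type   string `json:"type"`
	Status string `json:"status"`
}

// reduceToReady returns the Ready condition stripped of volatile fields,
// or nil when no Ready condition is present.
// Assumption: "Ready" is the string value of lokiv1.ConditionReady.
func reduceToReady(conds []Condition) *ReadyOnly {
	for _, c := range conds {
		if c.Type == "Ready" {
			return &ReadyOnly{Type: c.Type, Status: c.Status}
		}
	}
	return nil
}

// digest hashes the JSON encoding of v. It stands in for whatever digest
// the operator computes over the configmap content.
func digest(v interface{}) (string, error) {
	raw, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(raw)
	return hex.EncodeToString(sum[:]), nil
}
```

With two condition lists that differ only in `LastTransitionTime`, or in conditions other than Ready, `digest` over the full lists gives different values, while `digest(reduceToReady(...))` gives the same value for both, so a rollout keyed on that digest would not be triggered.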

It also pointed out missing unit test coverage:

  Missing in Console Plugin:
  - Test for status condition checking logic
  - Test for different LokiStack status states
  - Test for error responses when status URL is blocked
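As a rough idea of what the condition-checking coverage could look like, the sketch below lays out a table of LokiStack states and the readiness expected for each. It is not the plugin's code: `Condition`, `isLokiReady`, `readinessCases` and `failingCases` are hypothetical names, and the literals `"Ready"` and `"True"` are assumed to match the LokiStack API values. In the real repository the same table would more naturally live in a `_test.go` file and be iterated with `t.Run`.

```go
package main

// Condition is a hypothetical stand-in for a LokiStack status condition.
type Condition struct {
	Type   string
	Status string
	Reason string
}

// isLokiReady reports whether the Ready condition is present and true.
// It matches on Type, not Reason.
func isLokiReady(conds []Condition) bool {
	for _, c := range conds {
		if c.Type == "Ready" {
			return c.Status == "True"
		}
	}
	return false
}

// readinessCase describes one LokiStack state and the expected outcome.
type readinessCase struct {
	name  string
	conds []Condition
	want  bool
}

// readinessCases lists the states a unit test could iterate over.
func readinessCases() []readinessCase {
	return []readinessCase{
		{"no conditions", nil, false},
		{"ready is true", []Condition{{Type: "Ready", Status: "True", Reason: "ReadyComponents"}}, true},
		{"ready is false", []Condition{{Type: "Ready", Status: "False"}}, false},
		{"only pending", []Condition{{Type: "Pending", Status: "True", Reason: "PendingComponents"}}, false},
		{"ready reason on another type", []Condition{{Type: "Pending", Status: "True", Reason: "ReadyComponents"}}, false},
		{"ready listed after others", []Condition{{Type: "Pending", Status: "False"}, {Type: "Ready", Status: "True"}}, true},
	}
}

// failingCases returns the names of cases where isLokiReady disagrees
// with the expected value; an empty result means every case passes.
func failingCases() []string {
	var failed []string
	for _, tc := range readinessCases() {
		if isLokiReady(tc.conds) != tc.want {
			failed = append(failed, tc.name)
		}
	}
	return failed
}
```

The "ready reason on another type" case is the one that separates matching on `Type` from matching on `Reason`.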

Comment on lines 202 to 212
if h.Cfg.Loki.Status != nil {
    for _, conditions := range h.Cfg.Loki.Status.Conditions {
        if conditions.Reason == "ReadyComponents" {
            if conditions.Status == "True" {
                return []byte("ready"), 200, nil
            }
            break
        }
    }
    return []byte("pending"), 400, nil
}
Member

From the Claude review:

  Problems:
  1. Uses Reason instead of Type for condition matching
  2. Only checks one specific condition reason
  3. Doesn't check condition type properly according to LokiStack API
  4. Returns 400 (Bad Request) for "pending" - should be 503 (Service Unavailable)

  Fix Required: Based on LokiStack API, should check Type field:
  if h.Cfg.Loki.Status != nil {
      for _, condition := range h.Cfg.Loki.Status.Conditions {
          if condition.Type == lokiv1.ConditionReady {
              if condition.Status == "True" {
                  return []byte("ready"), 200, nil
              }
              break
          }
      }
      return []byte("pending"), 503, nil  // Service Unavailable, not Bad Request
  }

@openshift-ci

openshift-ci bot commented Dec 24, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from memodi. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

OlivierCazade and others added 2 commits December 24, 2025 17:31
Co-authored-by: Mehul Modi <memodi@redhat.com>