This project holds the configuration files for our internal Red Hat Observability Service based on Observatorium.
See our website for more information about RHOBS.
- Go
- Mage (go install github.com/magefile/mage@latest)
- findutils (for GNU xargs)
- gnu-sed
Both can be installed using Homebrew: brew install gnu-sed findutils. Afterwards, update the SED and XARGS variables in the Makefile to use gsed and gxargs, or override them in your environment.
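For example, on macOS the setup could look like the following; overriding SED and XARGS on the command line is an assumption about how the Makefile reads these variables, so editing the Makefile directly remains the documented route:

# Install the GNU variants of sed and xargs via Homebrew.
brew install gnu-sed findutils
# Either edit SED and XARGS in the Makefile, or (assuming plain variable
# assignments there) override them for a single invocation:
make manifests SED=gsed XARGS=gxargs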
When a critical issue is found in production, we sometimes need to hotfix the deployed configuration without going through the full development and deployment cycle. The steps below outline the process. We use production as an example, but the same steps apply to stage or any other environment. The same process can also be used for rolling out a new version of the operator if needed.
Currently, we build directly from upstream. This works for now because we are maintainers of the project, but in the future we might need to fork it.
- Create and merge a PR in the upstream repository with the fix.
- Go to our Konflux fork where we build from a submodule.
- Run this workflow targeting main.
- Merge the generated PR.
- Visit quay.io to ensure the new image is built and available.
- Run mage sync:operator thanos latest
- Run mage build:environment production to generate the manifests for the production environment.
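Putting the last two steps together, a typical local run might look like this (a sketch; it assumes the regenerated production manifests end up under resources/services and are submitted through a normal PR):

# Sync the thanos operator CRDs and image mapping to the latest Konflux-built commit.
mage sync:operator thanos latest
# Regenerate the manifests for the production environment.
mage build:environment production
# Review what changed before committing and opening a PR.
git status resources/services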
This repository leans heavily on Mage to build various components of RHOBS. You can find the available Mage targets by running:
mage -l

Because we ship operators and their Custom Resource Definitions (CRDs) as part of our RHOBS service, we need to keep them in sync with the versions deployed in our clusters. This is further complicated by the requirement to ship images built on Konflux, so we also need to maintain a mapping between upstream operator versions and our Konflux-built images.
To facilitate this, we provide a Mage target mage sync:operator that automates the synchronization process.
This allows us to keep the image versions in sync with the CRDs they support.
The target requires two parameters:
- operator: The name of the operator to synchronize; should be one of (thanos).
- The commit hash for the fork we want to sync to, or "latest" to sync to the latest commit on the supported branch.
For thanos, this is the commit hash on https://github.com/rhobs/rhobs-konflux-thanos-operator
An example is shown below:
mage sync:operator thanos latest

This will update some internal configuration and sync the dependency in Go modules.
You can now proceed to build for a specific environment using mage build:environment <env>.
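For example, to pin the operator to a specific commit instead of latest and then build for the stage environment (the commit hash and the environment name stage below are illustrative placeholders, not values taken from this repository):

# Sync the thanos operator to a specific commit (1a2b3c4d is a placeholder hash).
mage sync:operator thanos 1a2b3c4d
# Generate the manifests for the chosen environment.
mage build:environment stage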
This repository contains Jsonnet configuration that allows generating Kubernetes objects that compose RHOBS service and its observability.
The jsonnet files for the RHOBS service can be found in the services directory. In order to compose the RHOBS service we import many Jsonnet libraries from different open source repositories, including kube-thanos for Thanos components; Observatorium for the Observatorium, Minio, Memcached, Gubernator, and Dex components; thanos-receive-controller for the Thanos receive controller component; parca for the Parca component; observatorium api for the API component; observatorium up for the up component; and rules-objstore for the rules-objstore component.
Currently, RHOBS components are rendered as OpenShift Templates that allow parameters. This is how we deploy to multiple clusters: they share the same configuration core while varying details like resources or names.
This is why there might be a gap between vanilla Observatorium and RHOBS. We plan to resolve this gap in the future.
Running make manifests generates all required files into the resources/services directory.
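A typical local loop might look like the following; diffing before committing is an assumed convention rather than something the Makefile enforces:

# Regenerate all service templates.
make manifests
# Inspect the regenerated files before committing.
git diff resources/services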
Some services use unified, environment-agnostic templates that can be deployed across all environments using template parameters. For example, the synthetics-api service provides a single template that works for all environments:
# Generate unified synthetics-api template
mage unified:syntheticsApi
# Generate all unified templates
mage unified:all
# List available unified templates
mage unified:list
# Deploy to different environments using parameters
oc process -f resources/services/synthetics-api-template.yaml \
-p NAMESPACE=rhobs-stage \
-p IMAGE_TAG=latest | oc apply -f -
oc process -f resources/services/synthetics-api-template.yaml \
-p NAMESPACE=rhobs-production \
-p IMAGE_TAG=v1.0.0 | oc apply -f -

This approach reduces template duplication and ensures consistency across environments while maintaining deployment flexibility.
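To check which parameters a unified template exposes before deploying, oc process can print them instead of rendering the template:

# List the parameters accepted by the unified synthetics-api template.
oc process --parameters -f resources/services/synthetics-api-template.yaml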
Similarly, in order to have observability (alerts, recording rules, dashboards) for our service, we import mixins from various projects and compose them all together in the observability directory.
Running make prometheusrules grafana generates all required files into the resources/observability directory.
An up-to-date list of jsonnet dependencies can be found in jsonnetfile.json. Fetching all dependencies is done through the make vendor_jsonnet utility.
To update a dependency, normally the process would be:
make vendor_jsonnet # This installs dependencies like `jb` thanks to Bingo project.
JB=`ls $(go env GOPATH)/bin/jb-* -t | head -1`
# Updates `kube-thanos` to master and sets the new hash in `jsonnetfile.lock.json`.
$JB update https://github.com/thanos-io/kube-thanos/jsonnet/kube-thanos@main
# Updates all dependencies to master and sets the new hashes in `jsonnetfile.lock.json`.
$JB update

Our deployments are managed by the Red Hat AppSRE team.
Staging: Once the PR containing the dashboard changes is merged to main, it goes directly to the stage environment, because the telemeter-dashboards resourceTemplate refers to the main branch here.
Production: Update the commit hash ref in the saas file in the telemeterDashboards resourceTemplate for the production environment.
Use synchronize.sh to create an MR against app-interface to update dashboards.
Staging: update the commit hash ref in https://gitlab.cee.redhat.com/service/app-interface/blob/master/data/services/telemeter/cicd/saas.yaml
Production: update the commit hash ref in https://gitlab.cee.redhat.com/service/app-interface/blob/master/data/services/telemeter/cicd/saas.yaml
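In both cases the ref is a commit hash from this repository. One way to grab the hash of the latest commit on main, assuming that is the commit you want to promote, is:

# Print the commit hash of the current main branch to paste into saas.yaml.
git fetch origin
git rev-parse origin/main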
Job runs are posted in:
- #sd-app-sre-info for Grafana dashboards
- #team-monitoring-info for everything else