fix: Set __is_system_running in check mode #172

martinpitt · 2025-06-12T05:40:26Z

Cause: systemctl is-system-running was not called in --check mode.

Consequence: The "Determine if system is booted with systemd" step failed in check mode as the __is_system_running variable was not populated.

Fix: Force calling systemctl is-system-running in check mode. It does not modify the system and the outcome is very influential for what the role does.

See failure in #109 log and #170 (comment) ff. I cannot reproduce this locally, as

tox -e qemu-ansible-2.9 -- --image-name centos-8 --log-level=debug tests/tests_default_wrapper.yml

fails in a completely different way. So throwing this at CI.

Summary by Sourcery

Force execution of systemctl is-system-running in check mode and update SSSD test playbooks to align assertions and structure with this behavior.

Bug Fixes:

Always run systemctl is-system-running with check_mode disabled to populate the __is_system_running variable.

Tests:

Restructure SSSD tests to group role invocations and checks within blocks and rename tasks for clarity.
Refactor run_sssd_tests.yml assertions to use in operators with helper string variables for list-based parameters.
Update check_sssd_with_tlog.yml assertion to explicitly check for regex match length.

sourcery-ai · 2025-06-12T05:40:30Z

Reviewer's Guide

This PR ensures that the systemctl is-system-running command always runs (even in check mode) so that __is_system_running is populated, and refactors the SSSD test suite playbooks—grouping role invocations in blocks, standardizing task names, and strengthening assertions in test tasks.

Flow diagram for task execution in check mode

graph TD
    subgraph Task Execution in set_vars.yml
        A[Start] --> B{Ansible --check mode?};
        B -- No --> E[Run 'systemctl is-system-running'];
        B -- Yes --> C{Task has 'check_mode: false'?};
        C -- No (Before PR) --> D[Skip command];
        C -- Yes (After PR) --> E;
        D --> F[Variable '__is_system_running' is not set];
        E --> G[Set '__is_system_running' from command output];
        F --> H[Later tasks fail];
        G --> I[Later tasks succeed];
    end

File-Level Changes

Change	Details	Files
Enable calling systemctl in check mode to populate __is_system_running	Set `check_mode: false` on the `systemctl is-system-running` task Keep `changed_when: false` and `failed_when: false` unchanged	`tasks/set_vars.yml`
Refactor SSSD tests to group role runs and checks under structured blocks	Rename `Run role with…` tasks to `Test role with…` for consistency Wrap each role invocation and its checks in a `block` Move `import_role` and subsequent `import_tasks` under the new blocks	`tests/tests_sssd.yml`
Convert `find()` assertions to `in` checks and introduce join helpers	Replace `.stdout.find('…')` calls with `('…' in stdout)` expressions Define helper vars (`*_str`) using `	join(', ')` for users/groups/excludes
Strengthen NSS switch regex assertion with a length check	Append `	length > 0`to the`regex_search` assertion in the NSSwitch test

Tips and commands

Interacting with Sourcery

Trigger a new review: Comment @sourcery-ai review on the pull request.
Continue discussions: Reply directly to Sourcery's review comments.
Generate a GitHub issue from a review comment: Ask Sourcery to create an
issue from a review comment by replying to it. You can also reply to a
review comment with @sourcery-ai issue to create an issue from it.
Generate a pull request title: Write @sourcery-ai anywhere in the pull
request title to generate a title at any time. You can also comment
@sourcery-ai title on the pull request to (re-)generate the title at any time.
Generate a pull request summary: Write @sourcery-ai summary anywhere in
the pull request body to generate a PR summary at any time exactly where you
want it. You can also comment @sourcery-ai summary on the pull request to
(re-)generate the summary at any time.
Generate reviewer's guide: Comment @sourcery-ai guide on the pull
request to (re-)generate the reviewer's guide at any time.
Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
pull request to resolve all Sourcery comments. Useful if you've already
addressed all the comments and don't want to see them anymore.
Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
request to dismiss all existing Sourcery reviews. Especially useful if you
want to start fresh with a new review - don't forget to comment
@sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

Enable or disable review features such as the Sourcery-generated pull request
summary, the reviewer's guide, and others.
Change the review language.
Add, remove or edit custom review instructions.
Adjust other review settings.

Getting Help

Contact our support team for questions or feedback.
Visit our documentation for detailed guides and information.
Keep in touch with the Sourcery team by following us on X/Twitter, LinkedIn or GitHub.

martinpitt · 2025-06-12T05:49:42Z

The f42 failure is unrelated, it was introduced by #171 (landed red). However, that feels easy to fix, I'll have a quick look.

Running citest in the meantime to check centos 8.

martinpitt · 2025-06-12T05:49:46Z

[citest]

martinpitt · 2025-06-12T10:29:28Z

OK, so centos 8 is happy, and I fixed the types of the assertions. This was actually hiding a bug! It now fails like this:

TASK [Get file contents of sssd-session-recording.conf] ********************************************************************************
task path: /var/home/martin/upstream/lsr/tlog/tests/run_sssd_tests.yml:14
Thursday 12 June 2025  12:25:39 +0200 (0:00:00.020)       0:00:25.926 ********* 
ok: [/var/home/martin/.cache/linux-system-roles/fedora-42.qcow2] => {
    "changed": false,
    "cmd": [
        "cat",
        "/etc/sssd/conf.d/sssd-session-recording.conf"
    ],
    "delta": "0:00:00.002478",
    "end": "2025-06-12 10:25:39.917522",
    "rc": 0,
    "start": "2025-06-12 10:25:39.915044"
}

STDOUT:

#
# Ansible managed
#
# system_role:tlog

[session_recording]
scope=none
users=
groups=
exclude_users=
exclude_groups=

fatal: [/var/home/martin/.cache/linux-system-roles/fedora-42.qcow2]: FAILED! => {
    "assertion": "__tlog_conf_content.stdout.find('scope=all') >= 0",
    "changed": false,
    "evaluated_to": false
}

So the check expectes "scope=all", but reality is "scope=none". So find() returned -1, which the boolean coercion interprets as true.

tests/run_sssd_tests.yml

martinpitt · 2025-06-12T14:43:10Z

Now it should be back to failing on the scope=all/none mismatch. I didn't have time to really look into the code today, just having 3 min looks in between the sessions. @richm if you want to work on this, please feel free -- otherwise I'll have a look on the train ride back on Sunday.

richm · 2025-06-12T14:49:44Z

Now it should be back to failing on the scope=all/none mismatch. I didn't have time to really look into the code today, just having 3 min looks in between the sessions. @richm if you want to work on this, please feel free -- otherwise I'll have a look on the train ride back on Sunday.

@justin-stephenson any ideas?

justin-stephenson · 2025-06-12T15:21:58Z

Now it should be back to failing on the scope=all/none mismatch. I didn't have time to really look into the code today, just having 3 min looks in between the sessions. @richm if you want to work on this, please feel free -- otherwise I'll have a look on the train ride back on Sunday.

@justin-stephenson any ideas?

Sorry I don't quite follow, what is the issue?

By the way, below we should change the name of __tlog_conf_content to __tlog_sssd_conf_content or similar because this is about the SSSD configuration file for recorded users. Not to be confused with tlog's own configuration file in /etc/tlog/

- name: Get file contents of sssd-session-recording.conf
  command: cat {{ __tlog_sssd_session_recording_conf }}
  register: __tlog_conf_content
  changed_when: false

I can do it in a separate PR after this one is merged.

martinpitt · 2025-06-12T15:39:59Z

@richm oh, I know what's wrong -- tests/tests_sssd.yml needs to define tlog_scope_sssd and other vars for both the role invocation and importing run_sssd_tests.yml. That's better done in block:s, so requires some code shifting. I can't do it now, but tomorrow, unless you beat me to it.

richm · 2025-06-12T15:42:20Z

Now it should be back to failing on the scope=all/none mismatch. I didn't have time to really look into the code today, just having 3 min looks in between the sessions. @richm if you want to work on this, please feel free -- otherwise I'll have a look on the train ride back on Sunday.

@justin-stephenson any ideas?

Sorry I don't quite follow, what is the issue?

#172 (comment)

'So the check expectes "scope=all", but reality is "scope=none".'

but it looks like @martinpitt identified the problem: #172 (comment)

'oh, I know what's wrong -- tests/tests_sssd.yml needs to define tlog_scope_sssd and other vars for both the role invocation and importing run_sssd_tests.yml'

and he is going to fix it.

By the way, below we should change the name of __tlog_conf_content to __tlog_sssd_conf_content or similar because this is about the SSSD configuration file for recorded users. Not to be confused with tlog's own configuration file in /etc/tlog/
- name: Get file contents of sssd-session-recording.conf
  command: cat {{ __tlog_sssd_session_recording_conf }}
  register: __tlog_conf_content
  changed_when: false
I can do it in a separate PR after this one is merged.

ok - thanks

martinpitt · 2025-06-12T20:11:18Z

With the >- trick I was still getting warnings:

[WARNING]: conditional statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: 'scope={{
tlog_scope_sssd }}' in __tlog_conf_content.stdout

So I rewrote that part again.

I also fixed the sssd config test failure from above. However, with 2.19 there is now yet another failure:

TASK [Check for presence of ansible managed header, fingerprint] ***********************************************************************
task path: /var/home/martin/upstream/lsr/tlog/tests/tasks/check_header.yml:9
Thursday 12 June 2025  22:06:30 +0200 (0:00:00.466)       0:00:54.828 ********* 
[ERROR]: Task failed: Error rendering expression: 'in <string>' requires string as left operand, not CapturedExceptionMarker

Task failed.
Origin: /var/home/martin/upstream/lsr/tlog/tests/tasks/check_header.yml:9:3

7   when: not __file_content is defined
8
9 - name: Check for presence of ansible managed header, fingerprint
    ^ column 3

<<< caused by >>>

Error rendering expression: 'in <string>' requires string as left operand, not CapturedExceptionMarker
Origin: /var/home/martin/upstream/lsr/tlog/tests/tasks/check_header.yml:12:9

10   assert:
11     that:
12       - ansible_managed in content
           ^ column 9

fatal: [/var/home/martin/.cache/linux-system-roles/fedora-42.qcow2]: FAILED! => {
    "changed": false
}

MSG:

Task failed: Error rendering expression: 'in <string>' requires string as left operand, not CapturedExceptionMarker

This is really hard to understand, and feels like an Ansible bug (or it's just getting late and we had beer 😁 ). But at least the other OSes should go green.

martinpitt · 2025-06-12T20:12:03Z

[citest]

martinpitt · 2025-06-12T20:24:15Z

[citest]

martinpitt · 2025-06-12T20:33:10Z

@richm Still red due to the new (later) 2.19 failure. But this improves things, and I think this could go in already. Bedtime now.

sourcery-ai

Hey @martinpitt - I've reviewed your changes and they look great!

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

martinpitt · 2025-06-12T20:36:38Z

Wrt. the final ansible 2.19 failure: when I do this:

- name: debug
  debug:
    var: ansible_managed
  vars:
    ansible_managed: "{{ lookup('template', 'get_ansible_managed.j2') }}"

it fails with

[ERROR]: Task failed: Error while resolving `var` expression: The lookup plugin 'template' failed: The lookup plugin 'template' failed: The lookup plugin 'template' failed: The lookup plugin 'template' failed: The lookup plugin 'template' failed: The lookup plugin 'template' failed: The lookup plugin 'template' failed: The lookup plugin 'template' failed: The lookup plugin 'template' failed: Maximum depth exceeded, omitting further events.

So something changed or broke with the template reading.

richm · 2025-06-12T22:48:46Z

Wrt. the final ansible 2.19 failure: when I do this:

- name: debug
  debug:
    var: ansible_managed
  vars:
    ansible_managed: "{{ lookup('template', 'get_ansible_managed.j2') }}"

it fails with

[ERROR]: Task failed: Error while resolving `var` expression: The lookup plugin 'template' failed: The lookup plugin 'template' failed: The lookup plugin 'template' failed: The lookup plugin 'template' failed: The lookup plugin 'template' failed: The lookup plugin 'template' failed: The lookup plugin 'template' failed: The lookup plugin 'template' failed: The lookup plugin 'template' failed: Maximum depth exceeded, omitting further events.

ansible_managed is a very special variable - I would say it's not really a variable

you cannot assign to it - so ansible_managed: ... is wrong - try

__ansible_managed: "{{ lookup('template', 'get_ansible_managed.j2') }}"

it only has a value in the context of a template - either the template module or lookup plugin - so you cannot evaluate it like an ordinary variable


So something changed or broke with the template reading.

richm · 2025-06-12T22:49:42Z

tests/tasks/check_header.yml will need to do something like this:

- name: Check for presence of ansible managed header, fingerprint
  assert:
    that:
      - __ansible_managed in content
      - __fingerprint in content
  vars:
    content: "{{ (__file_content | d(__content)).content | b64decode }}"
    __ansible_managed: "{{ lookup('template', 'get_ansible_managed.j2') }}"

richm · 2025-06-12T22:50:44Z

looks like ansible 2.19 was changed to make ansible_managed access much stricter - other roles have a similar check, but I don't know if they use a different intermediate variable

Cause: The test used a temporary variable `ansible_managed`, but that is a "magic" string constant. Ansible 2.19 does not permit assigning to it any more. Consequence: Tests failed with Ansible 2.19. Fix: Rename the variable.

Cause: `systemctl is-system-running` was not called in `--check` mode. Consequence: The "Determine if system is booted with systemd" step failed in check mode as the `__is_system_running` variable was not populated. Fix: Force calling `systemctl is-system-running` in check mode. It does not modify the system and the outcome is very influential for what the role does.

Cause: Ansible 2.19 requires explicit boolean types for assertions. Consequence: The test failed with Ansible 2.19 with "Conditional result was '62' of type 'int', which evaluates to True. Conditionals must have a boolean result." Fix: find() does not actually have a result which is compatible with boolean coercion, as 0 means "string found at the start", and "not found" results in -1 (which is `True`). We could compare the find() result explicitly, but let's instead use the `in` operator to make this more readable. Also rewrite these assertions to use Python string concatenation. This fixes > [WARNING]: conditional statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: 'scope={{ tlog_scope_sssd }}' in __tlog_conf_content.stdout Likewise, `regex_search()` returns a string (possibly empty for non-matches) or `none`, so explicitly check the length to convert to a bool.

Group the test cases in blocks which use the same definition of `tlog_scope_sssd` and other variables for both the role invocation and check_sssd_with_tlog.yml. Then the latter can actually check for the correct value. This was previously masked by the wrong `find()` result evaluation.

martinpitt · 2025-06-13T08:24:46Z

Thanks @richm that saved me some research this morning. This works fine!

martinpitt · 2025-06-13T08:24:51Z

[citest]

martinpitt · 2025-06-13T08:40:32Z

Whew, all green 🍀 I suppose these issues apply to more packages, so at least now we have a nice reference fix.

richm · 2025-06-13T13:42:21Z

tests/tasks/check_header.yml

  vars:
-    content: "{{ (__file_content | d(__content)).content | b64decode }}"
-    ansible_managed: "{{ lookup('template', 'get_ansible_managed.j2') }}"
+    content: "{{ __content.content | b64decode }}"


yeah - I copied this implementation from another role where I already have the file content, so I could skip the slurp - but here it isn't used

martinpitt mentioned this pull request Jun 12, 2025

tests: bootc end-to-end validation; fix ownership of sssd-session-recording.conf #170

Merged

spetrosi reviewed Jun 12, 2025

View reviewed changes