WIP: Massive pacemaker-based refactors #4011

nrwahl2 · 2025-12-24T05:59:45Z

No description provided.

It was added in 2007 by commit f3ec7ec. No motivation was given, the file doesn't get installed, and it appears that nothing has ever used it. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Instead of declaring it in based_io.c and re-declaring it as extern in pacemaker-based.c. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

No code changes aside from moving two function definitions to a position above based_io_init(). Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

No other code changes, except moving a variable declaration. The p argument is always NULL, though that was not the case when this function was created. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Also, drop the return statement after calling _exit(), and rename exit_rc to rc. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

And drop comments that seem redundant. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

But not much. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

And add Doxygen. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

It's accessible via the mainloop_child_t object, now that we're exposing its definition internally. We could use the existing mainloop_child_pid() function. However, nothing is using that yet, so I would rather deprecate it in the future. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Also use bool instead of gboolean. The --disk-writes command line option seems unnecessary, but we can deal with that later. But see T227. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

This function handles the case "try to get the CIB from the requested file first." Then we'll try the directory, and finally we'll fall back to creating an empty CIB. I want to make readCibXmlFile() shorter and reduce the number of times we're allocating and freeing variables there, which is confusing. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

The sole call sets it to "cib.xml". Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

It's always set to !preserve_status, and preserve_status is always set to the same value as stand_alone. So we can use !stand_alone in place of discard_status. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

The sole call sets it to "cib.xml". Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

They're always set to cib_root. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

cib_archive_sort() sorts directory entries in the reverse order from the one documented. The code still worked as intended, because the while loop iterated in reverse order. This commit sorts directory entries from newest to oldest and then iterates starting at index 0. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

And add Doxygen. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

...and in read_current_cib(). Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

And rename it to old_versions. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Both callers currently always pass a non-NULL diff argument. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

This is a static function with one caller (xml_create_patchset()), which already checks whether the doc has the dirty flag set. This lets us clearly make two guarantees: * xml_create_patchset_v2() returns non-NULL. * xml_create_patchset() doesn't change *config_changed if its return value is NULL. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

If *diff is NULL, then nothing changed. There's nothing to log, *config_changed will be false, and there's no reason to validate against the schema. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

I don't see any reason to log these. We already log the patchset at info level using pcmk__log_xml_patchset(). Unless there's something wrong with patchset creation or logging, the (pretty) patchset logs make the change logs and raw patchset logs redundant. They contain the same info. We're not going to have tracing enabled unless we're trying to troubleshoot something. Those particular trace logs aren't going to be helpful unless we're specifically trying to troubleshoot patchset creation or logging. If that's the case, then we can add logging lines and recompile as needed. Whatever the issue is with patchset creation/logging, it ought to be reproducible for developers. Also drop patchset logging in the callers. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

If we reached the dropped "if" block, then op is not PCMK__CIB_REQUEST_REPLACE. That was tested by the first "if" block. But as of Pacemaker 2.0.0, PCMK__XA_CIB_UPDATE can be true only if the op is PCMK__CIB_REQUEST_REPLACE (it's set by sync_our_cib()), unless there's an even older node in the cluster that would cause the CIB manager to run in legacy mode. (See broadcast argument of send_peer_reply() in 2.0.0.) However, rolling upgrades from versions earlier than 2.0.0 are no longer supported, so that's not a concern. Similarly, we can drop PCMK__CIB_REQUEST_APPLY_PATCH from the CRM_LOG_ASSERT() call. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

All of this is legacy code based on the legacy mode (pre-Pacemaker-1.1.12) sync process. We maintained support for that until Pacemaker 3.0.0, enabling it if any node in the cluster was running a Pacemaker earlier than 1.1.12, because we supported rolling upgrades from 1.1.11 to 2.y.z. We removed that support in 3.0.0, however. You can see a call to send_peer_reply() with broadcast=TRUE guarded by "if (cib_legacy_mode() ...)" from 1.1.13 through the latest 2.y.z. That's the PCMK__CIB_REQUEST_APPLY_PATCH/CIB_OP_APPLY_DIFF request that this dropped code was dealing with. Back then, a CIB manager itself would send an apply-patch request. pcmk_rc_diff_resync means that the source version (not the target version) in the patchset is newer than our current CIB version. So the idea was apparently that we received a sync request from a node with a newer CIB, and we didn't want to apply it before first getting the newer CIB ourselves. However, outside of legacy mode (which we dropped in 3.0.0), cibadmin is the only thing that sends an apply-patch request. cib__process_apply_diff() simply calls xml_apply_patchset() with check_version=true and returns the result. That's all we need in the CIB manager, now that we know that the request is coming from a client rather than from a peer. Failure to process a user-submitted patchset doesn't affect our ability to keep correct cluster state. We don't need to request a resync (via send_sync_request()); we can just fail, and the user can submit a new patchset with compatible versions. Since send_sync_request() is the only thing that sets sync_in_progress from 0 to a nonzero value, all of the other supporting code is dead and can be removed if we're dropping send_sync_request(). Note: this enables more changes, which will follow in upcoming commits. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

All it did immediately prior to this commit was to call cib__process_replace(). Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Nothing makes a "sync to one" request anymore since we dropped send_sync_request(). Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

We removed *_sync_to_one in a previous commit, so there is no longer any ambiguity. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

The note that "internal value that clients do not and should not care " about" is false or at best outdated. When a cluster had at least one node running Pacemaker 1.1.11 or earlier (see other recent commit messages), certain changes were "broadcast" using an apply-patch request. This request may fail with -pcmk_err_diff_resync if it came from a node with a newer CIB than ours. This error means the source version of the patchset is newer than our CIB version. Now, however, an apply-patch request can only come from cibadmin. If applying a patch fails because the patchset source version is too new, then a client should report that as an error. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

This shouldn't affect cibadmin even though it can call xml_apply_patchset(), since pcmk_rc_diff_resync and pcmk_rc_diff_failed map to the same exit code. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

It is in fact an error if we get -pcmk_err_diff_failed or -pcmk_err_diff_resync. This "ignore" goes back to commit a0cecc0. We probably treated diff-related errors as OK because of how diffs were used in legacy mode (for compatibility with Pacemaker versions 1.1.11 and earlier) -- see recent commit messages. However, this shouldn't make any difference. * Nothing returns -pcmk_err_diff_resync/pcmk_rc_diff_resync anymore. * The return code we're checking is from a CIB modify operation, which can't return a diff_failed error because it doesn't apply a patchset. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Nothing returns that code anymore. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

As discussed in several recent commit messages, PCMK__XA_CIB_UPDATE used to get set when broadcasting changes while in legacy mode. We haven't supported that since 3.0.0, so PCMK__XA_CIB_UPDATE is set only for sync requests (in sync_our_cib()). I don't see any reason to treat the return code differently for sync operations than for other modifying operations. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

In light of several recent commits (and my suspicion when I wrote the TODO comment that this commit addresses)... it feels like the most correct and consistent course of action is to refresh if we get an old_data error when applying the patchset. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

When we send a ping reply, we send the entire CIB if tracing is enabled. Then if the digest doesn't match the digest on the receiver, the receiver logs the changes between the sender's CIB and its own CIB at info level. To make this happen, we'd have to enable tracing on the sender in order to log the changes on the receiver. It's possible this could be useful, but I struggle to imagine us ever reaching that point and doing so. I'd like to get rid of what I perceive as bloat. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Beekhof added the original "legacy code" comment via commit b9f13af in 2014. However, as far as I can tell, this is not legacy code. We don't want to bump versions after we receive a CIB in response to a sync request. We want to use whatever versions are in the received CIB. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

I hate to bloat the function body even more, but I also hate how many arguments it has. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

and cib__perform_query(). Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

cib__process_apply_patchset() ignores the section argument, and the input data shouldn't start with a PCMK_XE_CIB element anyway (although it turns out nothing checks that currently). This is another step toward simplifying things. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

I've been testing full-CIB replacements and encountered some problems due to the early input transformation (currently in get_op_input()), particularly with XPath. When we're treating the section string as an XPath expression, it doesn't make sense to try to find it as an element of input. (That is, unless we want to do an XPath search, which we can think about if necessary.) This moves the input transformation to the operations that need it, and performs it only when NOT treating the section as XPath. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

This will make the next commit easier to review. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Previously, a CIB replace operation targeting the full CIB would fail with a segmentation fault if the cib_xpath flag were set (or if --xpath were used with cibadmin). Presumably this is because process_replace_xpath() was freeing the CIB, leaving no reference to it, and possibly dereferencing more nodes in the document as it continued processing any remaining XPath search results. Now, a full CIB replacement with XPath succeeds if the replacement CIB is well-formed and is of a newer version, and it appears to fail sanely otherwise. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

nrwahl2 added 30 commits December 21, 2025 22:38

Refactor: based: Drop cib.pam file

3be7687

It was added in 2007 by commit f3ec7ec. No motivation was given, the file doesn't get installed, and it appears that nothing has ever used it. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: based: Declare write_cib_contents() in a header file

f60ae15

Instead of declaring it in based_io.c and re-declaring it as extern in pacemaker-based.c. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: based: Make cib_writer static and rename to write_trigger

eb07823

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: based: Make write_cib_contents() static

bd8dc1b

No code changes aside from moving two function definitions to a position above based_io_init(). Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: based: Unindent code in write_cib_contents()

588dc13

No other code changes, except moving a variable declaration. The p argument is always NULL, though that was not the case when this function was created. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: based: Use G_SOURCE_{CONTINUE,REMOVE} in write_cib_contents()

821b5df

Also, drop the return statement after calling _exit(), and rename exit_rc to rc. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: based: Drop exit_code variable in write_cib_contents()

d75538b

And drop comments that seem redundant. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: based: Don't copy the CIB in write_cib_contents()

9ca4368

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: based: Use pid_t in write_cib_contents()

37f5fb0

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: based: Clarify blackbox state save-and-restore somewhat

99a8aec

But not much. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: based: Convert to standard return code in write_cib_contents()

d77704f

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: based: Rename write_cib_contents() to write_cib_async()

2855dd8

And add Doxygen. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: based: Best practices in cib_diskwrite_complete()

0aa2a1d

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: based: Make cib_writes_enabled static to based_io.c

1efbe74

Also use bool instead of gboolean. The --disk-writes command line option seems unnecessary, but we can deal with that later. But see T227. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: based: Don't compare against FALSE

9c62675

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: based: Unindent retrieveCib()

4c2d95a

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: based: Clarify variable names in readCibXmlFile()

74c2134

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: based: Clarify argument names in retrieveCib()

adb6681

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: based: Drop startCib() argument

8fc5487

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: based: Drop readCibXmlFile() cibfile argument

a4e6c08

The sole call sets it to "cib.xml". Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: based: Drop based_read_cib() discard_status argument

67ca4b8

It's always set to !preserve_status, and preserve_status is always set to the same value as stand_alone. So we can use !stand_alone in place of discard_status. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: based: Drop retrieveCib() cibfile argument

4484dde

The sole call sets it to "cib.xml". Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: based: Drop based_read_cib() and retrieveCib() dir arguments

77e7962

They're always set to cib_root. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: based: Functionize finding and reading a backup CIB file

0842ab6

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: based: Rename retrieveCib() to read_current_cib()

5f9575c

And add Doxygen. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: based: Don't allocate sigfile string

304b12f

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: based: Rename root variable to cib_xml in based_read_cib()

371d455

...and in read_current_cib(). Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

nrwahl2 added 29 commits January 2, 2026 03:25

Refactor: libcib: Clarify patchset_cib variable

e46cc0f

And rename it to old_versions. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: libcib: Assert on NULL diff in cib_perform_op()

ffeca59

Both callers currently always pass a non-NULL diff argument. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: libcib: goto done if *diff is NULL in cib_perform_op()

aa4df5a

If *diff is NULL, then nothing changed. There's nothing to log, *config_changed will be false, and there's no reason to validate against the schema. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: libcib: Move pcmk__xml_commit_changes() call

11dbc08

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: based: Drop based_process_replace()

6516557

All it did immediately prior to this commit was to call cib__process_replace(). Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: libcib, based: Drop cib__op_sync_to_one and related things

d68db69

Nothing makes a "sync to one" request anymore since we dropped send_sync_request(). Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: libcib: Rename *_sync_to_all to *_sync

44dfab6

We removed *_sync_to_one in a previous commit, so there is no longer any ambiguity. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Feature: libcrmcommon: xml_apply_patchset doesn't return diff_resync err

97b75c6

This shouldn't affect cibadmin even though it can call xml_apply_patchset(), since pcmk_rc_diff_resync and pcmk_rc_diff_failed map to the same exit code. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: based: Drop pcmk_rc_diff_resync from switch

a55e565

Nothing returns that code anymore. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

API: libcrmcommon: Deprecate pcmk_rc_diff_resync, pcmk_err_diff_resync

b7b1453

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: libcib: Drop cib_perform_op() manage_counters argument

244b1b0

I hate to bloat the function body even more, but I also hate how many arguments it has. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: libcib: Drop op arg from cib__perform_query()/cib_perform_op()

23ce5cb

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: libcib: Drop call_options arg from cib_perform_op()...

61f8ece

and cib__perform_query(). Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: libcib: Drop input arg from cib__perform_query/cib_perform_op

fd059dc

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: libcib: Drop section arg of cib__perform_query/cib_perform_op

da0e13c

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Refactor: libcib: Move process_replace_xpath() downward

f4e20d3

This will make the next commit easier to review. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

nrwahl2 force-pushed the nrwahl2-based branch from 84a4ef0 to 852e402 Compare January 2, 2026 12:49

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

WIP: Massive pacemaker-based refactors #4011

WIP: Massive pacemaker-based refactors #4011

nrwahl2 commented Dec 24, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

WIP: Massive pacemaker-based refactors #4011

Are you sure you want to change the base?

WIP: Massive pacemaker-based refactors #4011

Conversation

nrwahl2 commented Dec 24, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant