Skip to content

Conversation

@wenningerk
Copy link
Contributor

@wenningerk wenningerk commented Jul 16, 2020

This rebases #1957
to current master & refactors it to use the generic ipc-client-api.
And it needs ClusterLabs/sbd#113
or otherwise will wait - when sbd is enabled - till eternity
to be pinged.

Make pacemakerd wait to be pinged by sbd before starting
sub-daemons. Pings further reply health-state and timestamp
of last successful check. On shutdown bring down all the
sub-daemons and wait to be polled for state by sbd before
finally exiting pacemakerd.
Add new api as not to make the xml-structure an external interface.

Added SBD_SYNC_RESOURCE_STARTUP defaulting to 'no' to
sbd config so that upgrading pacemaker without sbd doesn't lead
to pacemaker locked waiting to be pinged by sbd.

@wenningerk
Copy link
Contributor Author

retest this please

Copy link

@kgaillot kgaillot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good

I think the crmadmin side of things makes the new model really shine. Implementing the server side is a bit complicated but a lot of that is driven by the existing apis being so different from each other.

@wenningerk wenningerk force-pushed the sync_with_sbd_universal_daemon_ipc_api_master branch 2 times, most recently from 96f98e4 to c9486ba Compare July 20, 2020 14:35
Copy link

@kgaillot kgaillot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's put the crmadmin changes in their own commit. Also we're starting a new policy, tool option changes should bump the minor-minor version of CRM_FEATURE_SET. Pacemaker will ignore it but it will give higher-level tools an easy way to know a particular option is supported.

@wenningerk
Copy link
Contributor Author

Let's put the crmadmin changes in their own commit. Also we're starting a new policy, tool option changes should bump the minor-minor version of CRM_FEATURE_SET. Pacemaker will ignore it but it will give higher-level tools an easy way to know a particular option is supported.

Guess we're doing the bump with the next release to keep the number of bumps low - right?

@wenningerk wenningerk force-pushed the sync_with_sbd_universal_daemon_ipc_api_master branch 2 times, most recently from 5d55a63 to a7b4483 Compare July 20, 2020 21:34
@kgaillot
Copy link

Guess we're doing the bump with the next release to keep the number of bumps low - right?

No, new policy is to bump early and bump often :) in case distros backport new features, so they don't get in a situation where they have a feature set but aren't sure what's actually included in it. Anyone doing backports will have to ensure that all features up to that point are backported, but that's their responsibility.

@wenningerk
Copy link
Contributor Author

Guess we're doing the bump with the next release to keep the number of bumps low - right?

No, new policy is to bump early and bump often :) in case distros backport new features, so they don't get in a situation where they have a feature set but aren't sure what's actually included in it. Anyone doing backports will have to ensure that all features up to that point are backported, but that's their responsibility.

How many bits do we have for each of the major, minor and minor-minor? ;-)

@wenningerk wenningerk force-pushed the sync_with_sbd_universal_daemon_ipc_api_master branch from a7b4483 to f35e1c2 Compare July 20, 2020 21:53
Copy link

@kgaillot kgaillot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this approach

@wenningerk wenningerk force-pushed the sync_with_sbd_universal_daemon_ipc_api_master branch from 89ece49 to f68eb1f Compare July 21, 2020 18:52
@wenningerk wenningerk changed the title [WIP] Fix: sbd-integration: sync pacemakerd with sbd Fix: sbd-integration: sync pacemakerd with sbd Jul 21, 2020
Copy link

@kgaillot kgaillot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Went through more closely

Make pacemakerd wait to be pinged by sbd before starting
sub-daemons. Pings further reply health-state and timestamp
of last successful check. On shutdown bring down all the
sub-daemons and wait to be polled for state by sbd before
finally exiting pacemakerd.
Add new api as not to make the xml-structure an external interface.
@wenningerk wenningerk force-pushed the sync_with_sbd_universal_daemon_ipc_api_master branch from f68eb1f to 6ce5bb0 Compare July 23, 2020 19:44
Copy link

@kgaillot kgaillot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, i just forgot one thing

@kgaillot kgaillot merged commit 1f84fe0 into ClusterLabs:master Jul 23, 2020
if (running_with_sbd) {
crm_warn("Enabling SBD_SYNC_RESOURCE_STARTUP would (if supported "
"by your SBD version) improve reliability of "
"interworking between SBD & pacemaker.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely an awesome improvement. Thanks for the nice work!

But in here, without knowing whether the sbd version even supports the feature, isn't a warning a little too scary for users who don't know about the details? It makes them think something is really wrong here. Would a notice be good enough to deliver the information?

Similar for the warning in sbd:

https://github.com/ClusterLabs/sbd/pull/113/files#diff-cf3c5bf928db3d772ba989e1a462b501782f7e0364daf7414570b14312d90936R981-R983

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm ... my perception of a warning is to point to something that might behave in an unintentional way or lead to sub-optimal behavior ...
And I was fearing that people wouldn't have a look at it if it wasn't at least a warning when doing an upgrade from a previous configuration.
Anyway we should encourage setting the default value to 'yes' (the real default not the value it is set to in the template config) downstream taking care that it is in sync between pacemaker & sbd via package dependencies.
So I can reduce that to a notice together with the next PRs that are touching the topic.

Copy link
Member

@gao-yan gao-yan Jan 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably we could've bumped the main version of libcrmcommon to make sure sbd built against it would only be able to run with a pacemaker that supports the feature, so that the default value of the parameter at build time could've been "auto".

Anyway indeed for the time being, downstream needs to take good care of that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants