-
Notifications
You must be signed in to change notification settings - Fork 350
Fix: sbd-integration: sync pacemakerd with sbd #2119
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix: sbd-integration: sync pacemakerd with sbd #2119
Conversation
|
retest this please |
kgaillot
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good
I think the crmadmin side of things makes the new model really shine. Implementing the server side is a bit complicated but a lot of that is driven by the existing apis being so different from each other.
96f98e4 to
c9486ba
Compare
Happens when using pcmk_dispatch_ipc when dispatching without mainloop.
kgaillot
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's put the crmadmin changes in their own commit. Also we're starting a new policy, tool option changes should bump the minor-minor version of CRM_FEATURE_SET. Pacemaker will ignore it but it will give higher-level tools an easy way to know a particular option is supported.
Guess we're doing the bump with the next release to keep the number of bumps low - right? |
5d55a63 to
a7b4483
Compare
No, new policy is to bump early and bump often :) in case distros backport new features, so they don't get in a situation where they have a feature set but aren't sure what's actually included in it. Anyone doing backports will have to ensure that all features up to that point are backported, but that's their responsibility. |
How many bits do we have for each of the major, minor and minor-minor? ;-) |
a7b4483 to
f35e1c2
Compare
kgaillot
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this approach
89ece49 to
f68eb1f
Compare
kgaillot
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Went through more closely
Make pacemakerd wait to be pinged by sbd before starting sub-daemons. Pings further reply health-state and timestamp of last successful check. On shutdown bring down all the sub-daemons and wait to be polled for state by sbd before finally exiting pacemakerd. Add new api as not to make the xml-structure an external interface.
f68eb1f to
6ce5bb0
Compare
kgaillot
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good, i just forgot one thing
| if (running_with_sbd) { | ||
| crm_warn("Enabling SBD_SYNC_RESOURCE_STARTUP would (if supported " | ||
| "by your SBD version) improve reliability of " | ||
| "interworking between SBD & pacemaker."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Absolutely an awesome improvement. Thanks for the nice work!
But in here, without knowing whether the sbd version even supports the feature, isn't a warning a little too scary for users who don't know about the details? It makes them think something is really wrong here. Would a notice be good enough to deliver the information?
Similar for the warning in sbd:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm ... my perception of a warning is to point to something that might behave in an unintentional way or lead to sub-optimal behavior ...
And I was fearing that people wouldn't have a look at it if it wasn't at least a warning when doing an upgrade from a previous configuration.
Anyway we should encourage setting the default value to 'yes' (the real default not the value it is set to in the template config) downstream taking care that it is in sync between pacemaker & sbd via package dependencies.
So I can reduce that to a notice together with the next PRs that are touching the topic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably we could've bumped the main version of libcrmcommon to make sure sbd built against it would only be able to run with a pacemaker that supports the feature, so that the default value of the parameter at build time could've been "auto".
Anyway indeed for the time being, downstream needs to take good care of that.
This rebases #1957
to current master & refactors it to use the generic ipc-client-api.
And it needs ClusterLabs/sbd#113
or otherwise will wait - when sbd is enabled - till eternity
to be pinged.
Make pacemakerd wait to be pinged by sbd before starting
sub-daemons. Pings further reply health-state and timestamp
of last successful check. On shutdown bring down all the
sub-daemons and wait to be polled for state by sbd before
finally exiting pacemakerd.
Add new api as not to make the xml-structure an external interface.
Added SBD_SYNC_RESOURCE_STARTUP defaulting to 'no' to
sbd config so that upgrading pacemaker without sbd doesn't lead
to pacemaker locked waiting to be pinged by sbd.