@ajfloeder

Summary

Adds fence_recorder, a new Pacemaker fence agent that uses a request/response file pattern to coordinate fencing with external systems.

Use Case

This agent is designed for environments where fencing decisions need to be coordinated with external systems (e.g., storage controllers, cloud providers, or custom orchestration) that perform cleanup operations before fencing completes.

Instead of directly fencing nodes, fence_recorder:

  1. Writes a fence request to a configurable directory
  2. Waits for an external responder to process the request and write a response
  3. Returns success/failure to Pacemaker based on the response
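The three steps above can be sketched roughly as follows. This is a minimal illustration of a request/response file pattern, not the code from this PR: the JSON layout, the file names (`<node>.request.json`, `<node>.response.json`), the `success` key, and the function names are all assumptions made for the example. The timeout matters because an agent that waits forever would block fencing entirely.

```python
import json
import os
import tempfile
import time
from pathlib import Path


def write_request(request_dir: Path, node: str, action: str) -> Path:
    """Write a fence request file atomically and return its path."""
    request_dir.mkdir(parents=True, exist_ok=True)
    payload = {"node": node, "action": action, "timestamp": time.time()}
    # Write to a temporary file, then rename it into place, so the
    # responder never observes a half-written request.
    fd, tmp = tempfile.mkstemp(dir=request_dir, prefix=".tmp-")
    with os.fdopen(fd, "w") as f:
        json.dump(payload, f)
    final = request_dir / f"{node}.request.json"  # hypothetical naming scheme
    os.replace(tmp, final)
    return final


def wait_for_response(response_dir: Path, node: str,
                      timeout: float, poll: float = 0.1) -> bool:
    """Poll for a response file; a timeout is treated as failure."""
    deadline = time.monotonic() + timeout
    path = response_dir / f"{node}.response.json"  # hypothetical naming scheme
    while time.monotonic() < deadline:
        try:
            with open(path) as f:
                return bool(json.load(f).get("success", False))
        except (FileNotFoundError, json.JSONDecodeError):
            # No response yet (or not fully written): wait and retry.
            time.sleep(poll)
    return False


def fence(request_dir, response_dir, node, action="off", timeout=60.0) -> int:
    """Return 0 on success and 1 on failure, like a fence agent exit code."""
    write_request(Path(request_dir), node, action)
    ok = wait_for_response(Path(response_dir), node, timeout)
    return 0 if ok else 1
```

A call such as `fence("/var/run/fence/requests", "/var/run/fence/responses", "node1")` (paths are placeholders) would return 1 if no responder answers within the timeout.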

@knet-jenkins

knet-jenkins bot commented Dec 9, 2025

Can one of the project admins check and authorise this run please: https://ci.kronosnet.org/job/fence-agents/job/fence-agents-pipeline/job/PR-646/1/input

@ajfloeder ajfloeder force-pushed the fence_recorder-new-fence-agent branch from 036e40e to 0260fdf Compare December 9, 2025 21:20
@knet-jenkins

knet-jenkins bot commented Dec 9, 2025

Can one of the project admins check and authorise this run please: https://ci.kronosnet.org/job/fence-agents/job/fence-agents-pipeline/job/PR-646/2/input

@ajfloeder ajfloeder force-pushed the fence_recorder-new-fence-agent branch from 0260fdf to 57e9a9f Compare December 16, 2025 17:21
@knet-jenkins

knet-jenkins bot commented Dec 16, 2025

Can one of the project admins check and authorise this run please: https://ci.kronosnet.org/job/fence-agents/job/fence-agents-pipeline/job/PR-646/3/input

@ofaaland
Contributor

We need this fence agent (or something like it) for a composable storage system, and we would greatly appreciate review and feedback.

@fabbione
Member

Hey guys, the maintainer is on Xmas vacation until the beginning of January.

@oalbrigt
Collaborator

oalbrigt commented Jan 2, 2026

Do you have a real world use case where this would be needed?

@ajfloeder
Author

@oalbrigt Our use case has the fence agent and the storage controller on the same node. During the critical moment of fencing (when things are broken), local files are the most reliable way for two different technology stacks (Pacemaker/Python and Kubernetes/Go) to communicate.

@oalbrigt
Collaborator

oalbrigt commented Jan 5, 2026

What kind of information do you need to send to your storage systems before fencing?

I'm just trying to understand, and also to see if there is any other way we can achieve this with what we have, or maybe add it as a feature to an existing agent.

It might also work against the logic of fencing: it could end up blocking the fencing of the node, which is not a good thing either, as it might lead to data corruption or other issues.

@ajfloeder
Author

ajfloeder commented Jan 5, 2026

For our scenario, we need to know the name of the node to be fenced. We also need to control when that node is unfenced, so the existence of the files becomes our auditing mechanism.
The dummy fence agent is relatively close, so I will give that some more thought.

@ajfloeder
Author

The request/response nature of the fence_recorder allows us to have separate software perform the fence action and reliably write the fence response file to indicate the result of the fencing operation. I don't see how fence_dummy could be made to do that.
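To illustrate the responder side described here, the following is a hedged sketch of separate software reading a request, performing the fence action, and reliably writing the response file. It is written in Python only for consistency with the agent; the responder in this use case is a Go/Kubernetes component, and the file names, JSON keys, and the `do_fence` callback are hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path


def respond(request_file: Path, response_dir: Path, do_fence) -> Path:
    """Process one fence request and write the matching response file.

    `do_fence` is a hypothetical callback that performs the real fence
    action for a node name and returns a truthy value on success.
    """
    request = json.loads(request_file.read_text())
    node = request["node"]
    try:
        success = bool(do_fence(node))
    except Exception:
        # Any error in the fence action is reported as a failed fence.
        success = False

    response_dir.mkdir(parents=True, exist_ok=True)
    # Write to a temporary file and fsync before renaming, so the agent
    # only ever sees a complete response.
    fd, tmp = tempfile.mkstemp(dir=response_dir, prefix=".tmp-")
    with os.fdopen(fd, "w") as f:
        json.dump({"node": node, "success": success}, f)
        f.flush()
        os.fsync(f.fileno())
    final = response_dir / f"{node}.response.json"  # hypothetical naming scheme
    os.replace(tmp, final)
    return final
```

Because the request and response files stay on disk until something removes them, they can also serve as the audit trail mentioned earlier in the thread.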

@oalbrigt
Collaborator

oalbrigt commented Jan 6, 2026

I'll make a patch, and link it here, so you can test it and give any suggestions for improvements.

@ajfloeder
Author

Are you open to a recorder mode on the fence_dummy agent? I'm working on that change right now.

@oalbrigt
Collaborator

oalbrigt commented Jan 7, 2026

Yeah. That sounds good to me.
