Commit 7056195

nixpanic authored and mergify[bot] committed
doc: include details for Volume Condition Reporter
Signed-off-by: Niels de Vos <ndevos@ibm.com>
1 parent 0994c20 commit 7056195

docs/volume-condition.md

Lines changed: 100 additions & 0 deletions
@@ -0,0 +1,100 @@
# Volume Condition Reporter

The Volume Condition Reporter uses the [Container Storage Interface
Specification's `NodeGetVolumeStats` operation][nodegetvolumestats] to detect
whether a PersistentVolume is in an _abnormal_ condition. CSI drivers can return
the condition of a volume in the `NodeGetVolumeStatsResponse` message.

## Usage

The Volume Condition Reporter is disabled by default. Passing the
`--enable-volume-condition` option to the CSI-Addons sidecar starts the Volume
Condition Reporter.
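
As a minimal sketch, the option could be added to the arguments of the sidecar
container in the Pod template of the CSI driver. The container name and image
below are placeholders, and the sketch assumes that the option is a boolean
switch that needs no value. Only `--enable-volume-condition` itself comes from
this document; any other arguments that the sidecar needs are left out.

```yaml
# Fragment of a Pod template; the name and image are illustrative assumptions.
containers:
  - name: csi-addons                        # hypothetical container name
    image: registry.example.com/csi-addons-sidecar:latest  # placeholder image
    args:
      # Starts the Volume Condition Reporter, which is disabled by default.
      - "--enable-volume-condition"
```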
13+
14+
## Abnormal Volume Condition reporting
15+
16+
Once enabled, the healthy and abnormal volume condition is reported in the logs
17+
of the CSI-Addons sidecar, and as an Event for the PersistentVolumeClaim.
18+
19+
Users will see the Event in their Namespace, and also when they describe (with
20+
`kubectl describe ...`) the PersistentVolumeClaim.

### Future Enhancements

Additional options for reporting include:

- include the volume condition in the metrics (similar to [KEP-1432][k8s_kep])
- generate an event for one or more of

  1. the PersistentVolume
  1. the Pod that uses the PersistentVolumeClaim
  1. the Node where the volume condition is abnormal

- annotate one or more of

  1. the PersistentVolume
  1. the PersistentVolumeClaim
  1. the Pod that uses the PersistentVolumeClaim
  1. the Node where the volume condition is abnormal
     > unlikely to be acceptable, as it needs permissions to the Node object

## Potential Consumers of Abnormal Volume Condition check results

More feedback on the reporting and recovery steps is needed, but there are
potential approaches that could use the reported volume condition:

- [Rook](https://rook.io) is a Kubernetes Operator that is able to [Network
  Fence][rook_fencing] a worker node where a Ceph volume is unhealthy.

- [Node Problem Detector][k8s_npd] provides a generic interface for reporting
  problems on a node. A project like [medik8s](https://medik8s.io/) can remedy
  node problems once they are reported.

## Dependencies

The `NodeGetVolumeStats` operation in the current CSI Specification (v1.8.0)
defines the `VolumeCondition` as an _alpha_ feature. Very few CSI drivers seem
to implement the volume condition at the moment. Drivers that implement the
feature are required to expose `VOLUME_CONDITION` as a
`NodeServiceCapability`; otherwise the Volume Condition Reporter will not be
able to check the condition of the volume.

## Required Permissions (RBAC)

When a Kubernetes cluster enforces Role-Based Access Control (RBAC), as
OpenShift does, the CSI-Addons sidecar requires extra permissions to check and
report the volume condition.

```yaml
---
# permissions for csi-addons sidecar to create events.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: csiaddons-events-editor-role
rules:
  - apiGroups:
      - ""
    resources:
      - events
    verbs:
      - create
      - delete
      - get
      - list
      - patch
      - update
      - watch
  - apiGroups:
      - ""
    resources:
      - persistentvolumes
      - persistentvolumeclaims
    verbs:
      - get
```
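
The ClusterRole above only takes effect once it is bound to the ServiceAccount
that the sidecar runs as. A minimal sketch of such a binding follows. The
binding name, the ServiceAccount name, and the namespace are hypothetical and
need to match the actual deployment of the CSI driver.

```yaml
---
# binds the events/PV/PVC permissions to the sidecar's ServiceAccount.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: csiaddons-events-editor-rolebinding   # hypothetical name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: csiaddons-events-editor-role
subjects:
  - kind: ServiceAccount
    name: csi-driver-nodeplugin               # hypothetical ServiceAccount
    namespace: csi-driver                     # hypothetical namespace
```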

[nodegetvolumestats]: https://github.com/container-storage-interface/spec/blob/master/spec.md#nodegetvolumestats
[rook_fencing]: https://rook.github.io/docs/rook/v1.12/Storage-Configuration/Block-Storage-RBD/block-storage/#handling-node-loss
[k8s_npd]: https://github.com/kubernetes/node-problem-detector/
[k8s_kep]: https://github.com/kubernetes/enhancements/blob/master/keps/sig-storage/1432-volume-health-monitor/README.md
