Commit 157deee
feat(telco-kpi): add lock-free job preemption based on OCP version priority (#72894)
Problem:
Multiple Telco KPI Prow jobs compete for the same baremetal host. Lower OCP
version jobs can block higher version jobs for extended periods, delaying
critical testing for newer releases.
Solution:
Implement a lock-free preemption mechanism where higher OCP version jobs can
signal lower version jobs to quit, freeing the baremetal host sooner.
How it works:
1. WAITING PHASE: Each job creates a unique waiting file on the bastion BEFORE
attempting to acquire the lock: <lock>.waiting.<nanosecond_timestamp>.<ocp_version>
Example: spoke-baremetal-50-7c-6f-5c-47-8c.lock.waiting.1766568440841947242.4.22
This ensures the job's presence is visible even if it immediately gets the lock.
2. LOCK ACQUISITION: When a job acquires the lock, it checks for higher-priority
waiters BEFORE removing its own waiting file (deferred deletion). If a
higher-priority waiter is found, it releases the lock and keeps its waiting
file for the retry. Only when none is found does it remove its waiting file.
3. PERIODIC CHECKS: While holding the lock, the job periodically checks for
higher priority waiters at key points:
- cluster-install: every QUIT_CHECK_INTERVAL iterations (default: 3)
- oslat test: before running tests
- cpu-util test: before running tests
4. QUIT MODES:
- 'graceful' (exit 0): Used by test steps. Allows remaining steps like
PTP reporting to complete. Job exits cleanly.
- 'force' (exit 1): Used by cluster-install. If installation is interrupted,
remaining steps are meaningless. Job aborts immediately.
5. CLEANUP: Each job always removes its own waiting file during cleanup,
regardless of whether it acquired the lock. This prevents orphaned files.
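The waiting-file lifecycle in steps 1, 2, and 5 can be sketched as follows. This is a minimal Python illustration, not the commit's code, which is not shown on this page. The function names echo the commit's list where possible, but their bodies, the `lock_dir` parameter, the `settle_after_lock_acquired` helper, and the use of a temporary directory in place of the bastion are all assumptions.

```python
import os
import tempfile
import time


def waiting_file_name(lock_name: str, stamp_ns: int, ocp_version: str) -> str:
    """Build '<lock>.waiting.<nanosecond_timestamp>.<ocp_version>'."""
    return f"{lock_name}.waiting.{stamp_ns}.{ocp_version}"


def create_waiting_request_file(lock_dir: str, lock_name: str, ocp_version: str) -> str:
    """Step 1: create this job's waiting file BEFORE trying to take the lock.

    O_EXCL (an assumption, not from the commit) makes creation fail loudly
    if two jobs ever produced the same name, rather than sharing one file.
    """
    name = waiting_file_name(lock_name, time.time_ns(), ocp_version)
    path = os.path.join(lock_dir, name)
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    os.close(fd)
    return path


def remove_own_waiting_file(path: str) -> None:
    """Step 5: remove only this job's file; a missing file is not an error."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass


def settle_after_lock_acquired(own_waiting_file: str, higher_priority_found: bool) -> bool:
    """Step 2 (hypothetical helper): return True if the job may keep the lock.

    Deferred deletion: the waiting file is removed only after the check
    passes. Otherwise it stays in place and the caller releases the lock.
    """
    if higher_priority_found:
        return False
    remove_own_waiting_file(own_waiting_file)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as lock_dir:  # stands in for the bastion
        mine = create_waiting_request_file(
            lock_dir, "spoke-baremetal-50-7c-6f-5c-47-8c.lock", "4.22"
        )
        try:
            keep = settle_after_lock_acquired(mine, higher_priority_found=False)
            print("keep lock:", keep, "| waiting files left:", os.listdir(lock_dir))
        finally:
            remove_own_waiting_file(mine)  # always runs, lock or no lock
```

Making removal idempotent lets the cleanup path call it unconditionally, which matches the "always removes its own waiting file" rule without tracking whether step 2 already deleted the file.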
Priority logic:
- ONLY the OCP version determines priority (e.g., 4.22 > 4.20)
- The nanosecond timestamp is NOT used for priority decisions
- Same-version jobs (e.g., two 4.22 jobs) compete equally for the lock
- Timestamp is used solely for: (1) unique filenames, (2) self-cleanup
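A sketch of the priority rule, assuming versions are compared numerically component by component. The commit message only gives the example 4.22 > 4.20, so how the real code compares versions is not confirmed here. The regular expression, `parse_version`, `waiter_version`, and `has_higher_priority_waiter` are hypothetical names written for this illustration in Python.

```python
import re
from typing import Iterable, Optional, Tuple

# <lock>.waiting.<nanosecond_timestamp>.<ocp_version>
WAITING_RE = re.compile(
    r"^(?P<lock>.+)\.waiting\.(?P<stamp>\d+)\.(?P<version>\d+\.\d+)$"
)


def parse_version(text: str) -> Tuple[int, ...]:
    """'4.22' -> (4, 22), so that 4.22 ranks above 4.9."""
    return tuple(int(part) for part in text.split("."))


def waiter_version(filename: str, lock_name: str) -> Optional[Tuple[int, ...]]:
    """Return the waiter's version, or None if the file is not for this lock."""
    match = WAITING_RE.match(filename)
    if match is None or match.group("lock") != lock_name:
        return None
    return parse_version(match.group("version"))


def has_higher_priority_waiter(
    filenames: Iterable[str], lock_name: str, own_version: str
) -> bool:
    """True only if some waiter has a strictly higher OCP version.

    The timestamp field is parsed but never compared: equal versions
    do not preempt each other, whichever job arrived first.
    """
    mine = parse_version(own_version)
    for name in filenames:
        theirs = waiter_version(name, lock_name)
        if theirs is not None and theirs > mine:
            return True
    return False


if __name__ == "__main__":
    lock = "spoke-baremetal-50-7c-6f-5c-47-8c.lock"
    waiters = [f"{lock}.waiting.1766568440841947242.4.22"]
    print(has_higher_priority_waiter(waiters, lock, "4.20"))
    print(has_higher_priority_waiter(waiters, lock, "4.22"))
```

Comparing integer tuples avoids a pitfall of treating versions as decimals or strings, where 4.9 would sort above 4.22.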
Key benefits:
- Lock-free: Preemption signaling needs no extra lock or shared mutable state;
each job manages only its own file
- Race-safe: Nanosecond timestamps make filename collisions between jobs
extremely unlikely
- Deferred deletion: Waiting file persists until validation passes
- Self-cleaning: Jobs clean up only their own files
- Configurable: QUIT_CHECK_INTERVAL controls check frequency
New shared functions:
- extract_ocp_version: Gets version from JOB_NAME
- create_waiting_request_file: Creates unique waiting file
- remove_own_waiting_file: Removes job's waiting file
- check_for_higher_priority_waiter: Scans waiting files for higher version
- should_quit: Determines if quit is needed
- check_for_quit: Main entry point (supports graceful/force modes)
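The two quit modes and the periodic check can be illustrated as below. This is a Python sketch under stated assumptions, not the commit's implementation: `check_for_quit` here takes the result of the waiter scan as a boolean argument, and `should_check` is a hypothetical helper for the "every QUIT_CHECK_INTERVAL iterations" rule with iterations counted from 1. Only the exit codes (0 for graceful, 1 for force) and the default interval of 3 come from the commit message.

```python
import sys

QUIT_CHECK_INTERVAL = 3  # default stated in the commit message

# Exit codes stated in the commit message.
EXIT_CODES = {"graceful": 0, "force": 1}


def should_check(iteration: int, interval: int = QUIT_CHECK_INTERVAL) -> bool:
    """True on every interval-th iteration (iterations counted from 1)."""
    return interval > 0 and iteration % interval == 0


def check_for_quit(higher_priority_waiting: bool, mode: str = "graceful") -> None:
    """Exit the process if a higher-priority job is waiting.

    'graceful' exits 0 so later steps of the job can still run;
    'force' exits 1 so the job aborts right away.
    """
    if mode not in EXIT_CODES:
        raise ValueError(f"unknown quit mode: {mode!r}")
    if higher_priority_waiting:
        print(f"higher-priority job waiting, quitting ({mode})", file=sys.stderr)
        sys.exit(EXIT_CODES[mode])


if __name__ == "__main__":
    # Simulated cluster-install polling loop with no waiter present,
    # so the check runs on iterations 3 and 6 and never exits.
    for iteration in range(1, 7):
        if should_check(iteration):
            print(f"iteration {iteration}: checking for higher-priority waiters")
            check_for_quit(higher_priority_waiting=False, mode="force")
```

A test step such as oslat or cpu-util would call the check once before running, in graceful mode, while the install loop would use force mode on the iterations selected by the interval.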
Signed-off-by: Carlos Cardenosa <ccardeno@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
1 parent 86776ad commit 157deee
File tree
10 files changed
+599 -18 lines changed
- ci-operator/step-registry/telcov10n
- metal-single-node-spoke-kpis
- hacks
- clean-up
- deploy
- tests
- cpu-util
- oslat
- metal-single-node-spoke/cluster/install
Lines changed: 33 additions & 11 deletions
Lines changed: 116 additions & 6 deletions
Lines changed: 3 additions & 1 deletion
Lines changed: 7 additions & 0 deletions
Lines changed: 4 additions & 0 deletions
Lines changed: 7 additions & 0 deletions
Lines changed: 3 additions & 0 deletions
Lines changed: 16 additions & 0 deletions
Lines changed: 8 additions & 0 deletions