website/docs/practical-techniques/lesson-13-systems-thinking-specs.md (58 additions, 2 deletions)
@@ -74,6 +74,19 @@ Integration points are the doors in the boundary wall—where traffic crosses fr
Direction matters: inbound points need validation and rate limiting; internal pub/sub needs delivery guarantees.
### Extension Points
Not every integration point exists yet. When a specific variation is *committed*—funded, scheduled, required by a known deadline—declare the stable interface now so the current implementation doesn't cement itself.
| Variation | Stable Interface | Current | Planned By |
|-----------|------------------|---------|------------|
The principle is Protected Variation[^3] (Cockburn/Larman): identify points of predicted variation and create a stable interface around them. The second row stays out—YAGNI gates which variations make it into the spec. Only committed business needs earn an abstraction.
Without this, agents build the simplest correct implementation—a hardcoded Stripe client. When PayPal arrives in Q3, that's a rewrite, not an extension. Declaring the interface now costs one abstraction; omitting it costs a migration.
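As a rough sketch of the shape such a declaration can take, the committed variation might be captured as one stable interface with today's provider as its only implementation. Every name here (`PaymentProvider`, `createCharge`, `StripeProvider`) is illustrative, not part of the spec:

```typescript
// Hypothetical sketch of the declared extension point. The spec commits only to a
// stable interface existing; names and method shapes will vary per project.
export interface PaymentProvider {
  createCharge(amountCents: number, currency: string, idempotencyKey: string): Promise<ChargeResult>;
  verifyWebhookSignature(payload: string, signature: string): boolean;
}

export type ChargeResult =
  | { status: "succeeded"; providerChargeId: string }
  | { status: "failed"; reason: string };

// The only implementation that exists today.
export class StripeProvider implements PaymentProvider {
  async createCharge(amountCents: number, currency: string, idempotencyKey: string): Promise<ChargeResult> {
    // The real version delegates amountCents/currency/idempotencyKey to the Stripe SDK; stubbed here.
    return { status: "succeeded", providerChargeId: "ch_stub" };
  }
  verifyWebhookSignature(payload: string, signature: string): boolean {
    // The real version checks the provider's signature header; stubbed here.
    return payload.length > 0 && signature.length > 0;
  }
}
```

Call sites depend on `PaymentProvider`, never on `StripeProvider` directly, which is what keeps the Q3 provider an additional implementation rather than a migration.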
## State: What Persists, What Changes, What Recovers
State is where bugs hide. The state section forces you to account for what the system remembers.
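For the payment example used throughout this lesson, that accounting often starts by pinning down the legal lifecycle. A minimal sketch, where the `failed` state and the exact transition map are assumptions for illustration and the spec's own state section stays authoritative:

```typescript
// Hypothetical sketch of the PaymentIntent lifecycle implied by the scenarios below.
type PaymentIntentState = "pending" | "succeeded" | "failed";

// Allowed transitions; anything not listed here is an invariant violation.
const ALLOWED_TRANSITIONS: Record<PaymentIntentState, readonly PaymentIntentState[]> = {
  pending: ["succeeded", "failed"],
  succeeded: [], // terminal: a succeeded intent never changes state again
  failed: [],    // terminal
};

export function canTransition(from: PaymentIntentState, to: PaymentIntentState): boolean {
  return ALLOWED_TRANSITIONS[from].includes(to);
}
```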
@@ -183,6 +196,23 @@ The **Data** and **Stress** columns transform a constraint from a wish into a te
**Manifested By** answers how a test exercises the invariant. Without it, invariants are assertions nobody checks. An invariant violation means your data model is corrupted—make sure you can detect it.
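As one illustration of what a "Manifested By" entry can point to, a balance-integrity invariant might be manifested by a scheduled reconciliation job. The store accessors and the `I-003` wiring here are assumptions made for the sketch:

```typescript
// Hypothetical sketch: a periodic job that manifests a balance-integrity invariant.
// If the ledger and the cached balance ever disagree, the data model is corrupted,
// so the job fails loudly instead of logging and moving on.
import { strict as assert } from "node:assert";

interface AccountStore {
  sumLedgerEntries(accountId: string): Promise<number>; // assumed accessor
  cachedBalance(accountId: string): Promise<number>;    // assumed accessor
}

export async function checkBalanceIntegrity(store: AccountStore, accountId: string): Promise<void> {
  const [ledgerSum, cached] = await Promise.all([
    store.sumLedgerEntries(accountId),
    store.cachedBalance(accountId),
  ]);
  // I-003: the cached balance must always equal the sum of ledger entries.
  assert.equal(cached, ledgerSum, `I-003 violated for ${accountId}: ${cached} != ${ledgerSum}`);
}
```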
## Verify Behavior: Concrete Scenarios at Boundaries
Constraints say NEVER. Invariants say ALWAYS. Neither answers: what *should* happen when `amount=0`?
Behavioral scenarios fill this gap—concrete Given-When-Then examples at system boundaries, specific enough to become tests without dictating test framework, mocks, or assertion syntax.
| ID | Given | When | Then | Edge Category |
|----|-------|------|------|---------------|
| B-001 | PaymentIntent in `pending` state | Webhook delivers `succeeded` with amount=0 | Transition to `succeeded`, balance unchanged | boundary value |
| B-002 | No matching PaymentIntent | Webhook delivers valid event for unknown intent | Return 200, log warning, no state change | null / missing |
| B-003 | Stripe API returns 503 | Client submits payment request | Return 502, queue for retry, no charge created | error propagation |
| B-004 | Two identical webhooks within 10ms | Both pass signature validation | First processes, second returns 200, no state change | concurrency |
Each scenario traces back to a constraint or invariant—B-001 exercises I-003 (balance integrity), B-004 exercises C-001 (no duplicate processing). The **edge category** column is a systematic checklist: boundary values, null/empty, error propagation, concurrency, temporal. Walk each category per interface; errors cluster at boundaries[^2] because agents don't reliably infer them.
The spec captures *what should happen*, not *how to test it*. Framework choices, mock configurations, and assertion syntax belong in implementation—they change with the codebase. Behavioral examples survive refactoring.
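To make that separation concrete, here is how B-002 might later be rendered as a test at implementation time. Nothing below belongs in the spec: the handler, the in-memory store, and the use of Node's built-in test runner are all implementation-side assumptions for this sketch.

```typescript
// Hypothetical implementation-side rendering of B-002 (unknown intent: return 200,
// log a warning, no state change). The spec row stays the source of truth; this is
// one possible translation, with a tiny in-memory stand-in for the real handler.
import { test } from "node:test";
import { strict as assert } from "node:assert";

const intents = new Map<string, { state: string }>(); // stand-in for the intent store

async function handleWebhook(event: { type: string; intentId: string }): Promise<{ status: number }> {
  const intent = intents.get(event.intentId);
  if (!intent) {
    console.warn(`no PaymentIntent for ${event.intentId}; acknowledging without state change`);
    return { status: 200 }; // B-002: acknowledge so the provider stops retrying
  }
  if (event.type === "payment_intent.succeeded") intent.state = "succeeded";
  return { status: 200 };
}

test("B-002: webhook for unknown PaymentIntent is acknowledged without state change", async () => {
  const response = await handleWebhook({ type: "payment_intent.succeeded", intentId: "pi_unknown" });
  assert.equal(response.status, 200);
  assert.equal(intents.has("pi_unknown"), false); // nothing was created or mutated
});
```

Swap the test runner or the store and the B-002 row does not change; that is the refactoring resilience this section is after.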
## Quality Attributes: How Good Is Good Enough?
Quality attributes define measurable thresholds across three tiers: target (normal operations), degraded (alerting), and failure (paging).
@@ -195,6 +225,24 @@ Quality attributes define measurable thresholds across three tiers: target (norm
Target = SLO. Degraded = alerts fire. Failure = on-call gets paged. Three tiers give you an error budget before the first outage and make "good enough" concrete rather than aspirational.
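A sketch of how one attribute's tiers might be written down, using the 100ms p95 that the next section decomposes; the degraded and failure numbers are placeholders, and the object shape is illustrative rather than a required format:

```typescript
// Hypothetical three-tier thresholds for a single quality attribute.
// target feeds the SLO, degraded drives alerting, failure drives paging.
const webhookLatencyP95 = {
  metric: "webhook_processing_latency_p95_ms",
  target: 100,    // SLO: normal operations stay at or below this
  degraded: 250,  // placeholder: alerts fire, error budget is burning
  failure: 1000,  // placeholder: page on-call
};

console.log(`alerting headroom above target: ${webhookLatencyP95.degraded - webhookLatencyP95.target}ms`);
```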
## Performance Budget: Decomposing SLOs
Quality Attributes says "Latency p95: 100ms." But the webhook flow has five steps. Which step gets how many milliseconds?
| Flow Step | Budget | Hot/Cold |
|-----------|--------|----------|
| Signature validation | 2ms | hot |
| Idempotency check (Redis) | 5ms | hot |
| Parse + validate payload | 3ms | hot |
| Update payment state (DB) | 15ms | hot |
| Publish event (queue) | 5ms | cold |
| **Total** | **30ms** | |
| **Headroom** | **70ms** | |
The budget forces two decisions agents can't make alone. First, *hot vs. cold path*: signature validation is synchronous and blocking—it gets a tight budget. Event publishing is async—it tolerates more. Second, *headroom*: the total is 30ms against a 100ms SLO, leaving 70ms for future operations on this path. Without decomposition, an agent might spend the entire budget on a single unoptimized query.
Per-operation budgets also surface algorithmic constraints. If "idempotency check" must complete in 5ms, that rules out a full-table scan—the agent knows to use an indexed lookup or bloom filter without being told.
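As an illustration of the kind of implementation the 5ms figure points toward, the idempotency check can be a single O(1) Redis round trip. A minimal sketch using ioredis, where the key naming and the 24-hour TTL are assumptions:

```typescript
// Hypothetical sketch: idempotency check as one atomic Redis operation,
// sized to fit a ~5ms per-operation budget.
import Redis from "ioredis";

const redis = new Redis(); // connection details omitted in this sketch

// Returns true if this event id has not been seen before and is now claimed.
// SET ... EX ... NX is atomic, so concurrent duplicates race and exactly one wins.
export async function claimWebhookEvent(eventId: string): Promise<boolean> {
  const result = await redis.set(`webhook:seen:${eventId}`, "1", "EX", 60 * 60 * 24, "NX");
  return result === "OK"; // null means the key already existed: a duplicate
}
```

B-004 above is the behavioral counterpart: the duplicate webhook returns 200 without reprocessing because `claimWebhookEvent` comes back false for it.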
## Flows: Tracing Execution
Flows trace execution from trigger to completion, revealing integration points and error handling gaps.
@@ -258,8 +306,8 @@ You don't fill every section. The template prompts systematic *consideration*—
This time is spent *thinking*, not writing. The bottleneck is understanding—researching existing codebase patterns via [exploration planning](/docs/methodology/lesson-3-high-level-methodology#phase-2-plan-strategic-decision) (Lesson 3), investigating best practices and domain knowledge via [ArguSeek](/docs/methodology/lesson-5-grounding#arguseek-isolated-context--state) (Lesson 5), and making architectural decisions. The spec itself is just the artifact of that thinking. Even a simple isolated feature requires hours because you need to trace boundaries, verify assumptions against the existing codebase, and research edge cases before committing to a design.
@@ -290,6 +338,14 @@ The [full spec template](/prompts/specifications/spec-template) includes section
- **Fix specs for architecture, fix code for bugs** — If the architecture is sound, patch the implementation. If the model or boundaries are wrong, fix the spec and regenerate.
- **Performance budgets decompose SLOs into implementation decisions** — A system-level "100ms p95" becomes per-operation allocations that constrain algorithmic choices. Hot/cold path distinction tells agents where latency matters.
- **Behavioral examples at boundaries prevent the gaps agents miss** — Constraints say NEVER, invariants say ALWAYS, but neither specifies what happens at `amount=0`. Concrete Given-When-Then scenarios—not test code—fill the gap where errors cluster.
- **Extension points require committed business needs, not speculation** — Protected Variation: identify predicted change, create a stable interface. YAGNI gates entry—only funded, scheduled variations earn an abstraction.
---
[^1]: Xu et al. (2025) - "When Code Becomes Abundant: Implications for Software Engineering in a Post-Scarcity AI Era" - Argues software engineering shifts from production to orchestration + verification as AI makes code generation cheap. Source: [arXiv:2602.04830](https://arxiv.org/html/2602.04830v1)
[^2]: Boundary Value Analysis research consistently shows errors cluster at input extremes (min, max, off-by-one). See Ranorex, "Boundary Value Analysis" and NVIDIA HEPH framework for AI-driven positive/negative test specification.
[^3]: Cockburn, Alistair / Larman, Craig — "Protected Variation: The Importance of Being Closed" (IEEE Software). Reformulates the Open-Closed Principle as: "Identify points of predicted variation and create a stable interface around them." See also Fowler, Martin — [YAGNI](https://martinfowler.com/bliki/Yagni.html) for the distinction between presumptive and known features.
@@ -163,6 +171,36 @@ You are a senior systems architect writing a system oriented specification.
---
## Verify Behavior
> Behavioral examples at system boundaries. Concrete enough to become tests, abstract enough to survive refactoring. Not test code — no assertions, mocks, or framework syntax.
| ID | Scenario | Given | When | Then | Edge Category |