
Commit 8ca1d57
Lesson 13 update
1 parent 07ee97c

3 files changed: +170 −2 lines

website/docs/practical-techniques/lesson-13-systems-thinking-specs.md

Lines changed: 58 additions & 2 deletions
@@ -74,6 +74,19 @@ Integration points are the doors in the boundary wall—where traffic crosses fr

Direction matters: inbound points need validation and rate limiting; internal pub/sub needs delivery guarantees.

### Extension Points

Not every integration point exists yet. When a specific variation is *committed*—funded, scheduled, required by a known deadline—declare the stable interface now so the current implementation doesn't cement itself.

| Variation | Stable Interface | Current | Planned By |
|-----------|-----------------|---------|------------|
| PayPal checkout | `PaymentGateway` interface | Stripe-only implementation | Q3 — committed |
| Multi-currency | `Amount { value, currency }` | USD-hardcoded | Not committed — omit |

The principle is Protected Variation[^3] (Cockburn/Larman): identify points of predicted variation and create a stable interface around them. The second row stays out—YAGNI gates which variations make it into the spec. Only committed business needs earn an abstraction.

Without this, agents build the simplest correct implementation—a hardcoded Stripe client. When PayPal arrives in Q3, that's a rewrite, not an extension. Declaring the interface now costs one abstraction; omitting it costs a migration.
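
As a sketch of what declaring the interface buys: the `PaymentGateway` name comes from the table above, while the method shape, the request/result types, and the `StripeGateway` internals are illustrative assumptions, not part of the lesson.

```typescript
// Hypothetical sketch — method names and types are illustrative, not prescribed.
interface ChargeRequest {
  amountMinorUnits: number; // e.g. cents; currency stays out until multi-currency is committed
  customerId: string;
  idempotencyKey: string;
}

interface ChargeResult {
  status: "succeeded" | "pending" | "failed";
  providerRef: string; // provider-side identifier (e.g. a Stripe PaymentIntent id)
}

// The stable interface declared now, per the Extension Points table.
interface PaymentGateway {
  charge(request: ChargeRequest): Promise<ChargeResult>;
}

// Today's only implementation stays Stripe-shaped internally, but callers
// depend on PaymentGateway, so a PayPalGateway in Q3 is an addition, not a rewrite.
class StripeGateway implements PaymentGateway {
  async charge(request: ChargeRequest): Promise<ChargeResult> {
    // ... call Stripe here; the details belong to the implementation, not the spec
    return { status: "pending", providerRef: "pi_placeholder" };
  }
}
```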
## State: What Persists, What Changes, What Recovers

State is where bugs hide. The state section forces you to account for what the system remembers.
@@ -183,6 +196,23 @@ The **Data** and **Stress** columns transform a constraint from a wish into a te

**Manifested By** answers how a test exercises the invariant. Without it, invariants are assertions nobody checks. An invariant violation means your data model is corrupted—make sure you can detect it.

## Verify Behavior: Concrete Scenarios at Boundaries

Constraints say NEVER. Invariants say ALWAYS. Neither answers: what *should* happen when `amount=0`?

Behavioral scenarios fill this gap—concrete Given-When-Then examples at system boundaries, specific enough to become tests without dictating test framework, mocks, or assertion syntax.

| ID | Given | When | Then | Edge Category |
|----|-------|------|------|---------------|
| B-001 | PaymentIntent in `pending` state | Webhook delivers `succeeded` with amount=0 | Transition to `succeeded`, balance unchanged | boundary value |
| B-002 | No matching PaymentIntent | Webhook delivers valid event for unknown intent | Return 200, log warning, no state change | null / missing |
| B-003 | Stripe API returns 503 | Client submits payment request | Return 502, queue for retry, no charge created | error propagation |
| B-004 | Two identical webhooks within 10ms | Both pass signature validation | First processes, second returns 200, no state change | concurrency |

Each scenario traces back to a constraint or invariant—B-001 exercises I-003 (balance integrity), B-004 exercises C-001 (no duplicate processing). The **edge category** column is a systematic checklist: boundary values, null/empty, error propagation, concurrency, temporal. Walk each category per interface; errors cluster at boundaries[^2] because agents don't reliably infer them.

The spec captures *what should happen*, not *how to test it*. Framework choices, mock configurations, and assertion syntax belong in implementation—they change with the codebase. Behavioral examples survive refactoring.
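
For illustration only, and exactly as the paragraph above says, this shape belongs to implementation, not the spec: one hypothetical handler skeleton that would satisfy B-002 and B-004. Every name in it is an assumption.

```typescript
// Illustrative only — the spec stops at Given-When-Then; this lives in implementation.
type WebhookEvent = { id: string; intentId: string; type: string };

interface IntentStore {
  find(intentId: string): Promise<{ state: string } | null>;
  markProcessed(eventId: string): Promise<boolean>; // false if event was already seen
}

async function handleWebhook(event: WebhookEvent, store: IntentStore): Promise<number> {
  // B-004 (concurrency): a second identical delivery is acknowledged, no state change.
  const firstDelivery = await store.markProcessed(event.id);
  if (!firstDelivery) return 200;

  // B-002 (null/missing): unknown intent is acknowledged and logged, nothing mutated.
  const intent = await store.find(event.intentId);
  if (intent === null) {
    console.warn(`webhook for unknown intent ${event.intentId}`);
    return 200;
  }

  // ... the state transition exercised by B-001 would go here
  return 200;
}
```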
## Quality Attributes: How Good Is Good Enough?

Quality attributes define measurable thresholds across three tiers: target (normal operations), degraded (alerting), and failure (paging).

@@ -195,6 +225,24 @@ Quality attributes define measurable thresholds across three tiers: target (norm

Target = SLO. Degraded = alerts fire. Failure = on-call gets paged. Three tiers give you an error budget before the first outage and make "good enough" concrete rather than aspirational.
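
A minimal sketch of how the three tiers might be written down as data the alerting layer consumes; the numeric values and field names are assumptions, not the lesson's SLOs.

```typescript
// Hypothetical thresholds — placeholder numbers, not prescribed values.
const latencyP95Ms = {
  target: 100,   // SLO: normal operations stay at or under this
  degraded: 250, // alerts fire when p95 crosses this
  failure: 1000, // on-call gets paged
};
```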
## Performance Budget: Decomposing SLOs

Quality Attributes says "Latency p95: 100ms." But the webhook flow has five steps. Which step gets how many milliseconds?

| Flow Step | Budget | Hot/Cold |
|-----------|--------|----------|
| Signature validation | 2ms | hot |
| Idempotency check (Redis) | 5ms | hot |
| Parse + validate payload | 3ms | hot |
| Update payment state (DB) | 15ms | hot |
| Publish event (queue) | 5ms | cold |
| **Total** | **30ms** | |
| **Headroom** | **70ms** | |

The budget forces two decisions agents can't make alone. First, *hot vs. cold path*: signature validation is synchronous and blocking—it gets a tight budget. Event publishing is async—it tolerates more. Second, *headroom*: the total is 30ms against a 100ms SLO, leaving 70ms for future operations on this path. Without decomposition, an agent might spend the entire budget on a single unoptimized query.

Per-operation budgets also surface algorithmic constraints. If "idempotency check" must complete in 5ms, that rules out a full-table scan—the agent knows to use an indexed lookup or bloom filter without being told.
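
A sketch of the kind of O(1) check that fits a 5ms budget, using Redis's standard SET with NX and an expiry; the key scheme and the 24h TTL are assumptions.

```typescript
import { createClient } from "redis";

type RedisClient = ReturnType<typeof createClient>;

// Sketch: O(1) idempotency check that can realistically fit a 5ms budget.
async function isFirstDelivery(redis: RedisClient, eventId: string): Promise<boolean> {
  // SET ... NX succeeds only when the key does not exist yet:
  // one keyed round trip, no scan anywhere.
  const reply = await redis.set(`webhook:seen:${eventId}`, "1", {
    NX: true,
    EX: 60 * 60 * 24, // assumed retention window for duplicate detection
  });
  return reply === "OK";
}
```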
## Flows: Tracing Execution

Flows trace execution from trigger to completion, revealing integration points and error handling gaps.
@@ -258,8 +306,8 @@ You don't fill every section. The template prompts systematic *consideration*—

| Complexity | Sections | Time |
|------------|----------|------|
| Simple (isolated, familiar) | Architecture + Interfaces + State | Hours |
-| Medium (cross-module) | + Constraints + Invariants + Flows | Days |
-| Complex (architectural) | + Quality Attributes + Security + Observability | Weeks |
+| Medium (cross-module) | + Constraints + Invariants + Verify Behavior + Flows | Days |
+| Complex (architectural) | + Quality Attributes + Performance Budget + Security + Observability + Extension Points | Weeks |
| System-level (new service) | + Deployment + Integration + Initialization | [Full template](/prompts/specifications/spec-template) |

This time is spent *thinking*, not writing. The bottleneck is understanding—researching existing codebase patterns via [exploration planning](/docs/methodology/lesson-3-high-level-methodology#phase-2-plan-strategic-decision) (Lesson 3), investigating best practices and domain knowledge via [ArguSeek](/docs/methodology/lesson-5-grounding#arguseek-isolated-context--state) (Lesson 5), and making architectural decisions. The spec itself is just the artifact of that thinking. Even a simple isolated feature requires hours because you need to trace boundaries, verify assumptions against the existing codebase, and research edge cases before committing to a design.
@@ -290,6 +338,14 @@ The [full spec template](/prompts/specifications/spec-template) includes section

- **Fix specs for architecture, fix code for bugs** — If the architecture is sound, patch the implementation. If the model or boundaries are wrong, fix the spec and regenerate.

- **Performance budgets decompose SLOs into implementation decisions** — A system-level "100ms p95" becomes per-operation allocations that constrain algorithmic choices. Hot/cold path distinction tells agents where latency matters.

- **Behavioral examples at boundaries prevent the gaps agents miss** — Constraints say NEVER, invariants say ALWAYS, but neither specifies what happens at `amount=0`. Concrete Given-When-Then scenarios—not test code—fill the gap where errors cluster.

- **Extension points require committed business needs, not speculation** — Protected Variation: identify predicted change, create a stable interface. YAGNI gates entry—only funded, scheduled variations earn an abstraction.

---

[^1]: Xu et al. (2025) - "When Code Becomes Abundant: Implications for Software Engineering in a Post-Scarcity AI Era" - Argues software engineering shifts from production to orchestration + verification as AI makes code generation cheap. Source: [arXiv:2602.04830](https://arxiv.org/html/2602.04830v1)

[^2]: Boundary Value Analysis research consistently shows errors cluster at input extremes (min, max, off-by-one). See Ranorex, "Boundary Value Analysis" and NVIDIA HEPH framework for AI-driven positive/negative test specification.

[^3]: Cockburn, Alistair / Larman, Craig — "Protected Variation: The Importance of Being Closed" (IEEE Software). Reformulates the Open-Closed Principle as: "Identify points of predicted variation and create a stable interface around them." See also Fowler, Martin — [YAGNI](https://martinfowler.com/bliki/Yagni.html) for the distinction between presumptive and known features.

website/shared-prompts/_generate-spec.mdx

Lines changed: 56 additions & 0 deletions
@@ -57,6 +57,14 @@ You are a senior systems architect writing a system oriented specification.

|-------|------|-------|-----------|
| [endpoint/event/service] | HTTP / async / in-process | [module] | [who calls it] |

### Extension Points

> Only committed business needs. If the variation is not funded/scheduled, omit — YAGNI applies.

| Variation | Stable Interface | Current Implementation | Planned By |
|-----------|-----------------|----------------------|------------|
| [predicted change] | [interface/abstraction] | [concrete impl] | [date/milestone] |

---

## Define Interfaces
@@ -163,6 +171,36 @@ You are a senior systems architect writing a system oriented specification.

---

## Verify Behavior

> Behavioral examples at system boundaries. Concrete enough to become tests, abstract enough to survive refactoring. Not test code — no assertions, mocks, or framework syntax.

| ID | Scenario | Given | When | Then | Edge Category |
|----|----------|-------|------|------|---------------|
| B-001 | [scenario name] | [precondition + concrete data] | [action] | [expected outcome] | [category] |

### B-001: [Scenario Name]

- **Given:** [setup with concrete values]
- **When:** [trigger with specific input]
- **Then:** [observable outcome]
- **Edge category:** boundary value / null-empty / error propagation / concurrency / temporal
- **Derived from:** C-[id] / I-[id]

### Edge Categories

> Walk each category per interface. Delete irrelevant rows.

| Category | Question |
|----------|----------|
| Boundary values | What happens at min, max, min-1, max+1? |
| Null / empty | What happens with missing or empty input? |
| Error propagation | When a dependency fails, what does the caller see? |
| Concurrency | What happens under simultaneous access? |
| Temporal | What happens with timing or ordering variations? |

---

## Trace Flows

### Primary Flow
@@ -272,6 +310,24 @@ You are a senior systems architect writing a system oriented specification.

---

## Budget Performance

> Decompose system-level SLOs from Quality Attributes into per-operation budgets along the critical path.

### Critical Path Budget

| Flow Step | Budget | Complexity | Hot/Cold | Measured By |
|-----------|--------|------------|----------|-------------|
| [step from Trace Flows] | [ms] | O([n]) | hot / cold | [metric name] |
| **Total** | **[ms]** | | | [end-to-end metric] |

- **Total must not exceed:** Quality Attributes latency target
- **Hot path:** [latency-critical steps — no I/O, no locks, tight budget]
- **Cold path:** [background/async steps — tolerates higher latency]
- **Headroom:** [% reserved for future operations on this path]

---

## Plan Deployment

### Strategy

website/shared-prompts/_spec-template.mdx

Lines changed: 56 additions & 0 deletions
@@ -45,6 +45,14 @@

|-------|------|-------|-----------|
| [endpoint/event/service] | HTTP / async / in-process | [module] | [who calls it] |

### Extension Points

> Only committed business needs. If the variation is not funded/scheduled, omit — YAGNI applies.

| Variation | Stable Interface | Current Implementation | Planned By |
|-----------|-----------------|----------------------|------------|
| [predicted change] | [interface/abstraction] | [concrete impl] | [date/milestone] |

---

## Define Interfaces
@@ -151,6 +159,36 @@

---

## Verify Behavior

> Behavioral examples at system boundaries. Concrete enough to become tests, abstract enough to survive refactoring. Not test code — no assertions, mocks, or framework syntax.

| ID | Scenario | Given | When | Then | Edge Category |
|----|----------|-------|------|------|---------------|
| B-001 | [scenario name] | [precondition + concrete data] | [action] | [expected outcome] | [category] |

### B-001: [Scenario Name]

- **Given:** [setup with concrete values]
- **When:** [trigger with specific input]
- **Then:** [observable outcome]
- **Edge category:** boundary value / null-empty / error propagation / concurrency / temporal
- **Derived from:** C-[id] / I-[id]

### Edge Categories

> Walk each category per interface. Delete irrelevant rows.

| Category | Question |
|----------|----------|
| Boundary values | What happens at min, max, min-1, max+1? |
| Null / empty | What happens with missing or empty input? |
| Error propagation | When a dependency fails, what does the caller see? |
| Concurrency | What happens under simultaneous access? |
| Temporal | What happens with timing or ordering variations? |

---

## Trace Flows

### Primary Flow
@@ -260,6 +298,24 @@

---

## Budget Performance

> Decompose system-level SLOs from Quality Attributes into per-operation budgets along the critical path.

### Critical Path Budget

| Flow Step | Budget | Complexity | Hot/Cold | Measured By |
|-----------|--------|------------|----------|-------------|
| [step from Trace Flows] | [ms] | O([n]) | hot / cold | [metric name] |
| **Total** | **[ms]** | | | [end-to-end metric] |

- **Total must not exceed:** Quality Attributes latency target
- **Hot path:** [latency-critical steps — no I/O, no locks, tight budget]
- **Cold path:** [background/async steps — tolerates higher latency]
- **Headroom:** [% reserved for future operations on this path]

---

## Plan Deployment

### Strategy
