import UShapeAttentionCurve from '@site/src/components/VisualElements/UShapeAttentionCurve';
import GroundingComparison from '@site/src/components/VisualElements/GroundingComparison';
-_/}
LLMs have a fundamental limitation: they only "know" what's in their training data (frozen at a point in time) and what's in their current context window (roughly 200K tokens for Claude Sonnet 4.5 and 400K tokens for GPT-5). Everything else is educated guessing.
@@ -19,11 +17,7 @@ This lesson covers the engineering techniques that turn agents from creative fic
**Scenario:** You're debugging an authentication bug in your API.
**Without grounding:** Generic advice like "Check your JWT validation logic"
-**With grounding:** Specific fix referencing your actual code: "In src/auth/jwt.ts:45, the validateJWT() function isn't checking token expiration"
-
-The difference is clear: ungrounded responses are generic and potentially wrong. Grounded responses reference your actual code and current best practices.
+<GroundingComparison />
## RAG: Retrieval-Augmented Generation
@@ -84,7 +78,8 @@ Claude Sonnet 4.5 has a 200K token context window. In practice, you'll get relia
**The U-shaped attention curve:** Information at the **beginning** and **end** of your context gets strong attention. Information in the **middle** gets skimmed or missed entirely. It's not a bug—it's how transformer attention mechanisms work under realistic constraints.
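One way to act on the U-shaped curve is to keep critical constraints at the edges of the prompt and put bulky retrieved material in the middle. The sketch below is an illustration under that assumption, not a technique prescribed by this lesson; `buildPrompt` and the section labels are hypothetical names.

```typescript
// Hypothetical helper: constraints appear at the start and are repeated at
// the end, so they sit in the high-attention regions of the context.
function buildPrompt(
  constraints: string[],
  retrievedChunks: string[],
  task: string,
): string {
  const constraintBlock = constraints.map((c) => `- ${c}`).join("\n");
  const sections = [
    `Constraints:\n${constraintBlock}`,
    // Retrieved material goes in the middle, where a miss costs the least.
    `Retrieved context:\n${retrievedChunks.join("\n\n")}`,
    `Task:\n${task}`,
    `Reminder of constraints:\n${constraintBlock}`,
  ];
  return sections.join("\n\n");
}

const examplePrompt = buildPrompt(
  ["Do not change the public API"],
  ["chunk A", "chunk B"],
  "Fix the token expiration check",
);
console.log(examplePrompt);
```

Repeating the constraints costs a few extra tokens, which is usually small next to the retrieved chunks.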
When you retrieve documentation and code chunks directly into your orchestrator context, search results rapidly fill the window and push critical constraints into that ignored middle. A few semantic searches that return 10+ code chunks each can add about 30K tokens, and web docs can add another 15K. The context fills with search mechanics before research completes, and the orchestrator loses track of its initial constraints.
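The arithmetic behind this can be sketched with a small budget tally. The 30K and 15K figures come from the estimates above and the 200K window is the Claude Sonnet 4.5 figure cited in this lesson. The 2K size of the initial constraints and the names `ContextItem` and `summarizeContext` are assumptions made for illustration.

```typescript
// Hypothetical tally of what occupies the orchestrator context.
interface ContextItem {
  label: string;
  tokens: number;
  kind: "constraint" | "retrieval";
}

const CONTEXT_WINDOW_TOKENS = 200000; // window size cited in the lesson

function summarizeContext(items: ContextItem[]) {
  const total = items.reduce((sum, item) => sum + item.tokens, 0);
  const retrieval = items
    .filter((item) => item.kind === "retrieval")
    .reduce((sum, item) => sum + item.tokens, 0);
  return {
    total,
    retrieval,
    retrievalShare: retrieval / total, // fraction of used context that is search output
    windowUsed: total / CONTEXT_WINDOW_TOKENS,
  };
}

const contextItems: ContextItem[] = [
  // Assumed size for the initial task description and constraints.
  { label: "initial constraints", tokens: 2000, kind: "constraint" },
  { label: "semantic search results", tokens: 30000, kind: "retrieval" },
  { label: "web docs", tokens: 15000, kind: "retrieval" },
];

const budget = summarizeContext(contextItems);
console.log(budget);
```

Under these assumptions the window is under a quarter full, yet search output makes up more than 95% of what the model is reading, which is how the initial constraints end up buried.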