| 1 | +--- |
| 2 | +sidebar_position: 2 |
| 3 | +sidebar_label: 'Lesson 2: How LLMs Work' |
| 4 | +--- |
| 5 | + |
| 6 | +# How LLMs Generate Code |
| 7 | + |
| 8 | +Before diving into workflows and techniques, you need to understand how your AI coding assistant actually behaves. This isn't about neural network architecture—it's about the behavioral quirks that will impact your daily work. |
| 9 | + |
| 10 | +## Learning Objectives |
| 11 | + |
| 12 | +- Understand LLM behavior patterns that affect code generation |
| 13 | +- Recognize when and why hallucinations occur |
| 14 | +- Build mental models for effective AI interaction |
| 15 | +- Know what to verify and when |
| 16 | + |
| 17 | +## The Core: Token Prediction |
| 18 | + |
| 19 | +**An LLM is sophisticated autocomplete.** That's it. |
| 20 | + |
| 21 | +It predicts the next token (roughly a word, word fragment, or code symbol) based on statistical patterns learned from a very large body of training text and code. When you prompt it with: |
| 22 | + |
| 23 | +```python |
| 24 | +def calculate_ |
| 25 | +``` |
| 26 | + |
| 27 | +It generates a probability distribution over possible next tokens, for example (illustrative numbers): `total` (32%), `sum` (28%), `price` (15%), `average` (12%)... |
| 28 | + |
| 29 | +It then samples from that distribution to produce output. |
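The sampling step can be sketched in a few lines of Python. This is only an illustration: the probabilities are the made-up numbers from the text, the `other` entry is a hypothetical catch-all for the remaining 13%, and `sample_next_token` is a name invented for this example, not part of any real model API.

```python
import random

# Illustrative distribution from the text; "other" is a hypothetical
# catch-all for the remaining 13% of probability mass.
NEXT_TOKEN_PROBS = {
    "total": 0.32,
    "sum": 0.28,
    "price": 0.15,
    "average": 0.12,
    "other": 0.13,
}


def sample_next_token(probs, rng):
    """Pick one token at random, weighted by its probability."""
    tokens = list(probs)
    weights = list(probs.values())
    return rng.choices(tokens, weights=weights, k=1)[0]


rng = random.Random()  # unseeded, so each run can produce different picks
completions = [
    "calculate_" + sample_next_token(NEXT_TOKEN_PROBS, rng) for _ in range(5)
]
print(completions)
```

Running it several times should print different lists, with `calculate_total` and `calculate_sum` showing up most often.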
| 30 | + |
| 31 | +### Why This Matters |
| 32 | + |
| 33 | +This prediction mechanism creates specific behavioral patterns you'll encounter every day: |
| 34 | + |
| 35 | +## 5 Behavioral Facts That Impact Your Workflow |
| 36 | + |
| 37 | +### 1. Pattern Matching ≠ Reasoning |
| 38 | + |
| 39 | +The model recognizes patterns from training data. It doesn't reason about correctness. |
| 40 | + |
| 41 | +**Example:** |
| 42 | +```typescript |
| 43 | +// Prompt: "Add error handling to this function" |
| 44 | +async function fetchUser(id: string) { |
| 45 | + const response = await fetch(`/api/users/${id}`); |
| 46 | + return response.json(); |
| 47 | +} |
| 48 | + |
| 49 | +// AI might generate: |
| 50 | +async function fetchUser(id: string) { |
| 51 | + try { |
| 52 | + const response = await fetch(`/api/users/${id}`); |
| 53 | + return response.json(); |
| 54 | + } catch (error) { |
| 55 | + console.log('Error:', error); // ❌ Logs but doesn't handle |
| 56 | + return null; // ❌ Type-unsafe |
| 57 | + } |
| 58 | +} |
| 59 | +``` |
| 60 | + |
| 61 | +**The pattern looks right** (try-catch), but the logic is wrong (silent failures, incorrect return type). |
| 62 | + |
| 63 | +**Implication:** Review for correctness, not just syntax. The code will *look* professional. |
| 64 | + |
| 65 | +### 2. Non-Deterministic Output |
| 66 | + |
| 67 | +The same prompt can generate different code on each run. How much the output varies is controlled by the `temperature` parameter. |
| 68 | + |
| 69 | +**Try it:** |
| 70 | +1. Ask: "Write a function to validate email addresses" |
| 71 | +2. Ask again (same prompt) |
| 72 | +3. Compare the two implementations |
| 73 | + |
| 74 | +You'll get different regex patterns, different validation logic, different function signatures. |
| 75 | + |
| 76 | +**Implication:** If you don't like the first answer, regenerate. Treat it like a junior developer—sometimes their second attempt is better. |
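To see why temperature changes how varied the output is, here is a minimal sketch of one common formulation: divide the model's raw scores (logits) by the temperature before applying softmax. The scores below are hypothetical, and real providers may apply additional sampling steps, so treat this as an illustration rather than a description of any specific product.

```python
import math


def apply_temperature(logits, temperature):
    """Convert raw scores to probabilities via softmax(logits / temperature)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score in logits.values()]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return {token: e / total for token, e in zip(logits, exps)}


# Hypothetical raw scores for three candidate tokens.
logits = {"total": 2.0, "sum": 1.8, "price": 1.0}

for t in (0.2, 1.0, 2.0):
    probs = apply_temperature(logits, t)
    print(t, {tok: round(p, 2) for tok, p in probs.items()})
```

Low temperatures concentrate probability on the top candidate, so regenerating tends to give similar answers. Higher temperatures flatten the distribution, so regenerating gives more varied answers.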
| 77 | + |
| 78 | +### 3. Hallucinations Are Inevitable |
| 79 | + |
| 80 | +The model will confidently generate: |
| 81 | +- Functions from libraries that don't exist |
| 82 | +- APIs with wrong signatures |
| 83 | +- Deprecated methods as if they're current |
| 84 | +- Plausible documentation for fake features |
| 85 | + |
| 86 | +**Real example:** |
| 87 | +```python |
| 88 | +# Prompt: "Use pandas to remove outliers" |
| 89 | +import pandas as pd |
| 90 | + |
| 91 | +df = df.remove_outliers(method='iqr', threshold=1.5) # ❌ No such method |
| 92 | +``` |
| 93 | + |
| 94 | +`remove_outliers()` doesn't exist in pandas. But it *should*, and the AI has seen similar patterns, so it hallucinates it. |
| 95 | + |
| 96 | +**Implication:** |
| 97 | +- **Always verify imports** - Check the actual library docs |
| 98 | +- **Always verify API signatures** - Don't trust method names or parameters |
| 99 | +- **Higher risk with newer libraries** - Less training data = more hallucinations |
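A quick first check before opening the docs is to ask the installed library whether a name exists at all. The helper below is a sketch; `api_exists` is a name invented for this example, and it is demonstrated on the standard-library `json` module so that it runs without extra installs.

```python
import importlib


def api_exists(module_name, dotted_attr):
    """Return True if module_name imports and dotted_attr resolves on it."""
    try:
        obj = importlib.import_module(module_name)
    except ImportError:
        return False
    for part in dotted_attr.split("."):
        if not hasattr(obj, part):
            return False
        obj = getattr(obj, part)
    return True


print(api_exists("json", "dumps"))                      # real function
print(api_exists("json", "remove_outliers"))            # invented name
print(api_exists("not_a_real_module_xyz", "anything"))  # invented module
```

With pandas installed, the same helper could be pointed at the example above via `api_exists("pandas", "DataFrame.remove_outliers")`. A passing check only shows the name exists; parameters and behavior still need to be confirmed in the documentation.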
| 100 | + |
| 101 | +### 4. Limited Context Window |
| 102 | + |
| 103 | +The model only sees what fits in its context window (the limit varies by model; a typical figure is ~200K tokens ≈ 150K words of code/text). |
| 104 | + |
| 105 | +**Consequences:** |
| 106 | +- Can't see your entire codebase |
| 107 | +- Forgets earlier conversation after many exchanges |
| 108 | +- No memory across sessions |
| 109 | + |
| 110 | +**Example:** |
| 111 | +``` |
| 112 | +You (start of session): "We're using React 19 with the new `use` hook" |
| 113 | +[... 50 messages of conversation ...] |
| 114 | +You: "Add a data fetching hook" |
| 115 | +AI: [Generates React 18 pattern, forgot you said React 19] |
| 116 | +``` |
| 117 | + |
| 118 | +**Implication:** |
| 119 | +- Re-state critical context when starting new conversations |
| 120 | +- For large refactors, break into smaller, focused sessions |
| 121 | +- Don't assume it remembers constraints from 30 messages ago |
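The idea of re-stating critical context can be sketched as a small budgeting routine: always keep the pinned constraints, then fill the rest of the budget with the newest messages. Everything here is an assumption made for illustration, including the names `estimate_tokens` and `build_context` and the rough rule of about 4 characters per token. Real assistants manage their context in their own ways.

```python
def estimate_tokens(text):
    """Very rough estimate: about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)


def build_context(pinned, history, budget):
    """Keep pinned constraints, then as many of the newest messages as fit."""
    remaining = budget - sum(estimate_tokens(p) for p in pinned)
    kept = []
    for message in reversed(history):  # walk from newest to oldest
        cost = estimate_tokens(message)
        if cost > remaining:
            break
        kept.append(message)
        remaining -= cost
    return pinned + list(reversed(kept))


pinned = ["Constraint: we use React 19 with the new `use` hook."]
history = [f"message {i}: " + "x" * 200 for i in range(50)]
history.append("Add a data fetching hook")

context = build_context(pinned, history, budget=300)
print(len(context), "messages kept out of", len(history) + len(pinned))
print(context[0])
print(context[-1])
```

Older messages fall out of the budget, but the pinned constraint survives because it is re-sent every time, which is the same effect as re-stating it yourself.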
| 122 | + |
| 123 | +### 5. No Execution Feedback (Without Tools) |
| 124 | + |
| 125 | +Base LLMs don't compile or run code. They don't know if their output works. |
| 126 | + |
| 127 | +Modern coding agents (like Cursor, Aider, Claude Code) *do* have execution tools, but the underlying LLM still doesn't know whether its code works unless the result of running it is fed back in. |
| 128 | + |
| 129 | +**Implication:** Close the feedback loop yourself: |
| 130 | +1. Generate code |
| 131 | +2. Run/compile/test |
| 132 | +3. Paste errors back |
| 133 | +4. Let it fix based on actual failures |
| 134 | + |
| 135 | +This iteration is where AI really shines—rapid error fixing when you provide concrete feedback. |
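The run-and-paste-back loop can be sketched with the standard library. Here `run_snippet` is a hypothetical helper and `generated` is a stand-in for AI-written code with a bug. Only run code you have read, ideally in a sandbox.

```python
import subprocess
import sys


def run_snippet(code, timeout=10):
    """Run Python source in a fresh interpreter; return (ok, error_text)."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode == 0, result.stderr.strip()


# Stand-in for AI-generated code that fails on an empty list (hypothetical).
generated = (
    "def average(xs):\n"
    "    return sum(xs) / len(xs)\n"
    "\n"
    "print(average([]))\n"
)

ok, error_text = run_snippet(generated)
if not ok:
    # This traceback is the concrete feedback to paste into the conversation.
    print("Paste this back to the assistant:\n" + error_text)
```

The captured traceback names the exact exception and line, which gives the model far more to work with than "it doesn't work".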
| 136 | + |
| 137 | +## Practical Mental Model |
| 138 | + |
| 139 | +Think of an LLM as an **extremely well-read intern** who: |
| 140 | + |
| 141 | +- ✅ Has read every Stack Overflow post and GitHub repo |
| 142 | +- ✅ Can quickly draft code that matches common patterns |
| 143 | +- ✅ Works 24/7 and responds instantly |
| 144 | +- ❌ Doesn't verify their work compiles |
| 145 | +- ❌ Sometimes confidently cites documentation that doesn't exist |
| 146 | +- ❌ Forgets context from last week (or even yesterday) |
| 147 | +- ❌ Can't reason about business logic without examples |
| 148 | + |
| 149 | +**You wouldn't merge their PR without review.** Same applies here. |
| 150 | + |
| 151 | +## Interactive Example: Seeing Probabilities |
| 152 | + |
| 153 | +You usually can't see the actual probability distributions, but you can observe non-determinism. This demo simulates it by picking a completion at random: |
| 154 | + |
| 155 | +```javascript live |
| 156 | +function TokenPredictionDemo() { |
| 157 | + const [outputs, setOutputs] = React.useState([]); |
| 158 | + |
| 159 | + const generateCompletion = () => { |
| 160 | + // Simulating different completions for the same prompt |
| 161 | + const possibilities = [ |
| 162 | + 'total', |
| 163 | + 'sum', |
| 164 | + 'average', |
| 165 | + 'result', |
| 166 | + 'value' |
| 167 | + ]; |
| 168 | + const choice = possibilities[Math.floor(Math.random() * possibilities.length)]; |
| 169 | + setOutputs((prev) => [...prev, choice]); |
| 170 | + }; |
| 171 | + |
| 172 | + return ( |
| 173 | + <div> |
| 174 | + <p><code>def calculate_</code> → <strong>?</strong></p> |
| 175 | + <button onClick={generateCompletion} style={{padding: '8px 16px', marginBottom: '10px'}}> |
| 176 | + Generate Next Token |
| 177 | + </button> |
| 178 | + <div> |
| 179 | + {outputs.map((output, i) => ( |
| 180 | + <div key={i}>Attempt {i + 1}: <code>calculate_{output}</code></div> |
| 181 | + ))} |
| 182 | + </div> |
| 183 | + </div> |
| 184 | + ); |
| 185 | +} |
| 186 | +``` |
| 187 | + |
| 188 | +Click multiple times: same prompt, different outputs. The demo picks uniformly at random; a real model samples from its predicted distribution, with the spread controlled by temperature. |
| 189 | + |
| 190 | +## What About Visual Explanations? |
| 191 | + |
| 192 | +If you want deeper intuition about how transformers work: |
| 193 | + |
| 194 | +- **3Blue1Brown**: ["But what is a GPT?"](https://www.youtube.com/watch?v=wjZofJX0v4M) - Visual introduction to transformers (27 min) |
| 195 | +- **TensorFlow Playground**: [Interactive neural network](https://playground.tensorflow.org/) - See how networks learn patterns |
| 196 | +- **Transformer Explainer**: Interactive visualization of token flow through layers |
| 197 | + |
| 198 | +**But for using AI coding agents effectively, you don't need to understand attention mechanisms or backpropagation.** The 5 behavioral facts above are what matter. |
| 199 | + |
| 200 | +## Hands-On Exercise: Spotting Hallucinations |
| 201 | + |
| 202 | +**Scenario:** You need to integrate with the Stripe payment API. You're not familiar with the latest Stripe SDK. |
| 203 | + |
| 204 | +**Your Task:** |
| 205 | + |
| 206 | +1. Prompt your AI assistant: "Write a TypeScript function to create a Stripe payment intent for $50 USD" |
| 207 | +2. Review the generated code |
| 208 | +3. Open the [actual Stripe docs](https://docs.stripe.com/api/payment_intents/create) in parallel |
| 209 | +4. Identify discrepancies: |
| 210 | + - Wrong parameter names? |
| 211 | + - Missing required fields? |
| 212 | + - Deprecated methods? |
| 213 | + - Invented convenience methods? |
| 214 | + |
| 215 | +**Expected Findings:** The AI will likely get 70-80% correct, but might: |
| 216 | +- Use slightly wrong parameter names (`payment_method` vs `paymentMethod`) |
| 217 | +- Miss required fields like `currency` |
| 218 | +- Use a non-existent helper method |
| 219 | +- Generate outdated API patterns |
| 220 | + |
| 221 | +**Key Lesson:** Even for well-documented, popular APIs, hallucinations happen. Always verify against source documentation. |
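Part of this verification can be automated when the real function has an explicit signature. The sketch below compares the keyword names an assistant used against what a function actually accepts. `unknown_keywords` is a hypothetical helper, and `create_payment` is a stand-in written for this example, not the real Stripe API.

```python
import inspect


def unknown_keywords(func, kwargs):
    """Return the keyword names in kwargs that func's signature rejects."""
    params = inspect.signature(func).parameters
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        # func takes **kwargs, so the signature alone cannot catch bad names.
        return []
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    accepted = {name for name, p in params.items() if p.kind in keyword_kinds}
    return sorted(name for name in kwargs if name not in accepted)


# Hypothetical stand-in for an SDK call; amount is in cents by assumption.
def create_payment(amount, currency, payment_method=None):
    return {"amount": amount, "currency": currency, "payment_method": payment_method}


suggested = {"amount": 5000, "currency": "usd", "paymentMethod": "pm_123"}
print(unknown_keywords(create_payment, suggested))  # ['paymentMethod']
```

Many SDK functions accept arbitrary keyword arguments, in which case this check reports nothing and the documentation remains the only reliable source.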
| 222 | + |
| 223 | +**Bonus Challenge:** Try the same prompt with a newer, less popular library (e.g., Hono, Effect-TS). Notice more hallucinations? Less training data = less reliable patterns. |
| 224 | + |
| 225 | +## Key Takeaways |
| 226 | + |
| 227 | +1. **LLMs predict tokens statistically** - They don't reason or verify correctness |
| 228 | +2. **Non-determinism is a feature** - Regenerate if first output isn't great |
| 229 | +3. **Hallucinations are inevitable** - Always verify imports, APIs, and library calls |
| 230 | +4. **Context is limited and temporary** - Re-state important constraints |
| 231 | +5. **Close the feedback loop** - Paste errors back for rapid iteration |
| 232 | + |
| 233 | +**Rule of thumb:** Review AI output the way you would review a pull request from a smart junior developer. It's often 80% correct and fast to generate, but it needs verification. |
| 234 | + |
| 235 | +--- |
| 236 | + |
| 237 | +**Next:** Lesson 3: Mental Models for AI Collaboration (coming soon) |