
Commit 4c77cf8

ofriw and claude committed
Add token definition and practical context to LLM fundamentals
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
1 parent f3f5b2c commit 4c77cf8

File tree

1 file changed: +12 -0 lines changed

website/docs/understanding-the-tools/lesson-1-intro.md

Lines changed: 12 additions & 0 deletions
@@ -37,6 +37,18 @@ A Large Language Model is a statistical pattern matcher built on [transformer ar
 - **Samples from probability distributions** learned from training data
 - **Has zero consciousness, intent, or feelings**

+:::tip[What's a Token?]
+A **token** is the atomic unit of an LLM - the "pixel" of text processing. One token averages ~3-4 characters, but length varies widely: common short words are single tokens (`"the"`, `"is"`), while longer or rare words are split into subwords by algorithms like [Byte-Pair Encoding](https://en.wikipedia.org/wiki/Byte-pair_encoding).
+
+**Why it matters:**
+
+- **Cost:** LLM providers bill per token (input + output)
+- **Context limits:** The ~200K-token context window is your working-memory budget
+- **Performance:** Token-efficient prompts = faster responses and lower costs
+
+Rule of thumb: 1 token ≈ 0.75 words in English. This paragraph is ~150 tokens; a typical source file runs 3K-15K tokens.
+:::
+
 Think of it like an incredibly sophisticated autocomplete - one that's read most of the internet and can generate convincing continuations of any text pattern it's seen before.

 **Technical reality vs. marketing speak:**
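The added tip describes BPE subword splitting and a words-to-tokens rule of thumb. A minimal sketch of both, using OpenAI's open-source `tiktoken` library as the tokenizer - an assumption made purely for illustration, since Claude uses its own tokenizer and exact counts will differ:

```python
# pip install tiktoken -- OpenAI's open-source BPE tokenizer, used here only
# for illustration; Claude's tokenizer differs, so counts are approximate.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

# Short common words map to single tokens; long or rare words split into subwords.
for text in ["the", "is", "antidisestablishmentarianism"]:
    ids = enc.encode(text)
    pieces = [enc.decode([i]) for i in ids]
    print(f"{text!r}: {len(ids)} token(s) -> {pieces}")

# Rule of thumb from the tip: 1 token ≈ 0.75 English words,
# so token_count ≈ word_count / 0.75.
paragraph = "A token is the atomic unit of an LLM - the 'pixel' of text processing."
actual = len(enc.encode(paragraph))
estimate = round(len(paragraph.split()) / 0.75)
print(f"actual: {actual} tokens, rule-of-thumb estimate: {estimate}")
```

A quick count like this before pasting a large file into a prompt is a practical way to stay inside the ~200K-token context window the tip mentions.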
