112 changes: 112 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,112 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Teaching Mode

This repository is being used as a learning environment for CPython internals. The goal is to teach the user how CPython works, not to write code for them.

**Behavior Guidelines:**
- Describe implementations and concepts; don't write code unless explicitly asked
- Ask questions to verify understanding ("What do you think ob_refcnt does?")
- Point to specific files and line numbers for the user to read
- When the user is stuck, give hints before giving answers
- Reference `teaching-todo.md` for the structured curriculum
- Reference `teaching-notes.md` for detailed research (student should not read this)
- Encourage use of `dis` module, GDB, and debug builds for exploration

**The learning project:** Implementing a `Record` type and `BUILD_RECORD` opcode (~300 LoC). This comprehensive project covers:
- PyObject/PyVarObject fundamentals (custom struct, refcounting)
- Type slots (tp_repr, tp_hash, tp_dealloc, tp_getattro, sq_length, sq_item)
- The evaluation loop (BUILD_RECORD opcode in ceval.c)
- Build system integration

A working solution exists on the `teaching-cpython-solution` branch for reference.
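
As a rough mental model of what the C type should do, the slots listed above can be mirrored in a pure-Python stand-in. This is only a sketch: the keyword-argument constructor and the `_names`, `_values` and `_hash` attributes are hypothetical, and the repr format is taken from the comment in `Include/recordobject.h`.

```python
class Record:
    """Pure-Python stand-in for the C Record type (illustrative only)."""

    __slots__ = ("_names", "_values", "_hash")

    def __init__(self, **fields):
        # object.__setattr__ bypasses the immutability guard defined below.
        object.__setattr__(self, "_names", tuple(fields))
        object.__setattr__(self, "_values", tuple(fields.values()))
        object.__setattr__(self, "_hash", None)

    def __getattr__(self, name):  # roughly tp_getattro
        # Only reached when normal lookup fails, i.e. for field names.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self._values[self._names.index(name)]
        except ValueError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        raise AttributeError("Record is immutable")

    def __len__(self):  # roughly sq_length
        return len(self._values)

    def __getitem__(self, index):  # roughly sq_item
        return self._values[index]

    def __hash__(self):  # roughly tp_hash, cached like r_hash
        if self._hash is None:
            object.__setattr__(self, "_hash", hash((self._names, self._values)))
        return self._hash

    def __eq__(self, other):
        if not isinstance(other, Record):
            return NotImplemented
        return self._names == other._names and self._values == other._values

    def __repr__(self):  # roughly tp_repr
        body = ", ".join(f"{n}={v!r}" for n, v in zip(self._names, self._values))
        return f"Record({body})"


r = Record(x=10, y=20)
print(r, r.x, r[1], len(r))
```

The C version has to do by hand what Python does implicitly here: allocate the value array, manage reference counts, and release everything in `tp_dealloc`.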

## Build Commands

```bash
# Debug build (required for learning - enables assertions and refcount tracking)
./configure --with-pydebug
make

# Smoke test (the binary is ./python.exe on macOS; on Linux it is ./python)
./python.exe --version
./python.exe -c "print('hello')"

# Run specific test
./python.exe -m test test_sys
```

After modifying opcodes or grammar:
```bash
make regen-all # Regenerate generated files
make # Rebuild
```

## Architecture Overview

### The Object Model (start here)
- `Include/object.h` - PyObject, PyVarObject, Py_INCREF/DECREF
- `Include/cpython/object.h` - PyTypeObject (the C struct behind every type object, including `type` itself)
- `Objects/*.c` - Concrete type implementations

### Core Data Structures
| Type | Header | Implementation |
|------|--------|----------------|
| int | `Include/cpython/longintrepr.h` | `Objects/longobject.c` |
| tuple | `Include/cpython/tupleobject.h` | `Objects/tupleobject.c` |
| list | `Include/cpython/listobject.h` | `Objects/listobject.c` |
| dict | `Include/cpython/dictobject.h` | `Objects/dictobject.c` |
| set | `Include/setobject.h` | `Objects/setobject.c` |
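
Tuples are a variable-size object (`PyVarObject`), the same layout family the `Record` type uses. A small sketch, assuming a standard CPython build, shows that each extra item costs exactly one pointer-sized slot:

```python
import sys

pointer_size = tuple.__itemsize__  # bytes per stored PyObject* slot

empty = sys.getsizeof(())
triple = sys.getsizeof((1, 2, 3))

print("bytes per item:", pointer_size)
print("growth for 3 items:", triple - empty)
```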

### Execution Engine
- `Include/opcode.h` - Opcode definitions (generated from `Lib/opcode.py`; do not edit by hand)
- `Lib/opcode.py` - Python-side opcode definitions (source of truth)
- `Include/cpython/code.h` - Code object structure
- `Include/cpython/frameobject.h` - Frame object (execution context)
- `Python/ceval.c` - **The interpreter loop** - giant switch on opcodes, stack machine
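
Code and frame objects can also be inspected from Python before reading the C structs. The functions `add` and `current_code` below are made up for illustration:

```python
import sys


def add(a, b):
    total = a + b
    return total


code = add.__code__
print(code.co_argcount)   # number of positional parameters
print(code.co_varnames)   # local variable names, parameters first
print(code.co_stacksize)  # value-stack depth the compiler reserved
print(len(code.co_code))  # raw bytecode length in bytes


def current_code():
    # The running frame points back at the code object it executes.
    return sys._getframe().f_code


print(current_code() is current_code.__code__)
```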

### Compiler Pipeline
- `Grammar/python.gram` - PEG grammar
- `Parser/` - Tokenizer and parser
- `Python/compile.c` - AST to bytecode
- `Python/symtable.c` - Symbol table building
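
The same stages are reachable from Python through `ast`, `symtable` and `compile`, which gives a quick way to see each step's output. The source string and the name `g` are arbitrary examples:

```python
import ast
import symtable

source = "def f(a):\n    return a + g\n"

tree = ast.parse(source)                             # parser: text -> AST
table = symtable.symtable(source, "<demo>", "exec")  # symbol table pass
code = compile(tree, "<demo>", "exec")               # compiler: AST -> bytecode

f_scope = table.get_children()[0]  # the scope of function f
print(type(tree.body[0]).__name__)
print(f_scope.lookup("a").is_parameter())
print(f_scope.lookup("g").is_global())

namespace = {"g": 1}
exec(code, namespace)  # runs the module body, which defines f
print(namespace["f"](41))
```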

## Key Concepts for Teaching

**Everything is a PyObject:**
```c
// Simplified; the real definition lives in Include/object.h
typedef struct _object {
    Py_ssize_t ob_refcnt;       // Reference count
    PyTypeObject *ob_type;      // Pointer to the object's type
} PyObject;
```
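
The effect of `ob_refcnt` can be observed from Python without touching C. This sketch only compares differences between readings, since the absolute number returned by `sys.getrefcount` depends on the interpreter version:

```python
import sys

x = []
before = sys.getrefcount(x)

holder = [x, x]  # two new strong references to the same list
during = sys.getrefcount(x)

del holder       # both references are released again
after = sys.getrefcount(x)

print(during - before, after - before)
```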

**The stack machine:** Bytecode operates on a value stack. `LOAD_FAST` pushes, `BINARY_ADD` pops two and pushes one, etc.
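
A toy interpreter makes the push/pop discipline concrete. This is a sketch of the idea only, not how `ceval.c` is written; the `run` function and its `(opname, arg)` program format are invented for illustration, and only the opcode names are borrowed from CPython.

```python
def run(program, local_vars):
    """Execute a list of (opname, arg) pairs on a value stack."""
    stack = []
    for opname, arg in program:
        if opname == "LOAD_FAST":
            stack.append(local_vars[arg])
        elif opname == "LOAD_CONST":
            stack.append(arg)
        elif opname == "BINARY_ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opname == "BUILD_LIST":
            # Pop the top `arg` values and push one list holding them.
            start = len(stack) - arg
            items = stack[start:]
            del stack[start:]
            stack.append(items)
        elif opname == "RETURN_VALUE":
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode: {opname}")
    raise RuntimeError("program ended without RETURN_VALUE")


add_program = [
    ("LOAD_FAST", "a"),
    ("LOAD_FAST", "b"),
    ("BINARY_ADD", None),
    ("RETURN_VALUE", None),
]
result = run(add_program, {"a": 2, "b": 3})
print(result)
```

An opcode such as `BUILD_RECORD` would follow the `BUILD_LIST` pattern: its operand says how many stack entries to consume before pushing a single new object.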

**Type slots:** `PyTypeObject` holds function pointers (`tp_hash`, `tp_repr`, `tp_call`) that define behavior. For a sequence, `len(x)` ends up calling `x->ob_type->tp_as_sequence->sq_length`; mappings provide `tp_as_mapping->mp_length` instead.
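
Slot dispatch is visible from Python: built-ins such as `len()` consult the type, not the instance. The `Box` class here is a made-up example:

```python
class Box:
    def __len__(self):
        return 3


box = Box()
# Assigning __len__ on the instance does not change what len() calls.
box.__len__ = lambda: 99

print(len(box))       # goes through the slot filled from Box.__len__
print(box.__len__())  # ordinary attribute lookup finds the instance attribute
```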

## Useful Commands for Learning

```bash
# Disassemble Python code
./python.exe -c "import dis; dis.dis(lambda: [1,2,3])"

# Check reference count (works in any build; typically one higher than expected because of the call's own temporary reference)
./python.exe -c "import sys; x = []; print(sys.getrefcount(x))"

# Show total refcount after each statement (debug build)
./python.exe -X showrefcount

# Run with GDB
gdb ./python.exe
(gdb) break _PyEval_EvalFrameDefault
(gdb) run -c "1 + 1"
```
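
For scripted exploration, `dis.get_instructions` yields the same information as `dis.dis` in structured form. The function `make_pair` is an arbitrary example; the exact instruction sequence varies between CPython versions, so only the `BUILD_LIST` instruction is singled out.

```python
import dis


def make_pair(a, b):
    return [a, b]


for instr in dis.get_instructions(make_pair):
    print(instr.opname, instr.arg)

# The operand of BUILD_LIST is the number of stack items it consumes.
build_ops = [i for i in dis.get_instructions(make_pair) if i.opname == "BUILD_LIST"]
print(build_ops[0].arg)
```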

## External Resources

- Developer Guide: https://devguide.python.org/
- CPython Internals Book: https://realpython.com/products/cpython-internals-book/
- PEP 3155 (Qualified name for classes and functions): Background on `__qualname__`
1 change: 1 addition & 0 deletions Include/opcode.h

Some generated files are not rendered by default.

56 changes: 56 additions & 0 deletions Include/recordobject.h
@@ -0,0 +1,56 @@
/* Record object interface - immutable named container */

#ifndef Py_RECORDOBJECT_H
#define Py_RECORDOBJECT_H
#ifdef __cplusplus
extern "C" {
#endif

#include "pyport.h"
#include "object.h"

/*
 * RecordObject: An immutable container with named fields.
 * Similar to a simplified namedtuple implemented in C.
 *
 * Features:
 *   - Attribute access by name: r.field_name
 *   - Indexing by position: r[0], r[1]
 *   - Hashable (can be used as dict key)
 *   - Equality comparison
 *   - Nice repr: Record(x=10, y=20)
 */

typedef struct {
    PyObject_VAR_HEAD
    Py_hash_t r_hash;       /* Cached hash, -1 if not yet computed */
    PyObject *r_names;      /* Tuple of field names (strings) */
    PyObject *r_values[1];  /* Flexible array of field values */
} RecordObject;

PyAPI_DATA(PyTypeObject) PyRecord_Type;

#define PyRecord_Check(op) PyObject_TypeCheck(op, &PyRecord_Type)
#define PyRecord_CheckExact(op) Py_IS_TYPE(op, &PyRecord_Type)

/* Create a new Record from names tuple and values array.
 *   names:  tuple of strings (field names) - reference is stolen
 *   values: array of PyObject* (field values) - references are stolen
 *   n:      number of fields
 * Returns: new Record object, or NULL on error
 */
PyAPI_FUNC(PyObject *) PyRecord_New(PyObject *names, PyObject **values, Py_ssize_t n);

/* Get field by index (returns borrowed reference) */
PyAPI_FUNC(PyObject *) PyRecord_GetItem(PyObject *record, Py_ssize_t index);

/* Get field by name (returns new reference) */
PyAPI_FUNC(PyObject *) PyRecord_GetFieldByName(PyObject *record, PyObject *name);

/* Get number of fields */
#define PyRecord_GET_SIZE(op) Py_SIZE(op)

#ifdef __cplusplus
}
#endif
#endif /* !Py_RECORDOBJECT_H */
1 change: 1 addition & 0 deletions Lib/opcode.py
@@ -212,5 +212,6 @@ def jabs_op(name, op):
def_op('SET_UPDATE', 163)
def_op('DICT_MERGE', 164)
def_op('DICT_UPDATE', 165)
def_op('BUILD_RECORD', 166) # Number of name/value pairs

del def_op, name_op, jrel_op, jabs_op
2 changes: 2 additions & 0 deletions Makefile.pre.in
@@ -434,6 +434,7 @@ OBJECT_OBJS= \
Objects/obmalloc.o \
Objects/picklebufobject.o \
Objects/rangeobject.o \
Objects/recordobject.o \
Objects/setobject.o \
Objects/sliceobject.o \
Objects/structseq.o \
@@ -1091,6 +1092,7 @@ PYTHON_HEADERS= \
$(srcdir)/Include/pythonrun.h \
$(srcdir)/Include/pythread.h \
$(srcdir)/Include/rangeobject.h \
$(srcdir)/Include/recordobject.h \
$(srcdir)/Include/setobject.h \
$(srcdir)/Include/sliceobject.h \
$(srcdir)/Include/structmember.h \
2 changes: 2 additions & 0 deletions Objects/object.c
@@ -14,6 +14,7 @@
#include "pycore_unionobject.h" // _PyUnion_Type
#include "frameobject.h"
#include "interpreteridobject.h"
#include "recordobject.h"

#ifdef Py_LIMITED_API
// Prevent recursive call _Py_IncRef() <=> Py_INCREF()
@@ -1863,6 +1864,7 @@ _PyTypes_Init(void)
INIT_TYPE(PyProperty_Type);
INIT_TYPE(PyRangeIter_Type);
INIT_TYPE(PyRange_Type);
INIT_TYPE(PyRecord_Type);
INIT_TYPE(PyReversed_Type);
INIT_TYPE(PySTEntry_Type);
INIT_TYPE(PySeqIter_Type);