Merged

Cc #435

15 changes: 8 additions & 7 deletions cc/README.md
@@ -52,20 +52,20 @@ cargo test --release -p posixutils-cc

The compiler supports C input via `-` for stdin, and can output intermediate representations:

-- `--dump-asm` - Output assembly to stdout (instead of invoking assembler/linker)
+- `-S -o -` - Output assembly to stdout (standard clang/gcc option)
- `--dump-ir` - Output IR before code generation

Examples:

```bash
# Compile from stdin, view generated assembly
-echo 'int main() { return 42; }' | ./target/release/pcc - --dump-asm
+echo 'int main() { return 42; }' | ./target/release/pcc - -S -o -

# View IR for a source file
./target/release/pcc myfile.c --dump-ir

# Using heredoc for multi-line test cases
-./target/release/pcc - --dump-asm <<'EOF'
+./target/release/pcc - -S -o - <<'EOF'
int add(int a, int b) {
    return a + b;
}
@@ -94,24 +94,25 @@ Supported:
- Bitfields (named, unnamed, zero-width for alignment)

Not yet implemented:
-- goto, longjmp, setjmp
+- longjmp, setjmp
- `inline` and inlining support
- multi-register returns (for structs larger than 8 bytes)
- -fverbose-asm
- Complex initializers
-- VLAs (variable-length arrays)
-- _Complex and _Atomic types
-- Thread-local storage, alignas, etc.
- top builtins to implement:
  __builtin_expect
  __builtin_clz / clzl / clzll
  __builtin_ctz / ctzl / ctzll
  __sync_synchronize
  __sync_fetch_and_add (and maybe a couple of its siblings)
  __builtin_unreachable (helps optimizations + silences some warnings)
- string interning
- DCE and other opt passes
- assembly peephole optimizations
+- _Complex
+- C11 Alignment Specifiers (_Alignas, _Alignof)
+- C11 Thread-Local Storage (_Thread_local) and atomics (_Atomic)
+- Other C11 features: _Static_assert, _Generic, _Noreturn, anonymous structs

## Known Issues

341 changes: 341 additions & 0 deletions cc/TODO.md
@@ -157,3 +157,344 @@ ARM64:
- [C/C++11 mappings to processors](https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html)
- [LOCK prefix - x86 reference](https://www.felixcloutier.com/x86/lock)
- [AArch64 atomic access - Microsoft](https://devblogs.microsoft.com/oldnewthing/20220811-00/?p=106963)

---

### C99 Complex Numbers (`_Complex`)

#### Overview

Add C99 complex number support. The `<complex.h>` header and library functions (`cabs`, `csqrt`, etc.) are provided by the system - the compiler only needs to handle the type and arithmetic.

#### Types

| Type | Size | Layout |
|------|------|--------|
| `float _Complex` | 8 bytes | `{float real, float imag}` |
| `double _Complex` | 16 bytes | `{double real, double imag}` |
| `long double _Complex` | 32 bytes (where `long double` is 16 bytes) | `{long double real, long double imag}` |

#### Syntax

```c
#include <complex.h>

double _Complex z1 = 1.0 + 2.0*I; // _Complex keyword
float _Complex z2 = 3.0f - 4.0f*I; // float complex
```

**Constraints:**
- `_Complex` must combine with `float`, `double`, or `long double`
- Cannot combine with integer types (`int _Complex` is invalid)
- `complex` is a macro from `<complex.h>` expanding to `_Complex`

#### Implementation Phases

**Phase 1: Type System (`cc/types.rs`)**

1. Add `COMPLEX` modifier to `TypeModifiers`
2. Add three complex type variants or track via modifier
3. Size: 2× the base float type; Alignment: same as base type
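
The size and alignment rule in step 3 can be sanity-checked against a host compiler. The sketch below is not pcc code; `complex_from_parts` and `complex_part` are hypothetical helper names. It relies on the C99 guarantee that a complex type has the same representation as an array of two elements of the corresponding real type.

```c
#include <assert.h>
#include <string.h>

/* A complex type is laid out like an array of two reals:
 * element 0 is the real part, element 1 the imaginary part. */
_Static_assert(sizeof(float _Complex) == 2 * sizeof(float), "float _Complex size");
_Static_assert(sizeof(double _Complex) == 2 * sizeof(double), "double _Complex size");
_Static_assert(sizeof(long double _Complex) == 2 * sizeof(long double),
               "long double _Complex size");
_Static_assert(_Alignof(double _Complex) == _Alignof(double),
               "double _Complex alignment");

/* Hypothetical helper: build a complex value from its two parts by
 * copying a two-element array over it. */
double _Complex complex_from_parts(double re, double im)
{
    double parts[2] = { re, im };
    double _Complex z;
    memcpy(&z, parts, sizeof z);
    return z;
}

/* Hypothetical helper: read back one part (0 = real, 1 = imaginary). */
double complex_part(double _Complex z, int index)
{
    double parts[2];
    assert(index == 0 || index == 1);
    memcpy(parts, &z, sizeof parts);
    return parts[index];
}
```

Calling `complex_part(complex_from_parts(1.0, 2.0), 1)` reads the imaginary part back through the array view, which is the same layout the type system would need to record.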

**Phase 2: Parser (`cc/parse/parser.rs`)**

1. Add `_Complex` keyword to lexer
2. Parse as type specifier (like `long`, `unsigned`)
3. Validate: only with `float`/`double`/`long double`

**Phase 3: Linearizer - Arithmetic Expansion (`cc/linearize.rs`)**

Complex arithmetic must be expanded to real operations:

```c
// Addition: c = a + b
c.real = a.real + b.real;
c.imag = a.imag + b.imag;

// Subtraction: c = a - b
c.real = a.real - b.real;
c.imag = a.imag - b.imag;

// Multiplication: c = a * b
c.real = a.real*b.real - a.imag*b.imag;
c.imag = a.real*b.imag + a.imag*b.real;

// Division: c = a / b (simplified, production needs overflow handling)
denom = b.real*b.real + b.imag*b.imag;
c.real = (a.real*b.real + a.imag*b.imag) / denom;
c.imag = (a.imag*b.real - a.real*b.imag) / denom;
```
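
The simplified division above forms `b.real*b.real + b.imag*b.imag` directly, which can overflow even when the quotient is representable. One common remedy is Smith's algorithm, which scales by the ratio of the divisor's smaller component to its larger one. The following is a minimal sketch under that assumption; `struct cparts` and `cdiv_smith` are hypothetical names, and the NaN/infinity recovery described in C99 Annex G is not attempted.

```c
#include <assert.h>

/* Hypothetical stand-in for a double _Complex value split into parts. */
struct cparts { double real, imag; };

static double abs_d(double x) { return x < 0 ? -x : x; }

/* Smith's algorithm: compute a / b without forming |b|^2 directly. */
struct cparts cdiv_smith(struct cparts a, struct cparts b)
{
    struct cparts c;
    assert(b.real != 0.0 || b.imag != 0.0);   /* division by zero not handled */
    if (abs_d(b.real) >= abs_d(b.imag)) {
        double ratio = b.imag / b.real;
        double denom = b.real + b.imag * ratio;
        c.real = (a.real + a.imag * ratio) / denom;
        c.imag = (a.imag - a.real * ratio) / denom;
    } else {
        double ratio = b.real / b.imag;
        double denom = b.real * ratio + b.imag;
        c.real = (a.real * ratio + a.imag) / denom;
        c.imag = (a.imag * ratio - a.real) / denom;
    }
    return c;
}
```

With both operands equal to `1e300 + 1e300i`, the simplified formula overflows in `denom`, while this version yields `1 + 0i`.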

**Phase 4: Code Generation - ABI**

x86-64 SysV:
- `float _Complex`: returned in xmm0 (both parts packed)
- `double _Complex`: returned in xmm0 (real) + xmm1 (imag)
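- `long double _Complex`: returned in st0 (real) + st1 (imag); passed in memory
- Arguments (`float`/`double`): same pattern as returns, so one packed SSE register for `float _Complex` and two SSE registers for `double _Complex`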

ARM64:
- Treated as struct of two floats/doubles
- Passed/returned in two FP registers (d0+d1 or s0+s1)

**Phase 5: Member Access (Optional GCC Extensions)**

```c
double _Complex z = 1.0 + 2.0*I;
double r = __real__ z; // 1.0
double i = __imag__ z; // 2.0
__real__ z = 3.0; // Can assign to parts
```

Alternative: users can call `creal(z)` and `cimag(z)` which are library functions.

#### Implementation Order

| Order | Component | Complexity |
|-------|-----------|------------|
| 1 | Type system (`COMPLEX` modifier) | Low |
| 2 | Parser (`_Complex` keyword) | Low |
| 3 | Basic load/store of complex values | Medium |
| 4 | Addition/subtraction expansion | Medium |
| 5 | Multiplication expansion | Medium |
| 6 | Division expansion | Medium-High |
| 7 | ABI for function calls | Medium |
| 8 | `__real__`/`__imag__` extensions | Low (optional) |

#### What the Compiler Does NOT Need

- Library functions (`cabs`, `csqrt`, `cexp`, etc.) - just normal calls
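- The `I` macro - defined in `<complex.h>` as `_Complex_I` (note: some system headers, glibc for example, define `_Complex_I` with the GNU imaginary-constant suffix `1.0iF`, so the lexer may still need to accept that form)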
- The `complex` macro - defined in `<complex.h>` as `_Complex`

#### References

- [C99 Complex arithmetic - cppreference](https://en.cppreference.com/w/c/numeric/complex)
- [complex.h - cppreference](https://en.cppreference.com/w/c/header/complex)

---

### C11 Alignment Specifiers (`_Alignas`, `_Alignof`)

#### Overview

Add C11 alignment control to specify/query memory alignment of objects.

#### Syntax

```c
_Alignas(16) float sse_data[4]; // Align to 16-byte boundary
_Alignas(int) char c; // Align char like int
size_t align = _Alignof(double); // Query alignment (compile-time constant)
```

**Two forms:**
1. `_Alignas(constant-expression)` - explicit byte alignment (must be power of 2)
2. `_Alignas(type-name)` - align like another type

#### Restrictions

- Cannot apply to: bit-fields, `register` variables, function parameters, typedefs
- Cannot reduce alignment below natural alignment (only increase)
- When multiple `_Alignas` specifiers appear, the strictest (largest) wins
- `_Alignof` cannot be applied to function types or incomplete types
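
A short illustration of the two `_Alignas` forms and the "strictest wins" rule, checked at run time by inspecting addresses. This is a sketch for a host compiler that already supports C11 alignment; `sse_buffer`, `as_double`, and `is_aligned` are names invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Two specifiers on one declaration: the stricter one (16) applies. */
_Alignas(8) _Alignas(16) unsigned char sse_buffer[32];

/* Type-name form: align a char array like a double. */
_Alignas(double) char as_double[sizeof(double)];

/* _Alignof yields an integer constant expression, usable at compile time. */
_Static_assert(_Alignof(double) >= _Alignof(char), "alignment ordering");

/* Returns nonzero when p is a multiple of align bytes. */
int is_aligned(const void *p, size_t align)
{
    assert(align != 0);
    return (uintptr_t)p % align == 0;
}
```

For instance, `is_aligned(sse_buffer, 16)` should hold even though one of the two specifiers only asked for 8.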

#### Implementation Phases

**Phase 1: Type System (`cc/types.rs`)**

1. Add `alignment: Option<u32>` field to relevant type structures
2. Add helpers: `alignment_of(type)`, `set_explicit_alignment()`
3. Track whether alignment is natural vs explicitly specified

**Phase 2: Parser (`cc/parse/parser.rs`)**

1. Add `_Alignas` and `_Alignof` keywords to lexer
2. Parse `_Alignas(expr)` and `_Alignas(type-name)` as declaration specifiers
3. Parse `_Alignof(type-name)` as unary expression (returns `size_t`)
4. Semantic checks: power of 2, not less than natural alignment
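
The semantic checks in step 4 might be sketched as below. The example is written in C for consistency with the other snippets in this file, although pcc itself is Rust; `check_alignas` and the `alignas_result` enum are hypothetical. It also accounts for the C11 rule that an alignment specification of zero has no effect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum alignas_result {
    ALIGNAS_OK,         /* use the requested alignment */
    ALIGNAS_NO_EFFECT,  /* _Alignas(0): keep the natural alignment */
    ALIGNAS_NOT_POW2,   /* error: not a power of two */
    ALIGNAS_TOO_SMALL   /* error: weaker than the natural alignment */
};

static bool is_power_of_two(uint64_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Classifies an _Alignas operand against the type's natural alignment. */
enum alignas_result check_alignas(uint64_t requested, uint64_t natural)
{
    assert(is_power_of_two(natural));
    if (requested == 0)
        return ALIGNAS_NO_EFFECT;
    if (!is_power_of_two(requested))
        return ALIGNAS_NOT_POW2;
    if (requested < natural)
        return ALIGNAS_TOO_SMALL;
    return ALIGNAS_OK;
}
```

The two error results correspond to diagnostics the parser would emit; the other two tell the type system which alignment to record.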

**Phase 3: IR Extensions (`cc/ir.rs`)**

1. Add alignment field to `Alloca` instruction for local variables
2. Add alignment attribute to global variable definitions

**Phase 4: Code Generation**

For globals:
```asm
.balign 16
symbol:
.zero 64
```

For locals:
- Adjust stack pointer with additional padding
- Use aligned stack slots

**Phase 5: Struct Layout**

- When computing struct layout, member `_Alignas` affects padding
- Struct alignment = max of all member alignments
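
The layout rule can be sketched as a single pass over the members. `struct member`, `align_up`, and `layout_struct` are hypothetical names, bit-fields are ignored, and the static assertions compare the result with what a host compiler computes for the same declaration (assuming a 4-byte `int`).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one struct member. */
struct member { size_t size; size_t align; size_t offset; };

/* Rounds value up to the next multiple of align (a power of two). */
static size_t align_up(size_t value, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (value + align - 1) & ~(align - 1);
}

/* Assigns offsets in declaration order, stores the struct alignment in
 * *out_align, and returns the size including tail padding. */
size_t layout_struct(struct member *m, size_t count, size_t *out_align)
{
    size_t offset = 0, max_align = 1;
    for (size_t i = 0; i < count; i++) {
        offset = align_up(offset, m[i].align);   /* padding before member */
        m[i].offset = offset;
        offset += m[i].size;
        if (m[i].align > max_align)
            max_align = m[i].align;
    }
    *out_align = max_align;
    return align_up(offset, max_align);
}

/* Host-compiler reference for { char; _Alignas(16) int; char }. */
struct sample { char c; _Alignas(16) int x; char d; };
_Static_assert(offsetof(struct sample, x) == 16, "x padded to 16");
_Static_assert(sizeof(struct sample) == 32, "tail padding to 16");
```

Feeding the members `{1, 1}`, `{4, 16}`, `{1, 1}` through `layout_struct` gives offsets 0, 16, 20, a size of 32, and an alignment of 16, matching `struct sample`.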

#### Implementation Order

| Order | Component | Complexity |
|-------|-----------|------------|
| 1 | `_Alignof` operator | Easy - compile-time type query |
| 2 | `_Alignas` for globals | Easy - `.balign` directives |
| 3 | `_Alignas` for struct members | Moderate - layout changes |
| 4 | `_Alignas` for locals | Moderate - stack frame adjustments |

#### References

- [_Alignas - cppreference.com](https://en.cppreference.com/w/c/language/_Alignas.html)
- [Alignment (C11) - Microsoft Learn](https://learn.microsoft.com/en-us/cpp/c-language/alignment-c)

---

### C11 Thread-Local Storage (`_Thread_local`)

#### Overview

Add thread-local storage class specifier for per-thread variable instances.

#### Syntax

```c
_Thread_local int last_error;        // Each thread gets its own copy
static _Thread_local int counter;    // Thread-local + internal linkage
extern _Thread_local int shared_tls; // Declaration of TLS from another TU
```

**Constraints:**
- Can combine with `static` or `extern`, but not `auto` or `register`
- Cannot be used on function parameters; at block scope it is allowed only together with `static` or `extern`
- GCC extension `__thread` is equivalent
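
The constraints can be seen in a few declarations that a C11 host compiler accepts, with the rejected form left in a comment. All identifiers here are made up for illustration; the example runs on a single thread, so it shows the declaration rules rather than per-thread isolation.

```c
#include <assert.h>
#include <limits.h>

_Thread_local int tls_counter;             /* file scope, external linkage */
static _Thread_local int tls_private = 5;  /* file scope, internal linkage */

int next_id(void)
{
    /* Block scope is allowed only together with static or extern. */
    static _Thread_local int id;
    /* _Thread_local int bad;   <- rejected: block scope without static/extern */
    assert(id < INT_MAX);       /* guard against signed overflow */
    return ++id;
}

int bump_counters(void)
{
    tls_counter += 1;
    tls_private += 1;
    return tls_counter + tls_private;
}
```

Each thread that calls `next_id()` would start counting from 1, since `id` has one instance per thread.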

#### TLS Memory Models

| Model | Use Case | Mechanism |
|-------|----------|-----------|
| Local-Exec | Non-preemptible in executables | TP offset is link-time constant |
| Initial-Exec | Preemptible at program start | GOT entry holds fixed TP offset |
| Local-Dynamic | Non-preemptible in shared libs | Module ID lookup, local offsets |
| General-Dynamic | Preemptible in shared libs | Full runtime lookup via `__tls_get_addr` |

For a simple compiler targeting executables, **Local-Exec** is sufficient and simplest.

#### Implementation Phases

**Phase 1: Lexer/Parser**

1. Add `_Thread_local` keyword (and `__thread` as extension)
2. Parse as storage class specifier
3. Validate: not with `auto`/`register`, and not at block scope without `static`/`extern`

**Phase 2: Symbol Table (`cc/symbol.rs`)**

1. Add `is_thread_local: bool` to `Symbol`
2. Track TLS storage class during declaration

**Phase 3: IR (`cc/ir.rs`)**

1. Mark global variables as TLS in IR representation
2. Add `is_tls` flag to global definitions

**Phase 4: Code Generation - x86-64**

Local-Exec model (simplest, for executables):
```asm
# Read TLS variable
movl %fs:symbol@TPOFF, %eax

# Write TLS variable
movl $42, %fs:symbol@TPOFF

# Get address of TLS variable
movq %fs:0, %rax
leaq symbol@TPOFF(%rax), %rax
```

Relocations needed:
- `R_X86_64_TPOFF32` - 32-bit signed TP offset

Section placement:
- Initialized: `.tdata` section
- Zero-initialized: `.tbss` section

**Phase 5: Code Generation - ARM64**

Local-Exec model:
```asm
# Read TLS variable (default -mtls-size=24)
mrs x0, tpidr_el0
add x0, x0, #:tprel_hi12:symbol, lsl #12
add x0, x0, #:tprel_lo12_nc:symbol
ldr w0, [x0]
```

Relocations needed:
- `R_AARCH64_TLSLE_ADD_TPREL_HI12`
- `R_AARCH64_TLSLE_ADD_TPREL_LO12_NC`

#### Implementation Order

| Order | Component | Complexity |
|-------|-----------|------------|
| 1 | Parser + symbol tracking | Low |
| 2 | IR representation | Low |
| 3 | x86-64 Local-Exec codegen | Medium |
| 4 | ARM64 Local-Exec codegen | Medium |
| 5 | Initial-Exec model (optional) | High |
| 6 | General-Dynamic (for DSOs) | High |

#### Complexity Assessment

TLS is **significantly more complex** than alignment:
- Requires special relocations the assembler/linker must handle
- Platform-specific thread pointer access (`%fs` on x86-64, `tpidr_el0` on ARM64)
- Multiple models depending on PIC/PIE/shared library context
- Runtime support for dynamic models (`__tls_get_addr`)

**Recommendation:** Start with Local-Exec model only, which works for simple executables and avoids runtime dependencies.

#### References

- [Thread-local storage - Wikipedia](https://en.wikipedia.org/wiki/Thread-local_storage)
- [All about thread-local storage - MaskRay](https://maskray.me/blog/2021-02-14-all-about-thread-local-storage)
- [GCC Thread-Local Documentation](https://gcc.gnu.org/onlinedocs/gcc/Thread-Local.html)
- [thread_local - cppreference.com](https://en.cppreference.com/w/c/thread/thread_local)

---

### Other C11 Features

| Feature | Description | Complexity |
|---------|-------------|------------|
| `_Static_assert(expr, msg)` | Compile-time assertion | Easy |
| `_Generic(expr, type: val, ...)` | Type-generic selection | Moderate |
| `_Noreturn` | Function attribute (never returns) | Easy |
| Anonymous structs/unions | `struct { struct { int x; }; };` | Moderate |
| `<stdalign.h>` | `alignas`/`alignof` macros | Preprocessor only |
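
For the two "Moderate" rows, a small example of what the parser and type checker would have to accept. `TYPE_NAME`, `struct vec2`, and the helper functions are hypothetical and only illustrate the language features.

```c
#include <assert.h>
#include <string.h>

/* _Generic picks one association based on the type of the controlling
 * expression; the controlling expression itself is not evaluated. */
#define TYPE_NAME(x) _Generic((x), \
    int: "int",                    \
    double: "double",              \
    char *: "char *",              \
    default: "other")

/* Members of anonymous structs/unions are accessed as if they were
 * declared directly in the enclosing struct. */
struct vec2 {
    union {
        struct { float x, y; };
        float v[2];
    };
};

struct vec2 vec2_make(float x, float y)
{
    struct vec2 p;
    p.x = x;    /* reaches through the anonymous union and struct */
    p.y = y;
    return p;
}

float vec2_get(struct vec2 p, int i)
{
    assert(i == 0 || i == 1);
    return p.v[i];
}
```

Here `TYPE_NAME(1.5)` selects the `double` association, and `vec2_get(vec2_make(1.0f, 2.0f), 1)` reads `y` through the array view of the same storage.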

#### `_Static_assert` Implementation

```c
_Static_assert(sizeof(int) == 4, "int must be 32 bits");
```

1. Parser: recognize keyword, parse `(constant-expr, string-literal)`
2. Semantic: evaluate expression at compile time
3. If false: emit error with the string message
4. No codegen needed - purely compile-time

#### `_Noreturn` Implementation

```c
_Noreturn void exit(int status);
```

1. Parser: recognize as function specifier
2. Type system: mark function type as noreturn
3. Semantic: warn if function can return
4. Codegen: can omit function epilogue, enable optimizations
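
A small usage sketch showing why the type-system mark matters to callers: after a call to a `_Noreturn` function, the compiler can treat the rest of the path as unreachable. `fatal` and `checked_div` are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical error helper: prints a message and never returns. */
_Noreturn void fatal(const char *msg)
{
    assert(msg != NULL);
    fprintf(stderr, "fatal: %s\n", msg);
    exit(EXIT_FAILURE);
}

int checked_div(int a, int b)
{
    if (b != 0)
        return a / b;
    fatal("division by zero");
    /* No return statement is needed here: control cannot continue past
     * the call to a _Noreturn function. */
}
```

Without the `_Noreturn` mark on `fatal`, compilers commonly warn that control reaches the end of the non-void function `checked_div`.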