@jgarzik jgarzik commented Dec 8, 2025

No description provided.

Changes Made

  1. Extended IR Initializer enum (ir/mod.rs):
  - Added String(String) for char array string literals
  - Added Array { elem_size, total_size, elements } for array initializers
  - Added Struct { total_size, fields } for struct initializers (with field
  sizes)
  - Added SymAddr(String) for address-of symbol references

  2. Updated linearizer (ir/linearize.rs):
  - Added ast_init_to_ir() to convert AST expressions to IR Initializers
  - Added ast_init_list_to_ir() to handle {} initializer lists
  - Supports designated initializers (.field = value, [index] = value)
  - Sorts elements by offset to handle out-of-order designators
  - Updated both linearize_global_decl() and linearize_static_local()

  3. Added new directives (arch/lir.rs):
  - Added Ascii(String) for non-null-terminated strings
  - Added QuadSym(Symbol) for 64-bit symbol address relocations

  4. Updated codegen (both arch/x86_64/codegen.rs and
  arch/aarch64/codegen.rs):
  - Added emit_initializer_data() to recursively emit complex initializers
  - Handles proper padding/gaps between array elements and struct fields
  - Emits correct size directives (.byte, .short, .long, .quad)

  5. Added tests (tests/globals/mod.rs):
  - test_global_array_init - basic array initializer
  - test_global_array_partial_init - partial array init (rest zeros)
  - test_global_array_designated - [index] = value syntax
  - test_global_struct_init - basic struct initializer
  - test_global_struct_designated - .field = value syntax
  - test_global_struct_partial_init - partial struct init
  - test_global_string_array - char array from string literal
  - test_static_local_array - static local array
  - test_static_local_struct - static local struct
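
The initializer handling described in items 1, 2, and 4 can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `Int` variant, the byte-offset tuples, the helper names `directive`, `emit_members`, and `emit_to_lines`, and the use of `.zero` for padding are all assumptions made for the example. Only the variant names `String`, `Array`, `Struct`, and `SymAddr` and the size directives come from the list above.

```rust
// Hypothetical, simplified mirror of the IR initializer variants listed above.
// Field layouts are assumptions; offsets and sizes are in bytes.
#[allow(dead_code)]
#[derive(Debug, Clone)]
enum Initializer {
    Int(i64), // assumed scalar variant, not named in the PR text
    String(String),
    Array { elem_size: usize, total_size: usize, elements: Vec<(usize, Initializer)> },
    Struct { total_size: usize, fields: Vec<(usize, usize, Initializer)> },
    SymAddr(String),
}

// Pick the data directive that matches a scalar size.
fn directive(size: usize) -> &'static str {
    match size {
        1 => ".byte",
        2 => ".short",
        4 => ".long",
        _ => ".quad",
    }
}

// Sort members by offset (designators may arrive out of order), then emit
// each one, filling any gap before it and after the last one with zeros.
fn emit_members(mut members: Vec<(usize, usize, &Initializer)>, total: usize, out: &mut Vec<String>) {
    members.sort_by_key(|m| m.0);
    let mut pos = 0;
    for &(off, size, value) in members.iter() {
        if off > pos {
            out.push(format!(".zero {}", off - pos));
        }
        emit(value, size, out);
        pos = off + size;
    }
    if total > pos {
        out.push(format!(".zero {}", total - pos));
    }
}

// Recursively emit one initializer that occupies `size` bytes.
fn emit(init: &Initializer, size: usize, out: &mut Vec<String>) {
    match init {
        Initializer::Int(v) => out.push(format!("{} {}", directive(size), v)),
        Initializer::SymAddr(sym) => out.push(format!(".quad {}", sym)),
        Initializer::String(s) => {
            // Rust's Debug escaping only approximates assembler string escapes.
            out.push(format!(".ascii {:?}", s));
            if size > s.len() {
                out.push(format!(".zero {}", size - s.len()));
            }
        }
        Initializer::Array { elem_size, total_size, elements } => {
            let members = elements.iter().map(|(off, v)| (*off, *elem_size, v)).collect();
            emit_members(members, *total_size, out);
        }
        Initializer::Struct { total_size, fields } => {
            let members = fields.iter().map(|(off, sz, v)| (*off, *sz, v)).collect();
            emit_members(members, *total_size, out);
        }
    }
}

fn emit_to_lines(init: &Initializer, size: usize) -> Vec<String> {
    let mut out = Vec::new();
    emit(init, size, &mut out);
    out
}

fn main() {
    // Models: int a[4] = { [2] = 7, [0] = 1 };  (designators out of order)
    let arr = Initializer::Array {
        elem_size: 4,
        total_size: 16,
        elements: vec![(8, Initializer::Int(7)), (0, Initializer::Int(1))],
    };
    for line in emit_to_lines(&arr, 16) {
        println!("{}", line);
    }
}
```

Sorting before emission keeps the emitter a single forward pass: it only needs to track the current byte position and pad up to the next member's offset.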

Changes Made

  1. Parser (parse/parser.rs)
  - Enhanced eval_const_expr to support:
    - sizeof(type) and sizeof(expr)
    - Cast expressions
    - All comparison operators (already supported before this change)
    - Conditional expressions (already supported before this change)
  - Fixed array size parsing (2 locations at lines ~3244 and ~3877):
    - Was: Only handled IntLit directly
    - Now: Uses eval_const_expr to handle full constant expressions like 2 + 3,
      sizeof(int) * 2, and X + Y (enum constants)

  2. Linearizer (ir/linearize.rs)
  - Enhanced eval_const_expr to match the parser's capabilities (sizeof,
  casts, conditionals, all operators)
  - Enhanced ast_init_to_ir to:
    - Handle enum constants in initializers
    - Use eval_const_expr as a fallback for any expression that might be
  constant (handles int x = 2 + 3;)

  3. Tests (tests/globals/mod.rs)
  - Added 3 new tests:
    - test_global_const_expr_init: Tests int x = 2 + 3;, 10 * 4 + 2,
      (1 << 4) | 10, etc.
    - test_array_const_expr_size: Tests int arr[2 + 3]; and
      int arr[sizeof(int) * 2];
    - test_enum_const_expr: Tests using enum constants in array sizes and
  initializers

  4. Documentation
  - Removed "constant expression evaluation" from the "not yet implemented"
  list in README.md
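
As a rough illustration of the constant folding described above, the sketch below evaluates a tiny expression tree of the kind that appears in array sizes and global initializers. The `Expr` and `Op` types, the `bin` helper, and the enum-constant map are hypothetical stand-ins; the PR text only names `eval_const_expr`, and its real signature is not shown here. Casts are left out to keep the example short.

```rust
use std::collections::HashMap;

// Hypothetical AST; the compiler's real expression type is not shown in this PR text.
enum Expr {
    IntLit(i64),
    SizeofType(usize),  // size assumed to be already resolved by the type system
    Ident(String),      // enum constant, looked up by name
    Binary(Box<Expr>, Op, Box<Expr>),
    Cond(Box<Expr>, Box<Expr>, Box<Expr>),
}

enum Op {
    Add,
    Mul,
    Shl,
    Or,
    Lt,
}

// Returns Some(value) when the expression is a compile-time constant, and
// None otherwise, so callers can fall back to a non-constant path.
fn eval_const_expr(e: &Expr, enums: &HashMap<String, i64>) -> Option<i64> {
    match e {
        Expr::IntLit(v) => Some(*v),
        Expr::SizeofType(n) => Some(*n as i64),
        Expr::Ident(name) => enums.get(name).copied(),
        Expr::Binary(l, op, r) => {
            let a = eval_const_expr(l, enums)?;
            let b = eval_const_expr(r, enums)?;
            match op {
                Op::Add => a.checked_add(b),
                Op::Mul => a.checked_mul(b),
                // Reject shift counts outside the 64-bit range.
                Op::Shl => if (0..64).contains(&b) { Some(a << b) } else { None },
                Op::Or => Some(a | b),
                Op::Lt => Some((a < b) as i64),
            }
        }
        Expr::Cond(c, t, f) => {
            if eval_const_expr(c, enums)? != 0 {
                eval_const_expr(t, enums)
            } else {
                eval_const_expr(f, enums)
            }
        }
    }
}

// Small helper to build binary nodes without repeating Box::new.
fn bin(l: Expr, op: Op, r: Expr) -> Expr {
    Expr::Binary(Box::new(l), op, Box::new(r))
}

fn main() {
    let mut enums = HashMap::new();
    enums.insert("X".to_string(), 3);
    enums.insert("Y".to_string(), 4);

    // int arr[sizeof(int) * 2];  with sizeof(int) assumed to be 4
    let size = bin(Expr::SizeofType(4), Op::Mul, Expr::IntLit(2));
    println!("{:?}", eval_const_expr(&size, &enums));

    // int arr[X + Y];  using enum constants
    let sum = bin(Expr::Ident("X".into()), Op::Add, Expr::Ident("Y".into()));
    println!("{:?}", eval_const_expr(&sum, &enums));
}
```

Returning `Option` rather than panicking matches the "fallback" use described for `ast_init_to_ir`: an expression that is not constant simply yields `None`.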
@jgarzik jgarzik requested a review from Copilot December 8, 2025 11:47
@jgarzik jgarzik self-assigned this Dec 8, 2025
@jgarzik jgarzik added the documentation, enhancement, and cleanup labels Dec 8, 2025
Copilot AI left a comment

Pull request overview

This PR adds comprehensive support for C99/C11 features including the noreturn attribute, several GCC-style builtins (__builtin_ctz*, __builtin_unreachable), setjmp/longjmp non-local jumps, and complex global/static initializers (arrays, structs, designated initializers). The implementation includes type system updates, parser enhancements, IR extensions, architecture-specific code generation for both x86-64 and AArch64, dead code elimination optimizations, extensive test coverage, and detailed documentation.

Key Changes:

  • Added noreturn function attribute support (__attribute__((noreturn)) and _Noreturn keyword) with proper type system integration
  • Implemented count trailing zeros builtins (__builtin_ctz, __builtin_ctzl, __builtin_ctzll) with native instruction support (BSF on x86-64, RBIT+CLZ on AArch64)
  • Added __builtin_unreachable() for optimization hints with trap instructions (UD2/BRK) and DCE integration
  • Implemented setjmp/longjmp for non-local control flow with proper ABI-compliant code generation
  • Enhanced constant expression evaluation to support sizeof, casts, and complex expressions for array sizes and initializers
  • Added complex initializer support (arrays, structs, strings, designated initializers, symbol addresses) with proper zero-padding and offset handling
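
The RBIT+CLZ lowering mentioned for AArch64 relies on a simple identity: reversing the bits of a value turns its trailing zeros into leading zeros. The sketch below checks that identity with Rust's standard integer methods. The function name `ctz_via_rbit_clz` is made up for this example, and it models only the arithmetic, not the compiler's code generation.

```rust
// Count trailing zeros the way an RBIT + CLZ sequence does:
// reverse the bit order, then count leading zeros.
fn ctz_via_rbit_clz(x: u32) -> u32 {
    x.reverse_bits().leading_zeros()
}

fn main() {
    // Compare against the standard library's trailing_zeros for a few inputs.
    for x in [1u32, 8, 0x50, 0x8000_0000] {
        assert_eq!(ctz_via_rbit_clz(x), x.trailing_zeros());
    }
    println!("{}", ctz_via_rbit_clz(40)); // 40 is 0b101000, so this prints 3
}
```

Note that in C the result of `__builtin_ctz(0)` is undefined, whereas this Rust model returns 32 for a zero input.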

Reviewed changes

Copilot reviewed 25 out of 25 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| cc/types.rs | Added NORETURN type modifier flag and noreturn field to Type struct for function types |
| cc/token/preprocess.rs | Updated __has_attribute and __has_builtin to report support for noreturn and new builtins |
| cc/parse/parser.rs | Added parsing for _Noreturn keyword, ctz/unreachable/setjmp/longjmp builtins, enhanced constant expression evaluation |
| cc/parse/ast.rs | Added AST nodes for Ctz/Ctzl/Ctzll, Unreachable, Setjmp, and Longjmp expressions |
| cc/ir/mod.rs | Added IR opcodes and initializer variants for new features |
| cc/ir/linearize.rs | Implemented linearization for new builtins, enhanced global initializer handling, added inline function semantic checks |
| cc/ir/dce.rs | Extended DCE to handle unreachable blocks and fold branches to unreachable |
| cc/arch/x86_64/lir.rs | Added UD2 (trap) and BSF (bit scan forward) x86-64 instructions |
| cc/arch/x86_64/codegen.rs | Implemented code generation for all new features with proper initializer emission |
| cc/arch/aarch64/lir.rs | Added BRK (trap), RBIT, and CLZ AArch64 instructions |
| cc/arch/aarch64/codegen.rs | Implemented AArch64 code generation for all new features |
| cc/arch/lir.rs | Added .ascii directive and QuadSym for symbol address relocations |
| cc/tests/globals/mod.rs | Added comprehensive tests for global/static array and struct initializers |
| cc/tests/features/*.rs | Added test files for noreturn, unreachable, setjmp, inline, and ctz builtins |
| cc/doc/ATTR.md | New documentation file for function attributes |
| cc/doc/BUILTIN.md | Updated with documentation for new builtins |
| cc/doc/README.md | New documentation index |
| cc/README.md | Updated limitations list to reflect implemented features |

and clippy warning fix
@jgarzik jgarzik merged commit 431dc7c into main Dec 8, 2025
4 checks passed
@jgarzik jgarzik deleted the cc branch December 8, 2025 12:20