
Fix try catch tests #15

Merged
cs01 merged 6 commits into main from fix-try-catch-tests, Feb 19, 2026

cs01 commented on Feb 19, 2026

Fix try/catch tests, JSON.parse on macOS, and require native compiler in test suite

Summary

  • Flatten TryStatement AST node back to direct fields (catchParam, catchBody) instead of nested catchClause: { param, body } — fixes a self-hosting bug where the native compiler couldn't handle chained inline struct field access (tryStmt.catchClause.body), causing catch block bodies to be silently dropped from generated IR
  • Eliminate array-of-objects pattern and Set<string> / string += accumulation in json.ts JSON.parse codegen — fixes a self-hosting crash on macOS ARM64 where the native compiler segfaulted when compiling JSON.parse<T>() calls
  • Make npm test build .build/chadc automatically if missing, and fail loudly instead of silently falling back to the Node.js interpreter
  • Run compiler tests with both native and Node.js compilers to catch self-hosting bugs early
  • Restructure CI to build native compiler before tests on all platforms (consistent test setup)

Problem

  1. Try/catch self-hosting bug: Commit 5271d53 changed TryStatement from flat fields to a nested struct, relying on inline struct field access working in the native compiler. The two-level chained access tryStmt.catchClause.body silently produced no code in the native binary, so catch blocks were completely empty in the generated LLVM IR.

  2. JSON.parse crash on macOS: json.ts used multiple patterns that the native compiler can't handle reliably on ARM64:

    • JsonInterfaceDef containing { name: string; type: string }[] (array-of-objects) — undefined behavior on ARM64 due to strict alignment
    • Set<string> for dedup tracking — Set internals may have alignment/hashing issues in the native compiler on ARM64
    • let parserIR = ...; parserIR += ...; parserIR += ...; (30+ string += accumulations): each step creates another intermediate heap-allocated C string, and this was the only codegen path that built strings this way instead of calling ctx.emit() line by line
    • name.endsWith("?") / name.slice() — higher-level string methods that may not be fully reliable in the native compiler
  3. Silent test fallback: compiler.test.ts fell back to node dist/chadc-node.js if .build/chadc was missing. CI never had the native binary, so it always tested with the Node.js interpreter, masking both bugs above.

  4. Inconsistent CI setup: Linux glibc CI ran tests before building the native compiler or tree-sitter objects, so it never tested with the native compiler even after fixing the fallback. Only macOS and musl had the native compiler available during tests.
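
To make the first problem concrete, here is a minimal sketch of the two AST shapes. The field names catchClause, catchParam and catchBody come from this PR; the reduced BlockStatement, the tryBlock field, the nullability of param, and both helper functions are assumptions made for illustration and may not match src/ast/types.ts.

```typescript
// Hypothetical, simplified stand-in for the real BlockStatement node.
interface BlockStatement {
  statements: string[];
}

// Nested shape (as introduced in 5271d53): reading the catch body needs the
// two-level access tryStmt.catchClause.body.
interface NestedTryStatement {
  tryBlock: BlockStatement; // assumed field name
  catchClause: { param: string | null; body: BlockStatement } | null;
}

// Flat shape restored by this PR: every field is a single access away.
interface FlatTryStatement {
  tryBlock: BlockStatement; // assumed field name
  catchParam: string | null;
  catchBody: BlockStatement | null;
}

// Convert a nested node to the flat form.
function flattenTry(node: NestedTryStatement): FlatTryStatement {
  const clause = node.catchClause;
  return {
    tryBlock: node.tryBlock,
    catchParam: clause === null ? null : clause.param,
    catchBody: clause === null ? null : clause.body,
  };
}

// Codegen-style consumer that only performs single-level field access.
function catchStatementCount(node: FlatTryStatement): number {
  const body = node.catchBody;
  return body === null ? 0 : body.statements.length;
}
```

With the flat form, a consumer never has to evaluate an a.b.c chain to reach the catch body, which is the access pattern the native compiler mishandled.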

Changes

  • src/ast/types.ts - TryStatement uses catchParam: string | null + catchBody: BlockStatement | null instead of catchClause: { param, body } | null
  • src/codegen/statements/control-flow.ts — access flat fields in codegen
  • src/parser-native/transformer.ts — produce flat fields from tree-sitter parse
  • src/parser-ts/handlers/statements.ts — produce flat fields from TS API parse
  • src/analysis/semantic-analyzer.ts — access flat fields
  • src/ast/visitor.ts — access flat fields
  • src/codegen/infrastructure/closure-analyzer.ts — access flat fields
  • src/codegen/stdlib/json.ts — remove JsonInterfaceDef and getInterfaceFields(), use delegate methods directly; replace Set<string> with string[] + manual lookup; replace string += accumulation with string[] lines pushed individually via pushGlobalString; replace endsWith/slice with charAt/substring
  • scripts/test.js — build native compiler before running tests; re-run compiler tests with Node.js compiler as second pass
  • tests/compiler.test.ts — configurable via CHADC_COMPILER env var, defaults to .build/chadc
  • .github/workflows/ci.yml — all 3 platforms now build tree-sitter objects + native compiler + smoke test before running tests; macOS signs compiler before tests
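
The json.ts change swaps three constructs for lower-level equivalents. The sketch below shows the general shape of each substitution. All function names are hypothetical, and the emitted lines are placeholders rather than the actual json.ts output or real LLVM IR.

```typescript
// Set<string> dedup tracking replaced by a string[] with a manual lookup.
function containsName(seen: string[], name: string): boolean {
  for (let i = 0; i < seen.length; i++) {
    if (seen[i] === name) return true;
  }
  return false;
}

function dedupe(names: string[]): string[] {
  const seen: string[] = [];
  for (const name of names) {
    if (!containsName(seen, name)) seen.push(name);
  }
  return seen;
}

// name.endsWith("?") replaced by a charAt check on the last character.
function isOptionalField(name: string): boolean {
  return name.length > 0 && name.charAt(name.length - 1) === "?";
}

// name.slice(...) replaced by substring.
function stripOptionalMarker(name: string): string {
  return isOptionalField(name) ? name.substring(0, name.length - 1) : name;
}

// `ir += ...` accumulation replaced by one array entry per output line.
function buildLines(fields: string[]): string[] {
  const lines: string[] = [];
  lines.push("; begin parser"); // placeholder text, not real IR
  for (const field of fields) {
    lines.push("; field " + stripOptionalMarker(field));
  }
  lines.push("; end parser");
  return lines;
}
```

Keeping each line as its own array entry means no single string grows across 30+ concatenations, which is the pattern suspected of crashing on ARM64.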

Test plan

  • npm test — 299 native + 161 node passing
  • npm run verify:quick — self-hosting passes
  • All 3 try/catch fixtures produce correct output with native compiler
  • All 3 JSON.parse fixtures compile and run correctly with native compiler
  • JSON.stringify still works (no regression)
  • Missing .build/chadc now throws a clear error instead of silent fallback
  • Both native and Node.js compiler passes run in npm test
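
The fail-loudly behavior can be sketched as a small resolver. CHADC_COMPILER and .build/chadc are taken from this PR; the function name, its injected parameters, and the error message are assumptions, and the real tests/compiler.test.ts may be structured differently.

```typescript
// Hypothetical resolver: choose the compiler under test and never fall back
// silently to another compiler when the default binary is missing.
function resolveCompiler(
  env: Record<string, string | undefined>,
  defaultExists: () => boolean,
): string {
  const override = env["CHADC_COMPILER"];
  if (override !== undefined && override !== "") {
    // An explicit override is used as-is (for example, the Node.js pass).
    return override;
  }
  if (!defaultExists()) {
    throw new Error(".build/chadc not found: build the native compiler before running tests");
  }
  return ".build/chadc";
}
```

In a Node.js test file the two parameters could be wired to process.env and a check such as fs.existsSync(".build/chadc"); they are injected here so the sketch stays self-contained.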

Follow-up: root causes and prevention

These are underlying issues in the native compiler that caused the regressions. Fixing them would prevent future self-hosting bugs from slipping in.

Root cause: string += is unsafe on ARM64

  • The native compiler implements string += by allocating a new C string, copying both halves, and discarding the old one. On ARM64, this crashes after many iterations — likely an alignment or allocator bug in how GC_malloc_atomic interacts with strlen/memcpy for large accumulated strings.
  • TODO: Write a stress-test fixture that does 50+ string += in a loop, verify it passes on both Linux and macOS native compilers. If it crashes, the bug is in the string concatenation codegen itself (src/codegen/types/collections/string/). Fix the underlying IR generation for string concat.
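
A possible shape for that stress-test fixture is sketched below. The function name, the iteration count of 64, and the TEST PASSED / TEST FAILED output convention are assumptions, not the project's actual fixture format.

```typescript
// Hypothetical fixture: 64 string += operations in a loop.
function buildAccumulated(iterations: number): string {
  let s = "";
  for (let i = 0; i < iterations; i++) {
    s += "line;"; // each append adds exactly 5 characters
  }
  return s;
}

const accumulated = buildAccumulated(64);
// 64 appends of 5 characters each should give a length of 320.
console.log(accumulated.length === 320 ? "TEST PASSED" : "TEST FAILED");
```

Checking the final length keeps the fixture independent of any other string method, so a failure points at concatenation itself.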

Root cause: Set may not work correctly on ARM64

  • Set<string> uses a hash table internally. If the hash function or bucket storage has alignment issues in the compiled output, .has() / .add() could return wrong values or crash.
  • TODO: Write a test fixture that creates a Set<string>, adds 10+ entries, and checks .has() / .size. Run with native compiler on macOS to verify.
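
One way such a fixture might look is sketched here. The key list, the function name, and the output convention are assumptions. The keys are literals so the test does not depend on number-to-string conversion.

```typescript
// Hypothetical fixture: add 12 distinct keys to a Set<string>, then verify
// size and membership.
function runSetFixture(): boolean {
  const keys: string[] = [
    "alpha", "beta", "gamma", "delta", "epsilon", "zeta",
    "eta", "theta", "iota", "kappa", "lambda", "mu",
  ];
  const seen = new Set<string>();
  for (const key of keys) {
    seen.add(key);
  }
  seen.add("alpha"); // a duplicate must not grow the set
  if (seen.size !== 12) return false;
  for (const key of keys) {
    if (!seen.has(key)) return false;
  }
  // A key that was never added must be reported as absent.
  return !seen.has("omega");
}

console.log(runSetFixture() ? "TEST PASSED" : "TEST FAILED");
```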

Root cause: array-of-objects field access (arr[i].name) is broken in stage 0

  • Already documented in .claude/rules.md but there's no automated enforcement.
  • TODO: Add a self-hosting lint pass (or a grep-based CI check) that scans src/codegen/ for patterns like [i].fieldName and warns. Alternatively, fix the underlying GEP codegen for array-of-struct element access so arr[i].field works correctly.
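
The grep-based variant could start from a heuristic like the one below. The regular expression, the LintHit type, and the function name are assumptions. It flags an indexed element followed directly by a field access and skips method calls such as arr[i].push(x), so it will still produce some false positives (for example arr[i].length on a string array).

```typescript
// Matches `[index].field` where the field is not immediately called as a method.
const INDEXED_FIELD_ACCESS = /\[\w+\]\.[A-Za-z_]\w*\b(?!\s*\()/;

interface LintHit {
  line: number; // 1-based line number within the scanned source
  text: string; // trimmed source line
}

// Scan one file's contents and return every line that matches the pattern.
function findIndexedFieldAccess(source: string): LintHit[] {
  const hits: LintHit[] = [];
  source.split("\n").forEach((text, index) => {
    if (INDEXED_FIELD_ACCESS.test(text)) {
      hits.push({ line: index + 1, text: text.trim() });
    }
  });
  return hits;
}
```

A CI step could read each file under src/codegen/, pass its contents to this function, and print the hits as warnings rather than failing the build.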

Root cause: chained inline struct field access (a.b.c) silently drops code

  • The native compiler handles a.b fine but a.b.c (two levels of inline struct GEP) silently produces no code.
  • TODO: Fix loadFieldValue in src/codegen/expressions/member.ts to handle chained struct access. Add a test fixture with 2-level and 3-level chained struct access.
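
A fixture for the chained-access case might look like the sketch below. The interface names, values, and output convention are assumptions. Following the wording above, a.b.c counts as a two-level chain and a.b.c.d as a three-level chain.

```typescript
interface Inner {
  value: number;
}

interface Middle {
  inner: Inner;
  tag: string;
}

interface Outer {
  middle: Middle;
  id: number;
}

// Hypothetical fixture: read fields through 1-, 2- and 3-level chains.
function runChainedAccessFixture(): boolean {
  const outer: Outer = { middle: { inner: { value: 42 }, tag: "m" }, id: 7 };
  const id: number = outer.id; // single access (a.b)
  const tag: string = outer.middle.tag; // two-level chain (a.b.c)
  const value: number = outer.middle.inner.value; // three-level chain (a.b.c.d)
  return id === 7 && tag === "m" && value === 42;
}

console.log(runChainedAccessFixture() ? "TEST PASSED" : "TEST FAILED");
```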

Root cause: tests didn't catch self-hosting bugs

  • CI tested with the Node.js interpreter, which runs TypeScript directly and never hits native compiler bugs. Fixed in this PR by requiring the native compiler and running tests with both.
  • TODO: Consider adding a CI job that runs the full self-hosting chain (npm run verify) on all platforms, not just macOS/musl. Currently only build-linux-musl and build-macos run self-hosting.test.ts.

cs01 merged commit e12983a into main on Feb 19, 2026
13 checks passed
cs01 deleted the fix-try-catch-tests branch on February 19, 2026 18:42