@DrXiao DrXiao commented Jan 18, 2026

The proposed changes aim to resolve #236 so that we can observe the performance of the bootstrapping process whenever shecc's source code changes.

Benchmarking script

First, a new Python script (tests/bench.py) is introduced to run the bootstrapping process and calculate the average execution time and maximum resident set size (max RSS). The usage of this new script is shown below:

usage: bench.py [-h] [--hostcc {cc,gcc,clang}] [--arch {arm,riscv}] [--dynlink] [--output-json OUTPUT_JSON] [--runs RUNS]

Run benchmarks for shecc

options:
  -h, --help            show this help message and exit
  --hostcc {cc,gcc,clang}
                        Host C Compiler (default: gcc)
  --arch {arm,riscv}    Target architecture (default: arm)
  --dynlink             Enable dynamic linking (default: static linking)
  --output-json OUTPUT_JSON
                        Output JSON file name (default: out/benchmark.json)
  --runs RUNS           Number of runs (default: 5)
$ ./tests/bench.py                      # default configuration          
$ ./tests/bench.py --hostcc clang       # Use clang as host C compiler
$ ./tests/bench.py --runs 10            # Repeat the bootstrapping for 10 runs
$ ./tests/bench.py --arch arm --dynlink # Arm + Dynamic linking
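
The core measurement loop can be sketched in Python roughly as follows. This is a minimal illustration, not the actual tests/bench.py implementation; the build command and the JSON field names here are assumptions:

```python
import resource
import subprocess
import time


def run_once(cmd):
    """Run one bootstrap build, returning (wall-clock seconds, max RSS)."""
    start = time.monotonic()
    subprocess.run(cmd, check=True)
    elapsed = time.monotonic() - start
    # On Linux, ru_maxrss for waited-for children is reported in kilobytes
    # and reflects the peak across all children of this process.
    max_rss_kb = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return elapsed, max_rss_kb


def bench(cmd, runs=5):
    """Repeat the build and aggregate results, similar in spirit to bench.py."""
    samples = [run_once(cmd) for _ in range(runs)]
    return {
        "runs": runs,  # hypothetical field names, not bench.py's real schema
        "avg_time_sec": sum(t for t, _ in samples) / runs,
        "max_rss_kb": max(r for _, r in samples),
    }
```

In the real script, `cmd` would be the bootstrap `make` invocation, and the resulting dictionary would be dumped to the file given by --output-json.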

Combined with make:

$ make bench                        # default configuration
$ make bench CC=clang               # Use clang as host C compiler
$ make bench ARCH=arm DYNLINK=1     # Arm + Dynamic linking
$ make bench BENCH_RUNS=10          # 10 runs
$ make bench \
  BENCH_OUTPUT_JSON=out.json        # Output JSON file name

$ make all-bench
$ make all-bench CC=clang

New workflow

This new workflow is still a work in progress. It will integrate the GitHub Action for Continuous Benchmarking, allowing us to view the statistics in a separate GitHub repository, similar to rv32emu.

The current workflow definition checks whether shecc's source code has changed. If so, the workflow continues with the subsequent steps, installing the dependencies and running the benchmarking script via make.

TODO:

  • Complete the workflow definitions and commit messages.
  • Add a new GitHub repository to store the benchmarking results (e.g., shecc-bench).
    • This may require the maintainer of sysprog21 to create the repository.
    • After the repository is created, another task will be initiated to add a webpage that visualizes the results.

Summary by cubic

Adds a benchmarking script and a CI workflow to measure shecc bootstrapping performance (average time and max RSS) and spot regressions when core code changes. Addresses #236.

  • New Features
    • Added tests/bench.py to run bootstrapping multiple times and output JSON. Supports --hostcc (cc/gcc/clang), --arch (arm/riscv), --dynlink, --runs, --output-json.
    • Added Makefile targets: bench and all-bench, with BENCH_RUNS and BENCH_OUTPUT_JSON for simple config across architectures and link modes.
    • Added GitHub Actions workflow that runs on push/PR/manual, skips merge commits, checks for core source changes, runs on ubuntu-24.04-arm, installs deps, and runs make bench with gcc (static and dynamic). Prints JSON for future continuous benchmarking.
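
For the planned continuous-benchmarking repository, a consumer of the emitted JSON might look like the following sketch. The field names are assumptions for illustration, not taken from bench.py's actual output schema:

```python
import json


def load_benchmark(path):
    """Read one benchmark result file, returning (avg seconds, max RSS in KiB)."""
    with open(path) as f:
        data = json.load(f)
    # Hypothetical field names; adjust to bench.py's real JSON schema.
    return data["avg_time_sec"], data["max_rss_kb"]
```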

Written for commit 96c3fd3.

jserv commented Jan 18, 2026

GitHub Actions provides standard GitHub-hosted runners for Ubuntu on Arm64. This makes it practical to evaluate Arm32 execution on Arm64 using QEMU user-mode emulation, which typically offers better performance than running Arm32 emulation on x86-64 hosts.

@DrXiao force-pushed the benchmark branch 3 times, most recently from 32c4e18 to 6e0b0ba on January 19, 2026 14:19
Since any source code changes may affect the performance of the
bootstrapping process, this commit adds a new Python script to measure
its performance.

The Python script provides the following arguments:
- Host C compiler (--hostcc)
- Target architecture (--arch)
- Enable dynamic linking (--dynlink)
- Number of runs (--runs)
- Output JSON file name (--output-json)

After executing the script, it repeatedly runs the bootstrapping process
for the compiler, calculates the average execution time and maximum
memory usage during the runs, and then stores the results in the
specified file while displaying them to the user via standard output.

Additionally, Makefile also adds two new targets:
- 'bench': run the benchmarking script using the given configuration.
  e.g.:
  $ make bench                        # default configuration
  $ make bench CC=clang               # Use clang as host C compiler
  $ make bench ARCH=arm DYNLINK=1     # Arm + Dynamic linking
  $ make bench BENCH_RUNS=10          # 10 runs
  $ make bench \
    BENCH_OUTPUT_JSON=out.json        # Output json filename

- 'all-bench': run the benchmarking script for the following
  configurations:
  - (ARCH, DYNLINK) = (arm, static)
  - (ARCH, DYNLINK) = (riscv, static)
  - (ARCH, DYNLINK) = (arm, dynamic)

  CC can be optionally determined by the user to use GCC or Clang.
  e.g.:
  $ make all-bench
  $ make all-bench CC=clang
@DrXiao force-pushed the benchmark branch 2 times, most recently from 744e3c7 to 9beb7b1 on January 19, 2026 14:50
DrXiao commented Jan 19, 2026

GitHub Actions provides standard GitHub-hosted runners for Ubuntu on Arm64.

The new workflow has been modified to use arm64 runners. Since no C source files were changed, the workflow does not execute the third and fourth steps.

Reference: result in my repository

DrXiao commented Jan 19, 2026

Because GitHub does not provide RISC-V runners, we can only measure the bootstrapping performance of shecc targeting the RISC-V architecture by running RISC-V emulation on x86-64 hosts.
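
Driving such emulation from Python can be sketched with QEMU's user-mode binary (qemu-riscv32, shipped in the qemu-user package); the binary path passed in is an assumption for illustration:

```python
import shutil
import subprocess


def run_riscv32(binary, args=()):
    """Execute a 32-bit RISC-V ELF on an x86-64 host via QEMU user-mode emulation."""
    qemu = shutil.which("qemu-riscv32")
    if qemu is None:
        raise RuntimeError("qemu-riscv32 not found; install the qemu-user package")
    return subprocess.run([qemu, binary, *args], check=True)
```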

DrXiao commented Jan 20, 2026

I created a new benchmark-test branch based on the benchmark branch to test all steps by removing the if conditions.
-> result in my repository

Run make bench CC=gcc
env printf "ARCH=arm" > .session.mk
==> config: (HOSTCC, ARCH, DYNLINK)=(gcc, arm, static)
==> runs: 5
==> output_json: out/benchmark-gcc-arm-static.json
Running (1/5)...
Running (2/5)...
Running (3/5)...
Running (4/5)...
Running (5/5)...

======================================================================
Benchmark results
Config     : (HOSTCC, ARCH, DYNLINK)=(gcc, arm, static)
Output file: out/benchmark-gcc-arm-static.json
======================================================================
    Average Execution Time        : 6.4354237546 second (5 runs)
    Maximum Resident Set Size     : 346512 KBytes (5 runs)
======================================================================
env printf "ARCH=arm" > .session.mk
==> config: (HOSTCC, ARCH, DYNLINK)=(gcc, arm, dynamic)
==> runs: 5
==> output_json: out/benchmark-gcc-arm-dynamic.json
Running (1/5)...
Running (2/5)...
Running (3/5)...
Running (4/5)...
Running (5/5)...

======================================================================
Benchmark results
Config     : (HOSTCC, ARCH, DYNLINK)=(gcc, arm, dynamic)
Output file: out/benchmark-gcc-arm-dynamic.json
======================================================================
    Average Execution Time        : 4.7024199772000035 second (5 runs)
    Maximum Resident Set Size     : 333264 KBytes (5 runs)
======================================================================

We can observe that the workflow uses Arm64 runners together with QEMU to execute bench.py successfully. The execution results are shown above.

jserv commented Jan 21, 2026

We can observe that the workflow uses Arm64 runners together with QEMU to execute bench.py successfully.

Would it be worthwhile to switch to Arm64-based runners? If so, submit a pull request to implement the transition.

Development

Successfully merging this pull request may close these issues.

Import a GitHub Action for benchmarking purposes

2 participants