feat: add benchmark automation bot [WIP] #3557

Draft
andygrove wants to merge 2 commits into apache:main from andygrove:benchmark-bot

andygrove (Member) commented Feb 20, 2026

Closes #3556

Summary

I have been testing a version of this code for several weeks and it now seems to work fairly well, so I would like to get it into OSS for transparency and to allow others to help improve it.

This is just a first step. The benchmarks run in k8s in a constrained environment, which is good, but they run in Spark local mode. It would be better to deploy a real Spark cluster in k8s later on.

There is currently an assumption that TPC-H 100GB data already exists on the k8s nodes. It would be better to generate the data using tpchgen-cli directly in the containers. It would also be nice to support different scale factors.

There are likely many other improvements that can be made in the future.

Changes

  • Adds a GitHub bot (cometbot) that monitors PR comments for slash commands (/run tpch, /run micro, /help) and automatically runs benchmarks in Kubernetes, posting results back as PR comments
  • Includes a Click CLI for manual benchmark runs, Docker image build/push, K8s job management, and deployment tooling
  • Adds contributor guide documentation explaining how to trigger benchmarks and how the bot works
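
To illustrate the slash-command handling described above, here is a minimal sketch. It is not the code from this PR: the function name `parse_command` and the returned tuple shape are assumptions made for illustration, and only the command strings (/run tpch, /run micro, /help) come from the description.

```python
from typing import Optional, Tuple

# Targets named in the PR description; everything else here is hypothetical.
RUN_TARGETS = {"tpch", "micro"}


def parse_command(comment_body: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (action, target) for the first slash command found, else None."""
    for line in comment_body.splitlines():
        tokens = line.strip().split()
        if not tokens:
            continue  # skip blank lines
        if tokens[0] == "/help":
            return ("help", None)
        if tokens[0] == "/run" and len(tokens) >= 2 and tokens[1] in RUN_TARGETS:
            return ("run", tokens[1])
    return None
```

In this sketch a comment only counts as a command when a line starts with the slash command itself, so ordinary discussion that happens to mention "run tpch" is ignored.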

Details

The bot lives in dev/benchmarking-bot/ and includes:

  • Bot (src/cometbot/bot.py): Polls GitHub for slash commands on open Comet PRs
  • K8s (src/cometbot/k8s.py): Builds Docker images, creates/manages Kubernetes jobs
  • CLI (src/cometbot/cli.py): Click-based CLI for manual benchmark runs and bot management
  • Dockerfile: Container with JDK 17, Rust, Maven, and Spark 3.5 for running benchmarks
  • K8s templates (k8s/): Job and deployment manifests
  • Deploy script (deploy/deploy.sh): Automated deployment to a remote host via SSH
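
As a rough picture of what one polling pass might involve, the sketch below filters out comments that were already handled and keeps only those that look like slash commands. It is written under assumptions: the comment shape (`id`, `body`), the `poll_once` name, and the injected `fetch_comments` callable are all hypothetical, and the actual GitHub client used in bot.py is not shown in this description.

```python
from typing import Callable, Dict, Iterable, List, Set

# Hypothetical comment shape: {"id": int, "body": str}.
Comment = Dict[str, object]


def poll_once(fetch_comments: Callable[[], Iterable[Comment]],
              seen_ids: Set[int]) -> List[Comment]:
    """Return unseen comments whose body starts with '/', marking all as seen."""
    new_commands = []
    for comment in fetch_comments():
        comment_id = comment["id"]
        if comment_id in seen_ids:
            continue  # already processed in an earlier pass
        seen_ids.add(comment_id)
        if str(comment["body"]).lstrip().startswith("/"):
            new_commands.append(comment)
    return new_commands
```

Passing the fetch function in as a parameter keeps the sketch runnable without network access; a real bot would presumably call the GitHub API there and persist the seen IDs between restarts.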

All configuration is via COMETBOT_* environment variables (registry, GitHub token, deploy host, etc.).
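
One way to read such settings is to collect every variable that carries the prefix. The sketch below assumes nothing about the actual variable names in this PR: `load_config` is a hypothetical helper, and `COMETBOT_EXAMPLE` in the usage note is a placeholder rather than a real setting.

```python
import os
from typing import Dict, Mapping, Optional

PREFIX = "COMETBOT_"


def load_config(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Collect COMETBOT_* variables into a dict keyed by the lowercased suffix."""
    source = os.environ if env is None else env
    return {
        key[len(PREFIX):].lower(): value
        for key, value in source.items()
        # Ignore unrelated variables and a bare "COMETBOT_" with no suffix.
        if key.startswith(PREFIX) and len(key) > len(PREFIX)
    }
```

For example, `load_config({"COMETBOT_EXAMPLE": "x", "PATH": "/usr/bin"})` yields a dict with the single key `example`.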

@andygrove changed the title from "feat: add benchmark automation bot" to "feat: add benchmark automation bot [WIP]" Feb 20, 2026
@andygrove force-pushed the benchmark-bot branch 2 times, most recently from 821b893 to 360a448, February 20, 2026 17:37
…rks on PRs

Adds a GitHub bot that monitors PR comments for slash commands (/run tpch,
/run micro, /help) and automatically runs benchmarks in Kubernetes, posting
results back as PR comments. Includes CLI for manual benchmark runs, Docker
image build/push, K8s job management, and deployment tooling.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@andygrove force-pushed the benchmark-bot branch 2 times, most recently from 9c00b57 to d43d71c, February 20, 2026 17:43

andygrove (Member, Author) commented

@Shekharrajak fyi

Development

Successfully merging this pull request may close these issues.

Add benchmark automation code to the repo
