feat: add benchmark automation bot [WIP] by andygrove · Pull Request #3557 · apache/datafusion-comet

andygrove · 2026-02-20T17:35:20Z

Summary

I have been testing a version of this code for several weeks and it now works fairly well, so I would like to open-source it for transparency and so that others can help improve it.

This is just a first step. The benchmarks run in k8s in a constrained environment, which is good, but the tests run in Spark local mode. It would be better to deploy a real Spark cluster in k8s later on.

The bot currently assumes that TPC-H 100 GB data already exists on the k8s nodes. It would be better to generate the data using tpchgen-cli directly in the containers. It would also be nice to support different scale factors.

There are likely many other improvements that can be made in the future.

Changes

Adds a GitHub bot (cometbot) that monitors PR comments for slash commands (/run tpch, /run micro, /help) and automatically runs benchmarks in Kubernetes, posting results back as PR comments
Includes a Click CLI for manual benchmark runs, Docker image build/push, K8s job management, and deployment tooling
Adds contributor guide documentation explaining how to trigger benchmarks and how the bot works
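
To illustrate the slash-command handling described above, here is a minimal sketch of how a PR comment body might be parsed. It is not the code in src/cometbot/bot.py: the names `SlashCommand`, `parse_slash_command`, and `KNOWN_BENCHMARKS` are hypothetical, and the rule that a command must start a line is an assumption.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumption: a command must begin a line, e.g. "/run tpch" or "/help".
_COMMAND_RE = re.compile(r"^/(run|help)\b[ \t]*(.*)$", re.MULTILINE)

# Benchmark names taken from the commands listed in the PR description.
KNOWN_BENCHMARKS = {"tpch", "micro"}


@dataclass(frozen=True)
class SlashCommand:
    """Hypothetical parsed form of a bot command."""
    action: str                      # "run" or "help"
    benchmark: Optional[str] = None  # "tpch" or "micro" for "run"
    args: Tuple[str, ...] = ()       # any extra tokens after the benchmark


def parse_slash_command(comment_body: str) -> Optional[SlashCommand]:
    """Return the first command line in a comment, or None if there is none.

    Only the first line that looks like a command is considered; an
    unknown benchmark name is treated as "no command".
    """
    match = _COMMAND_RE.search(comment_body)
    if match is None:
        return None
    action = match.group(1)
    rest = match.group(2).split()
    if action == "help":
        return SlashCommand("help")
    if not rest or rest[0] not in KNOWN_BENCHMARKS:
        return None
    return SlashCommand("run", rest[0], tuple(rest[1:]))
```

Requiring the command at the start of a line keeps quoted or inline mentions such as "please /run tpch" from triggering a benchmark by accident.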

Details

The bot lives in dev/benchmarking-bot/ and includes:

Bot (src/cometbot/bot.py): Polls GitHub for slash commands on open Comet PRs
K8s (src/cometbot/k8s.py): Builds Docker images, creates/manages Kubernetes jobs
CLI (src/cometbot/cli.py): Click-based CLI for manual benchmark runs and bot management
Dockerfile: Container with JDK 17, Rust, Maven, and Spark 3.5 for running benchmarks
K8s templates (k8s/): Job and deployment manifests
Deploy script (deploy/deploy.sh): Automated deployment to a remote host via SSH
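
Because the bot polls rather than receiving webhooks, it has to avoid acting twice on the same comment. The sketch below shows one way that could work, assuming each comment is a dict with `id` and `body` keys. `poll_once`, `fetch_comments`, and `handle` are hypothetical names, and the GitHub API call is left to the caller so the example stays self-contained.

```python
from typing import Callable, Dict, Iterable, Set

# Hypothetical comment shape: {"id": int, "body": str}
Comment = Dict[str, object]


def poll_once(
    fetch_comments: Callable[[], Iterable[Comment]],
    seen_ids: Set[int],
    handle: Callable[[Comment], None],
) -> int:
    """Run one polling pass and return how many new commands were handled.

    fetch_comments is supplied by the caller (for example, a function that
    lists comments on open PRs). seen_ids is mutated in place so repeated
    passes skip comments that were already examined.
    """
    handled = 0
    for comment in fetch_comments():
        comment_id = int(comment["id"])
        if comment_id in seen_ids:
            continue  # already examined on an earlier pass
        seen_ids.add(comment_id)
        # Assumption: only comments that start with "/" are commands.
        if str(comment["body"]).lstrip().startswith("/"):
            handle(comment)
            handled += 1
    return handled
```

In a long-running process, `seen_ids` would need to be persisted or rebuilt on restart; this sketch keeps it in memory only.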

All configuration is via COMETBOT_* environment variables (registry, GitHub token, deploy host, etc.).
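
As a rough sketch of prefix-based configuration, the helper below collects every `COMETBOT_`-prefixed variable into a dict. The function names and the specific variable names used here (`COMETBOT_REGISTRY`, `COMETBOT_GITHUB_TOKEN`) are illustrative assumptions; the PR description names only the categories of settings, not the exact variables.

```python
import os
from typing import Dict, Mapping, Optional

PREFIX = "COMETBOT_"


def load_cometbot_config(
    environ: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Collect COMETBOT_* variables into a dict keyed by lower-cased suffix.

    For example, a (hypothetical) COMETBOT_REGISTRY variable becomes
    config["registry"]. Passing environ explicitly makes testing easier.
    """
    env = os.environ if environ is None else environ
    return {
        name[len(PREFIX):].lower(): value
        for name, value in env.items()
        if name.startswith(PREFIX)
    }


def require(config: Mapping[str, str], key: str) -> str:
    """Return a setting, or fail with the full variable name if it is missing."""
    try:
        return config[key]
    except KeyError:
        raise RuntimeError(
            f"missing required setting {PREFIX}{key.upper()}"
        ) from None
```

Failing early with the full variable name makes a misconfigured deployment easier to diagnose than a later authentication or registry error would.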

…rks on PRs Adds a GitHub bot that monitors PR comments for slash commands (/run tpch, /run micro, /help) and automatically runs benchmarks in Kubernetes, posting results back as PR comments. Includes CLI for manual benchmark runs, Docker image build/push, K8s job management, and deployment tooling. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

andygrove · 2026-02-20T17:55:56Z

@Shekharrajak fyi

andygrove changed the title ~~feat: add benchmark automation bot~~ feat: add benchmark automation bot [WIP] Feb 20, 2026

andygrove force-pushed the benchmark-bot branch 2 times, most recently from 821b893 to 360a448, February 20, 2026 17:37

andygrove force-pushed the benchmark-bot branch 2 times, most recently from 9c00b57 to d43d71c, February 20, 2026 17:43

reduce authorized users

51b5008

andygrove force-pushed the benchmark-bot branch from d43d71c to 51b5008, February 20, 2026 17:52

andygrove wants to merge 2 commits into apache:main from andygrove:benchmark-bot
