🤖 feat: interactive harness init approval flow #1807

ThomasK33 · 2026-01-20T22:23:56Z

Adds an interactive Harness-from-Plan workflow so the Ralph loop runs with an explicit, user-reviewed harness.

What changed

Added a hidden harness-init agent (repo-aware, interactive harness authoring) with UI styling.
Restricted harness-init edits to .mux/harness/*.jsonc via tool-layer allowedEditPaths enforcement.
Restricted harness-init sub-agent spawning to read-only explore tasks.
Updated the Plan tool card’s Start Ralph loop button to switch to harness-init and request a harness proposal.
Added a propose_harness tool + UI card with Approve & Start to start the loop in Exec mode.

Validation

make static-check

📋 Implementation Plan

Interactive Harness-from-Plan (repo-aware + chat-first approval)

Goals

Let the Harness-from-Plan flow do repo-aware, read-only investigation (read files, rg, inspect CI config) before proposing checklist + gates.
Allow the agent to spawn read-only explore sub-agents to answer: “what parts of the repo are affected?” and “what gates/commands exist here?”
Make harness generation interactive (Plan-Mode-like):
- you converse with the agent, it updates the harness, and proposes again
- you manually approve the harness
- approval automatically starts the Ralph loop in the parent workspace

Recommended approach — True inline “Harness Mode” (hidden harness-init agent)

Net LoC (product code): ~400–650

1) Add a hidden harness-init agent (interactive, repo-aware)

Add a new builtin agent ID (e.g. harness-init) that is:
- ui.hidden: true (not selectable in the agent picker / command palette)
- not runnable as a subagent (do not set subagent.runnable: true; add a denylist check in the task tool for defense-in-depth)
UI indicator:
- Add --color-harness-init-mode (and optional hover/alpha variants) in src/browser/styles/globals.css.
- Update src/browser/components/AgentModePicker.tsx to show Harness Init in that color.
- Update src/browser/components/ChatInput/index.tsx so the Send button uses bg-harness-init-mode when the active agent is harness-init.
Update src/node/builtinAgents/harness-from-plan.md and the new harness-init prompt to share the same guidance:
- Repo investigation (read-only):
  - detect task runners + CI entrypoints (e.g., Makefile, justfile, package.json scripts, .github/workflows/*)
  - map plan/change-scope to impacted subsystems by tracing imports/callsites (avoid overfitting to a single file)
- Gate selection philosophy:
  - propose gates that balance coverage vs cost (correctness, robustness, time constraints)
  - include at least one broad-but-cheap gate that matches the repo’s tooling (typecheck/lint/build)
  - add targeted tests/gates for the impacted subsystems
  - if an expensive gate seems warranted, ask before choosing it and suggest cheaper alternatives

2) Constrain edits to harness files (but allow in-place diffs)

Allow file_edit_* tools for harness-init, but enforce a tool-layer allowlist so it can only edit .mux/harness/*.jsonc.
- Implementation: extend ToolConfiguration with allowedEditPaths and enforce it in src/node/services/tools/fileCommon.ts.
- Populate allowedEditPaths in src/node/services/aiService.ts when the active agent is harness-init.
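The allowlist check described above could look roughly like the following. This is a minimal sketch, not the code in fileCommon.ts: the `ToolConfiguration` shape shown here, the helper names (`normalizeRelative`, `patternToRegExp`, `isEditAllowed`), and the rule that `*` matches within a single path segment are all assumptions made for illustration. It only handles POSIX-style, workspace-relative paths.

```typescript
// Hypothetical shape; the real ToolConfiguration has more fields and may differ.
interface ToolConfiguration {
  /** Workspace-relative patterns; undefined means "no restriction". */
  allowedEditPaths?: string[];
}

/** Resolve "." and ".." segments; returns null if the path is absolute or escapes the root. */
function normalizeRelative(p: string): string | null {
  if (p.startsWith("/")) return null;
  const out: string[] = [];
  for (const seg of p.split("/")) {
    if (seg === "" || seg === ".") continue;
    if (seg === "..") {
      if (out.length === 0) return null; // would climb above the workspace root
      out.pop();
    } else {
      out.push(seg);
    }
  }
  return out.join("/");
}

/** Turn a simple pattern into a RegExp; "*" matches any characters except "/". */
function patternToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&")
    .replace(/\*/g, "[^/]*");
  return new RegExp(`^${escaped}$`);
}

/** True when the edit target is permitted under the configured allowlist. */
function isEditAllowed(config: ToolConfiguration, relPath: string): boolean {
  if (config.allowedEditPaths === undefined) return true;
  const normalized = normalizeRelative(relPath);
  if (normalized === null) return false;
  return config.allowedEditPaths.some((p) => patternToRegExp(p).test(normalized));
}

// Usage: harness-init gets a single allowed pattern.
const harnessInitConfig: ToolConfiguration = { allowedEditPaths: [".mux/harness/*.jsonc"] };
console.log(isEditAllowed(harnessInitConfig, ".mux/harness/ws.jsonc"));
console.log(isEditAllowed(harnessInitConfig, ".mux/harness/../../src/x.jsonc"));
```

Normalizing before matching matters here: without it, a path such as `.mux/harness/../../src/x.jsonc` would start with the allowed prefix while pointing outside the harness directory.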
Seed .mux/harness/<workspace>.jsonc when harness-init starts so the agent can make small diffs without re-outputting the whole file.
(Optional but recommended) In propose_harness, assert the working tree has no changes outside .mux/harness/*.jsonc to catch accidental edits (including via bash).
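One way to implement the working-tree assertion is to inspect the output of `git status --porcelain`. The sketch below only covers the parsing step and takes the command output as a string; the function name `pathsOutsideHarness` is hypothetical, and it assumes the short (v1) porcelain format of two status characters, a space, and a path, with renames written as `old -> new`. Quoted paths (names with special characters) are not handled.

```typescript
/**
 * Given `git status --porcelain` output, return every changed path that is
 * not a harness file directly under .mux/harness/.
 */
function pathsOutsideHarness(porcelain: string): string[] {
  const harnessFile = /^\.mux\/harness\/[^/]+\.jsonc$/;
  const offenders: string[] = [];
  for (const line of porcelain.split("\n")) {
    if (line.length < 4) continue; // skip blank or truncated lines
    const entry = line.slice(3); // drop the two status characters and the space
    // Renames and copies list both the old and the new path.
    const paths = entry.includes(" -> ") ? entry.split(" -> ") : [entry];
    for (const p of paths) {
      if (!harnessFile.test(p)) offenders.push(p);
    }
  }
  return offenders;
}

// Usage: an empty result means only harness files were touched.
const sample = " M .mux/harness/ws.jsonc\n M src/a.ts\n";
console.log(pathsOutsideHarness(sample));
```

By default git collapses a fully untracked directory into a single entry such as `?? .mux/`, which this check would flag. Running the command with `--untracked-files=all` lists individual files and avoids that false positive.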

3) Enable explore subagents safely

Allow task + task_await for harness-init.
Tighten src/node/services/tools/task.ts so harness-init can only spawn agentId: "explore".
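The spawn restriction, together with the denylist from step 1, can be sketched as a small guard that the task tool calls before creating a child. The names `SPAWN_ALLOWLIST`, `NEVER_SUBAGENT`, and `checkSpawn` are hypothetical, and treating unlisted parents as unrestricted is an assumption; task.ts may be organized differently.

```typescript
type SpawnCheck = { ok: true } | { ok: false; reason: string };

// Assumption: agents that must never run as a subagent (defense-in-depth denylist).
const NEVER_SUBAGENT: ReadonlySet<string> = new Set(["harness-init"]);

// Assumption: per-parent allowlist; parents not listed here are unrestricted in this sketch.
const SPAWN_ALLOWLIST: ReadonlyMap<string, ReadonlySet<string>> = new Map([
  ["harness-init", new Set(["explore"])],
]);

/** Decide whether `parentAgentId` may spawn a subagent with `requestedAgentId`. */
function checkSpawn(parentAgentId: string, requestedAgentId: string): SpawnCheck {
  if (NEVER_SUBAGENT.has(requestedAgentId)) {
    return { ok: false, reason: `${requestedAgentId} cannot run as a subagent` };
  }
  const allowed = SPAWN_ALLOWLIST.get(parentAgentId);
  if (allowed !== undefined && !allowed.has(requestedAgentId)) {
    return { ok: false, reason: `${parentAgentId} may only spawn: ${[...allowed].join(", ")}` };
  }
  return { ok: true };
}

// Usage
console.log(checkSpawn("harness-init", "explore"));
console.log(checkSpawn("harness-init", "exec"));
```

Returning a reason string instead of throwing lets the tool surface the refusal to the model as an ordinary tool error.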

4) Wire “Start Ralph Loop” to switch agent + send an initiating message

In src/browser/components/tools/ProposePlanToolCall.tsx:
- Replace the direct api.workspace.loop.startFromPlan() call with a Plan-Mode-like transition:
  1. updatePersistedState(getAgentIdKey(workspaceId), "harness-init")
  2. api.workspace.sendMessage({ message: "Generate a Ralph harness from the current plan and propose it" })

5) Add `propose_harness` tool + approval UI (mirrors `propose_plan`)

Backend tool propose_harness:
- validate harness exists + is parseable
- call recordFileState so the UI can detect out-of-band edits
Frontend tool card (e.g. ProposeHarnessToolCall.tsx):
- fetch harness config (prefer api.workspace.harness.get if it already returns enough; otherwise add a dedicated getHarnessContent endpoint)
- show checklist + gates
- provide “Approve & Start Ralph Loop”:
  1. switch agent back to exec
  2. start loop using the existing harness (add a workspace.loop.start endpoint if one doesn’t already exist)
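For the backend validation step, a shape check on the parsed harness might look like the sketch below. The plan only says the harness holds a checklist and gates, so the field names (`title`, `status`, `command`) and the `todo` status are assumptions; the `doing` and `done` statuses come from the follow-up notes in this PR. Parsing the JSONC text into a value is assumed to happen elsewhere with a JSONC-aware parser, since `JSON.parse` rejects comments.

```typescript
type ChecklistStatus = "todo" | "doing" | "done";

// Hypothetical schema for illustration only.
interface HarnessConfig {
  checklist: { title: string; status: ChecklistStatus }[];
  gates: { command: string }[];
}

type Validation = { ok: true; harness: HarnessConfig } | { ok: false; errors: string[] };

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

/** Validate an already-parsed harness value and collect every problem found. */
function validateHarness(value: unknown): Validation {
  if (!isRecord(value)) return { ok: false, errors: ["harness must be an object"] };
  const errors: string[] = [];
  const harness: HarnessConfig = { checklist: [], gates: [] };

  const rawChecklist = value.checklist;
  if (!Array.isArray(rawChecklist)) {
    errors.push("checklist must be an array");
  } else {
    rawChecklist.forEach((item: unknown, i: number) => {
      const title = isRecord(item) ? item.title : undefined;
      const status = isRecord(item) ? item.status : undefined;
      if (
        typeof title === "string" && title.trim() !== "" &&
        (status === "todo" || status === "doing" || status === "done")
      ) {
        harness.checklist.push({ title, status });
      } else {
        errors.push(`checklist[${i}] needs a non-empty title and a valid status`);
      }
    });
  }

  const rawGates = value.gates;
  if (!Array.isArray(rawGates)) {
    errors.push("gates must be an array");
  } else {
    rawGates.forEach((gate: unknown, i: number) => {
      const command = isRecord(gate) ? gate.command : undefined;
      if (typeof command === "string" && command.trim() !== "") {
        harness.gates.push({ command });
      } else {
        errors.push(`gates[${i}] needs a non-empty command`);
      }
    });
  }

  return errors.length === 0 ? { ok: true, harness } : { ok: false, errors };
}

// Usage
console.log(validateHarness({
  checklist: [{ title: "Add tool", status: "todo" }],
  gates: [{ command: "make static-check" }],
}));
```

Collecting all errors rather than stopping at the first gives the agent enough detail to fix the file in one follow-up edit.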

6) Tests

Unit tests:
- allowedEditPaths enforcement (can edit harness; cannot edit other files)
- harness gate allowlist + normalization stays stable
Minimal API test for propose_harness validation.

Alternative (less refactor)

Option — Dedicated “Harness Review” child workspace

Net LoC: ~300–500

Spawn a child harness-review workspace and make the user switch into it to answer questions.
Lower change surface, but worse UX (workspace switching) and harder to make it feel Plan-Mode-like.

Generated with mux • Model: openai:gpt-5.2 • Thinking: high • Cost: $60.54

Adds workspace-local harness config (checklist + gates) and an opt-in Ralph loop runner.
- Backend services: WorkspaceHarnessService, GateRunnerService, GitCheckpointService, LoopRunnerService
- ORPC: workspace.harness + workspace.loop endpoints
- UI: RightSidebar Harness tab + command palette actions for gates/checkpoint/loop
Signed-off-by: Thomas Kosiewski <tk@coder.com>
---
_Generated with `mux` • Model: openai:gpt-5.2 • Thinking: high • Cost: 0.17_
Change-Id: I99428a620b0bd65e9b9a2bb9023b9dd9e0843bc1


Include the workspace plan file path in harness reset/loop bearings summaries.
Signed-off-by: Thomas Kosiewski <tk@coder.com>
---
_Generated with `mux` • Model: `openai:gpt-5.2` • Thinking: `xhigh` • Cost: $47.81_
Change-Id: I89cf61ac2e147042882b58297d0bf9dde49835fd


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 446b377437


src/node/services/loopRunnerService.ts


ThomasK33 · 2026-01-20T23:21:35Z

Follow-ups pushed:

The loop runner now re-reads the latest harness config before marking checklist items as doing/done, which avoids clobbering concurrent edits.
Updated sidebar layout + IPC tests to account for the Harness tab and to reduce Windows timing flakiness.
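The re-read behavior in the first follow-up can be illustrated with a small in-memory sketch. The `InMemoryHarnessStore` class, the `markItem` function, and the harness shape are hypothetical stand-ins, not the actual LoopRunnerService code. The point is that the status update starts from a fresh read rather than from a snapshot taken when the loop iteration began.

```typescript
type Status = "todo" | "doing" | "done";

interface Harness {
  checklist: { id: string; status: Status }[];
  gates: string[];
}

/** Hypothetical stand-in for the on-disk harness file. */
class InMemoryHarnessStore {
  private current: string;
  constructor(initial: Harness) {
    this.current = JSON.stringify(initial);
  }
  read(): Harness {
    return JSON.parse(this.current) as Harness; // always returns a fresh copy
  }
  write(next: Harness): void {
    this.current = JSON.stringify(next);
  }
}

/** Update one item's status on top of the latest stored config. */
function markItem(store: InMemoryHarnessStore, id: string, status: Status): boolean {
  const latest = store.read(); // re-read instead of reusing an older snapshot
  const item = latest.checklist.find((i) => i.id === id);
  if (item === undefined) return false;
  item.status = status;
  store.write(latest);
  return true;
}

// Usage: an edit made after the loop took its snapshot survives the status update.
const store = new InMemoryHarnessStore({
  checklist: [{ id: "a", status: "todo" }],
  gates: ["make static-check"],
});
const staleSnapshot = store.read(); // what the loop saw at iteration start
const edited = store.read();
edited.gates.push("make test"); // concurrent edit by the user or agent
store.write(edited);
markItem(store, "a", "doing");
console.log(store.read().gates.length, staleSnapshot.gates.length);
```

This narrows the window for lost updates but does not eliminate it: a write landing between `read` and `write` inside `markItem` would still be overwritten, so a real implementation may also want file locking or a version check.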

Note: Chromatic "UI Review" / "UI Tests" are still pending (require baseline acceptance).

ThomasK33 added 3 commits January 20, 2026 23:24

🤖 feat: start Ralph loop from plan

9c1aaab

Change-Id: I15d81ab1136b5437df531ba6cb3e23cf84c321a0 Signed-off-by: Thomas Kosiewski <tk@coder.com>

fix: move harness artifacts into .mux/harness

c5224c9

Change-Id: Ide9e2ac1fa93252310350441843ae4d7eaa0ad25 Signed-off-by: Thomas Kosiewski <tk@coder.com>

mintlify bot deployed to staging - docs January 20, 2026 22:24

ThomasK33 added 4 commits January 20, 2026 23:25

🤖 feat: improve Start Ralph loop UX and harness gen

ba33659

Change-Id: I0f684cca69decbe2756577ec54c321ea0e13b182 Signed-off-by: Thomas Kosiewski <tk@coder.com>

🤖 fix: make harness progress file an append-only journal

152d878

Change-Id: Iebbcc21aaa8a919be5e1217c0d44b6cee070d782 Signed-off-by: Thomas Kosiewski <tk@coder.com>

🤖 feat: interactive harness init approval flow

43951c5

Change-Id: Icf5963d92a65300117de0c264272f8ca3952c4e0 Signed-off-by: Thomas Kosiewski <tk@coder.com>

ThomasK33 force-pushed the mux-agents-gmwq branch from 446b377 to 43951c5 January 20, 2026 22:25

mintlify bot deployed to staging - docs January 20, 2026 22:26

chatgpt-codex-connector bot reviewed Jan 20, 2026

src/node/services/loopRunnerService.ts (outdated)

ThomasK33 added 2 commits January 20, 2026 23:42

🤖 tests: stabilize sidebar + bash integration tests

7532d0a

Change-Id: Ie569d9a08cf122c8d7dce626003d1620a6e37bf9 Signed-off-by: Thomas Kosiewski <tk@coder.com>

🤖 fix: preserve harness edits when updating checklist status

7867a5e

Change-Id: I88bf5879b908141790c6119d99f93983071a6b5e Signed-off-by: Thomas Kosiewski <tk@coder.com>
