Commit ca2367a
🤖 feat: sub-workspaces as subagents (#1219)
Implements “sub-workspaces as subagents” by introducing agent Tasks
backed by child workspaces spawned via the new `task` tool.
- Built-in presets: `research`, `explore`
- Config + UI for max parallel tasks / nesting depth
- Restart-safe orchestration (queueing, report delivery to parent,
auto-resume)
- Explicit reporting via `agent_report` + leaf auto-cleanup
- Sidebar nesting for child workspaces
Validation:
- `make static-check`
- `bun test src/node/services/tools/task.test.ts
src/node/services/taskService.test.ts`
---
<details>
<summary>📋 Implementation Plan</summary>
# 🤖 Sub-workspaces as subagents (Mux)
## Decisions (confirmed)
- **Lifecycle:** auto-delete the subagent workspace after it completes
(and after its child tasks complete).
- **Isolation (runtime-aware):** create subagent workspaces using the
**parent workspace’s runtime** (`runtimeConfig`); prefer
`runtime.forkWorkspace(...)` (when implemented) so the child starts from
the parent’s branch.
- **Results:** when the subagent finishes, it calls `agent_report` and
we post the report back into the parent workspace.
- **Limits (configurable):** max parallel subagents + max nesting depth
(defaults: 3 parallel, depth 3).
- **Durability:** if Mux restarts while tasks are running, tasks resume
and the parent awaits existing tasks (no duplicate spawns).
- **Delegation:** expose a `task` tool so any agent workspace can spawn
sub-agent tasks (depth-limited).
- **Built-in presets (v1):** **Research** + **Explore**.
## Recommended approach: Workspace Tasks *(net +~1700 LoC product code)*
Represent each subagent as a **Task** (as described in `subagents.md`),
implemented as a **child workspace** plus orchestration.
This keeps the v1 scope small while keeping the API surface
*task-shaped* so we can later reuse it for non-agent tasks (e.g.,
background bashes).
### High-level architecture
```mermaid
flowchart TD
Parent["Parent workspace"]
TaskTool["tool: task"]
Spawn["Task.create(parentId, kind=agent, agentType, prompt)"]
Child["Child workspace (agent)"]
ReportTool["tool-call-end: agent_report"]
Report["Append report message to parent history + emit chat event"]
Cleanup["Remove child workspace + delete runtime resources"]
StreamEndNoReport["stream-end (no report)"]
Reminder["Send reminder: toolPolicy requires agent_report"]
Parent --> TaskTool --> Spawn --> Child
Child --> ReportTool --> Report --> Cleanup
Child --> StreamEndNoReport --> Reminder --> Child
```
### Data model
<details>
<summary>Alignment with <code>subagents.md</code> (what we’re
matching)</summary>
- **Agent identity**: Claude’s `agentId` maps cleanly to Mux’s
`workspaceId` for the child workspace.
- **Spawning**: Claude’s `Task(subagent_type=…, prompt=…)` becomes Mux
tool `task`, backed by `Task.create({ parentWorkspaceId, kind: "agent",
agentType, prompt })`.
- **Tool filtering**: Claude’s `tools`/`disallowedTools` maps to Mux’s
existing `toolPolicy` (applied in order).
- **Result propagation**: Agent tasks use an explicit `agent_report`
tool call (child → parent) plus a backend retry if the tool wasn’t
called. (Future: bash tasks can map to existing background bash output,
or be unified behind a `Task.output` API.)
- **Background vs foreground**: `task({ run_in_background: true, ... })`
returns immediately; otherwise the tool blocks until the child calls
`agent_report` (with a timeout).
</details>
Extend workspace metadata with optional fields:
- `parentWorkspaceId?: string` — enables nesting in the UI
- `agentType?: "research" | "explore" | string` — selects an agent
preset
(These are optional so existing configs stay valid.)
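As an illustration of why optional fields keep existing configs valid, here is a minimal sketch. Only `parentWorkspaceId` and `agentType` come from the plan; the interface name, the `id`/`name` fields, and the helper `isAgentTaskWorkspace` are hypothetical.

```typescript
// Hypothetical metadata shape; only the two optional fields are from the plan.
interface WorkspaceMetadataSketch {
  id: string;
  name: string;
  // Set only on child (agent task) workspaces; enables nesting in the UI.
  parentWorkspaceId?: string;
  // Selects an agent preset; built-in values are "research" and "explore".
  agentType?: string;
}

// Assumption: a workspace counts as an agent task when both fields are present.
function isAgentTaskWorkspace(meta: WorkspaceMetadataSketch): boolean {
  return meta.parentWorkspaceId !== undefined && meta.agentType !== undefined;
}

// A pre-existing config entry has neither field and still type-checks.
const legacyWorkspace: WorkspaceMetadataSketch = { id: "ws_1", name: "main" };

const childWorkspace: WorkspaceMetadataSketch = {
  id: "ws_2",
  name: "agent_research_ab12",
  parentWorkspaceId: "ws_1",
  agentType: "research",
};
```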
### Agent presets (built-in)
Create a small registry of agent presets that define:
- `toolPolicy` (enforced)
- `systemPrompt` (preset-defined; can **replace** or append; v1 uses
**replace** so each subagent can fully override the parent’s user
instructions)
Implementation detail: for agent task workspaces, treat the preset’s
`systemPrompt` as the **effective** prompt (internal mode), instead of
always appending to the parent workspace’s system message.
- A required reporting mechanism: the agent must call `agent_report`
exactly once when it has a final answer
Initial presets:
- **Research**: allow `web_search` + `web_fetch` (and optionally
`file_read`), disallow edits.
- **Explore**: allow *read-only* repo exploration (likely `file_read` +
`bash` for `rg`/`git`), disallow file edits.
Both presets should **enable**:
- `task` (so agents can spawn subagents when useful)
- `agent_report` (so leaf tasks have a single, explicit channel for
reporting back)
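A preset registry along these lines could look roughly like the sketch below. The rule shape (`pattern` + `action`), the prompt text, and the `getPreset` helper are assumptions for illustration; Mux's actual `toolPolicy` format may differ.

```typescript
// Hypothetical policy rule shape; not necessarily Mux's real toolPolicy type.
type ToolPolicyRule = { pattern: string; action: "enable" | "disable" };

interface AgentPreset {
  systemPrompt: string; // v1: replaces the parent's user instructions
  toolPolicy: ToolPolicyRule[]; // applied in order; later rules win
}

// Disable everything first, then enable the tools the preset needs.
const allow = (...tools: string[]): ToolPolicyRule[] => [
  { pattern: ".*", action: "disable" },
  ...tools.map((tool): ToolPolicyRule => ({ pattern: tool, action: "enable" })),
];

const AGENT_PRESETS: Record<string, AgentPreset> = {
  research: {
    systemPrompt: "Research the question, then call agent_report exactly once.",
    toolPolicy: allow("web_search", "web_fetch", "file_read", "task", "agent_report"),
  },
  explore: {
    systemPrompt: "Explore the repository read-only, then call agent_report exactly once.",
    toolPolicy: allow("file_read", "bash", "task", "agent_report"),
  },
};

// Returns undefined for unknown agent types so callers can reject them.
function getPreset(agentType: string): AgentPreset | undefined {
  return Object.prototype.hasOwnProperty.call(AGENT_PRESETS, agentType)
    ? AGENT_PRESETS[agentType]
    : undefined;
}
```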
Enforce max nesting depth from settings (default 3) in the backend to
prevent runaway recursion.
> Note: Mux doesn’t currently have a “grep/glob” tool; Explore will
either need `bash`, or we will need to add a safe-search tool later.
---
## Implementation steps
### 1) Schemas + types (IPC boundary)
**Net +~50 LoC**
- Extend:
  - `WorkspaceMetadataSchema` / `FrontendWorkspaceMetadataSchema`
    (`src/common/orpc/schemas/workspace.ts`)
  - `WorkspaceConfigSchema` (`src/common/orpc/schemas/project.ts`)
- Thread the new fields through `WorkspaceMetadata` /
  `FrontendWorkspaceMetadata` types.
### 2) Persist config (workspace tree + task settings)
**Net +~320 LoC**
- Workspace tree fields
  - Ensure config write paths preserve `parentWorkspaceId` and
    `agentType`.
  - Update `Config.getAllWorkspaceMetadata()` to include the new fields
    when constructing metadata.
- Task settings (global; shown in Settings UI)
  - Persist `taskSettings` in `~/.mux/config.json`, e.g.:
    - `maxParallelAgentTasks` (default 3)
    - `maxTaskNestingDepth` (default 3)
- Settings UI
  - Add a small Settings section (e.g. “Tasks”) with two number inputs.
  - Read via `api.config.getConfig()`; persist via
    `api.config.saveConfig()`.
  - Clamp to safe ranges (e.g., parallel 1–10, depth 1–5) and show the
    defaults.
- Task durability fields (per agent task workspace)
  - Persist a minimal task state for child workspaces (e.g., `taskStatus:
    queued|running|awaiting_report`) so we can rehydrate and resume after
    restart.
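The clamping step could be sketched as follows. The field names, defaults, and example ranges (parallel 1–10, depth 1–5) come from the plan; `normalizeTaskSettings` and `clampInt` are hypothetical helpers, and falling back to the default on non-numeric input is an assumption.

```typescript
interface TaskSettings {
  maxParallelAgentTasks: number;
  maxTaskNestingDepth: number;
}

const DEFAULT_TASK_SETTINGS: TaskSettings = {
  maxParallelAgentTasks: 3,
  maxTaskNestingDepth: 3,
};

// Clamp a possibly-invalid value into [min, max]; use the fallback for
// anything that is not a finite number (assumption, not from the plan).
function clampInt(value: unknown, min: number, max: number, fallback: number): number {
  if (typeof value !== "number" || !Number.isFinite(value)) return fallback;
  return Math.min(max, Math.max(min, Math.trunc(value)));
}

// Normalize whatever was read from the config file into safe settings.
function normalizeTaskSettings(raw: { [K in keyof TaskSettings]?: unknown } = {}): TaskSettings {
  return {
    maxParallelAgentTasks: clampInt(
      raw.maxParallelAgentTasks, 1, 10, DEFAULT_TASK_SETTINGS.maxParallelAgentTasks),
    maxTaskNestingDepth: clampInt(
      raw.maxTaskNestingDepth, 1, 5, DEFAULT_TASK_SETTINGS.maxTaskNestingDepth),
  };
}
```

Normalizing on read as well as on save would mean a hand-edited `config.json` cannot push the limits outside the safe ranges.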
### 3) Backend Task API: Task.create
**Net +~450 LoC**
Add a new **task** operation (ORPC + service) that is intentionally
generic:
- `Task.create({ parentWorkspaceId, kind, ... })`
- Return a task-shaped result: `{ taskId, kind, status }`.
V1: implement `kind: "agent"` (sub-workspace agent task):
1. Validate parent workspace exists.
2. Enforce limits from `taskSettings` (configurable):
   - Max nesting depth (`maxTaskNestingDepth`, default 3) by walking the
     `parentWorkspaceId` chain.
   - Max parallel agent tasks (`maxParallelAgentTasks`, default 3) by
     counting running agent tasks globally (across the app).
   - If parallel limit is reached: persist as `status: "queued"` and start
     later (FIFO).
3. Create a new child workspace ID + generated name (e.g.,
   `agent_research_<id>`; must match `[a-z0-9_-]{1,64}`).
4. **Runtime-aware:** create the child workspace using the parent
   workspace’s `runtimeConfig` (Local/Worktree/SSH).
   - Prefer `runtime.forkWorkspace(...)` (when implemented) so the child
     starts from the parent’s branch.
   - Otherwise fall back to `runtime.createWorkspace(...)` with the same
     runtime config (no branch isolation).
5. Write workspace config entry including `{ parentWorkspaceId,
   agentType, taskStatus }`.
6. When the task is started, send the initial prompt message into the
   child workspace.
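Steps 2 and 3 can be sketched as below. The helper names and the `WorkspaceEntry` type are hypothetical. The sketch assumes a top-level workspace has depth 0 and that `maxTaskNestingDepth` bounds the depth of the new child; the plan does not spell out either convention.

```typescript
type WorkspaceEntry = { id: string; parentWorkspaceId?: string };

// Depth = number of parent links between this workspace and its root.
function nestingDepth(workspaceId: string, byId: Map<string, WorkspaceEntry>): number {
  const seen = new Set<string>();
  let depth = 0;
  let current = byId.get(workspaceId);
  while (current !== undefined) {
    const parentId = current.parentWorkspaceId;
    if (parentId === undefined) break;
    // Guard against a corrupted config that contains a parent cycle.
    if (seen.has(current.id)) throw new Error(`Cycle in workspace tree at ${current.id}`);
    seen.add(current.id);
    depth += 1;
    current = byId.get(parentId);
  }
  return depth;
}

// A child of `parentId` would sit one level below the parent.
function canSpawnChild(parentId: string, byId: Map<string, WorkspaceEntry>, maxDepth: number): boolean {
  return nestingDepth(parentId, byId) + 1 <= maxDepth;
}

// Generated names must match [a-z0-9_-]{1,64}: lowercase, replace anything
// outside the allowed set, and truncate.
function childWorkspaceName(agentType: string, id: string): string {
  return `agent_${agentType}_${id}`
    .toLowerCase()
    .replace(/[^a-z0-9_-]/g, "_")
    .slice(0, 64);
}
```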
Durability / restart:
- On app startup, rehydrate queued/running tasks from config and resume
  them:
  - queued tasks are scheduled respecting `maxParallelAgentTasks`
  - running tasks get a synthetic “Mux restarted; continue + call
    agent_report” message.
- Parent await semantics (restart-safe):
  - While a parent workspace has any descendant agent tasks in
    `queued|running|awaiting_report`, treat it as “awaiting” and avoid
    starting new subagent tasks from it.
  - When the final descendant task reports, automatically resume any
    parent partial stream that was waiting on the `task` tool call.
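The FIFO scheduling used both at runtime and on rehydration might look like this sketch. `TaskRecord`, `queuedAt`, and `startQueuedTasks` are hypothetical names, and counting `awaiting_report` tasks as occupying a slot is an assumption (the plan only says "running").

```typescript
type TaskStatus = "queued" | "running" | "awaiting_report" | "reported";

interface TaskRecord {
  taskId: string;
  status: TaskStatus;
  queuedAt: number; // hypothetical timestamp used for FIFO ordering
}

// Promote queued tasks to running, oldest first, until the global
// parallel limit is reached. Returns the IDs that were started.
function startQueuedTasks(tasks: TaskRecord[], maxParallel: number): string[] {
  const active = tasks.filter(
    (t) => t.status === "running" || t.status === "awaiting_report",
  ).length;
  let freeSlots = Math.max(0, maxParallel - active);

  const queued = tasks
    .filter((t) => t.status === "queued")
    .sort((a, b) => a.queuedAt - b.queuedAt);

  const started: string[] = [];
  for (const task of queued) {
    if (freeSlots === 0) break;
    task.status = "running"; // a real implementation would persist this first
    freeSlots -= 1;
    started.push(task.taskId);
  }
  return started;
}
```

Calling the same function after every task completion and once at startup would give restart-safe queueing without a separate code path.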
Design note: keep the return type “task-shaped” (e.g., `{ taskId, kind,
status }`) so we can later add `kind: "bash"` tasks that wrap existing
background bashes.
### 4) Tool: `task` (agents can spawn sub-agents)
**Net +~250 LoC**
Expose a Claude-like `Task` tool to the LLM (but backed by Mux
workspaces):
- Tool: `task`
  - Input (v1): `{ subagent_type: string, prompt: string, description?:
    string, run_in_background?: boolean }`
  - Behavior:
    - Spawn (or enqueue) a child agent task via `Task.create({
      parentWorkspaceId: <current workspaceId>, kind: "agent", agentType:
      subagent_type, prompt, ... })`.
    - If `run_in_background` is true: return immediately `{ status: "queued"
      | "running", taskId }`.
    - Otherwise: block (potentially across queue + execution) until the
      child calls `agent_report` (or timeout) and return `{ status:
      "completed", reportMarkdown }`.
    - Durability: if this foreground wait is interrupted (app restart), the
      child task continues; when it reports, we persist the tool output into
      the parent message and auto-resume the parent stream.
  - Wire-up: add to `TOOL_DEFINITIONS` + register in `getToolsForModel()`;
    inject `taskService` into ToolConfiguration so the tool can call
    `Task.create`.
- Guardrails
  - Enforce `maxTaskNestingDepth` and `maxParallelAgentTasks` from
    settings (defaults: depth 3, parallel 3).
  - If parallel limit is reached, new tasks are queued and the parent
    blocks/awaits until a slot is available.
  - Disallow spawning new tasks after the workspace has called
    `agent_report`.
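The foreground wait (block until `agent_report` or timeout) could be modeled as an in-memory waiter registry, sketched below. `ReportWaiters`, `waitForReport`, and `deliver` are hypothetical names. This covers only the in-process path; the restart-safe path described above relies on persisted state instead.

```typescript
type AgentReport = { reportMarkdown: string; title?: string };

class ReportWaiters {
  private readonly waiters = new Map<string, (report: AgentReport) => void>();

  // Resolves when the child reports; rejects if the timeout fires first.
  waitForReport(taskId: string, timeoutMs: number): Promise<AgentReport> {
    return new Promise<AgentReport>((resolve, reject) => {
      const timer = setTimeout(() => {
        this.waiters.delete(taskId);
        reject(new Error(`Timed out waiting for agent_report from ${taskId}`));
      }, timeoutMs);

      this.waiters.set(taskId, (report) => {
        clearTimeout(timer);
        this.waiters.delete(taskId);
        resolve(report);
      });
    });
  }

  // Called from the agent_report handler. Returns false when nobody is
  // waiting in-process (e.g. a background task, or after an app restart).
  deliver(taskId: string, report: AgentReport): boolean {
    const waiter = this.waiters.get(taskId);
    if (waiter === undefined) return false;
    waiter(report);
    return true;
  }
}
```

When `deliver` returns `false`, the orchestrator would fall back to the durable path: write the tool output into the parent's partial message and resume the parent stream.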
### 5) Enforce preset tool policy + system prompt
**Net +~130 LoC**
In the backend send/stream path:
- Compute an effective tool policy:
  - `effectivePolicy = [...(options.toolPolicy ?? []), ...presetPolicy]`
  - Apply *presetPolicy last* so callers cannot re-enable restricted
    tools.
- System prompt strategy for agent task workspaces (per preset):
  - **Replace (default):** ignore the parent workspace’s user instructions
    and use the preset’s `systemPrompt` as the effective instructions
    (internal-only agent mode).
    - Implementation: add an internal system-message variant (e.g.,
      `"agent"`) that starts from an empty base prompt (no user custom
      instructions), then apply `preset.systemPrompt`.
  - **Append (optional):** keep the normal workspace system message and
    append preset instructions.
- Ensure the preset prompt covers:
  - When/how to delegate via the `task` tool (available `subagent_type`s).
  - When/how to call `agent_report` (final answer only; after any spawned
    tasks complete).
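To see why applying the preset policy last prevents callers from re-enabling restricted tools, consider this sketch. The rule shape and the last-match-wins evaluator `isToolEnabled` are assumptions for illustration, as is the default of "enabled" when no rule matches.

```typescript
// Hypothetical rule shape; Mux's real toolPolicy type may differ.
type PolicyRule = { pattern: string; action: "enable" | "disable" };

// Rules are applied in order, so the last matching rule decides.
function isToolEnabled(tool: string, policy: PolicyRule[]): boolean {
  let enabled = true; // assumption: tools are enabled unless a rule says otherwise
  for (const rule of policy) {
    if (new RegExp(`^(?:${rule.pattern})$`).test(tool)) {
      enabled = rule.action === "enable";
    }
  }
  return enabled;
}

// Same merge order as in the plan: caller rules first, preset rules last.
function mergePolicies(callerPolicy: PolicyRule[] | undefined, presetPolicy: PolicyRule[]): PolicyRule[] {
  return [...(callerPolicy ?? []), ...presetPolicy];
}

const presetPolicy: PolicyRule[] = [{ pattern: "file_edit_.*", action: "disable" }];
const callerPolicy: PolicyRule[] = [{ pattern: "file_edit_.*", action: "enable" }];

const merged = mergePolicies(callerPolicy, presetPolicy);
const editAllowed = isToolEnabled("file_edit_replace", merged); // false: the preset rule wins
```

With the opposite order (`[...presetPolicy, ...callerPolicy]`), the caller's `enable` rule would match last and the restriction would be lost.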
### 6) Auto-report back + auto-delete (orchestrator)
**Net +~450 LoC**
Add a small reporting tool + orchestrator that ensures the child reports
back explicitly, and make it durable across restarts.
- Tool: `agent_report`
  - Input: `{ reportMarkdown: string, title?: string }` (or similar)
  - Execution: no side effects; return `{ success: true }` (the backend
    uses the tool-call args as the report payload)
  - Wire-up: add to `TOOL_DEFINITIONS` + register in `getToolsForModel()`
    as a non-runtime tool
- Orchestrator behavior
  - Primary path: handle `tool-call-end` for `agent_report`
    1. Validate `workspaceId` is an agent task workspace and has
       `parentWorkspaceId`.
    2. Persist completion (durable):
       - Update child workspace config: `taskStatus: "reported"` (+
         `reportedAt`).
    3. Deliver report to the parent (durable):
       - Append an assistant message to the parent workspace history (so the
         user can read the report).
       - If the parent has a partial assistant message containing a pending
         `task` tool call, update that tool part from `input-available` →
         `output-available` with `{ reportMarkdown, title }` (like the
         `ask_user_question` restart-safe fallback).
       - Emit `tool-call-end` + `workspace.onChat` events so the UI updates
         immediately.
    4. Auto-resume the parent (durable tool call semantics):
       - If the parent has a partial message and **no active stream**, call
         `workspace.resumeStream(parent)` after writing the tool output.
       - Only auto-resume once the parent has **no remaining running descendant
         tasks** (so it doesn’t spawn duplicates).
    5. Cleanup:
       - If the task has no remaining child tasks, delete the workspace +
         runtime resources (branch/worktree if applicable).
       - Otherwise, mark it pending cleanup and delete it once its subtree is
         gone.
  - Enforcement path: if a stream ends without an `agent_report` call
    1. Send a synthetic "please report now" message into the child workspace
       with a toolPolicy that **requires only** `agent_report`.
    2. If still missing after one retry, fall back to posting the child's
       final text parts (last resort) and clean up to avoid hanging
       sub-workspaces.
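The enforcement path reduces to a small decision function, sketched here. The state fields (`reported`, `remindersSent`, `finalText`) and the action names are hypothetical; only the remind-once-then-fall-back behavior comes from the plan.

```typescript
interface ChildStreamState {
  reported: boolean; // agent_report was called during this task
  remindersSent: number; // synthetic "please report now" messages so far
  finalText: string; // concatenated final text parts of the last stream
}

type StreamEndAction =
  | { kind: "none" }
  | { kind: "send_reminder" }
  | { kind: "fallback_report"; reportMarkdown: string };

// Decide what the orchestrator does when a child stream ends.
function onChildStreamEnd(state: ChildStreamState): StreamEndAction {
  // Primary path already handled the report via tool-call-end.
  if (state.reported) return { kind: "none" };
  // First miss: remind the child, with a policy that requires agent_report.
  if (state.remindersSent === 0) return { kind: "send_reminder" };
  // Still missing after one retry: post the final text as a last resort.
  return { kind: "fallback_report", reportMarkdown: state.finalText };
}
```

Persisting `remindersSent` alongside `taskStatus` would keep the retry count correct across restarts.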
### 7) UI: nested sidebar rows
**Net +~100 LoC**
- Update sorting/rendering so child workspaces appear directly below the
parent with indentation.
- Add a small `depth` prop to `WorkspaceListItem` and adjust left
padding.
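One way to produce the nested ordering is a depth-first flatten that emits each parent immediately before its children, as in this sketch. `flattenWorkspaceTree` and the row shape are hypothetical; treating a workspace whose parent is missing as a root is an assumption.

```typescript
interface SidebarWorkspace {
  id: string;
  parentWorkspaceId?: string;
}

interface SidebarRow {
  id: string;
  depth: number; // drives the left padding of WorkspaceListItem
}

function flattenWorkspaceTree(workspaces: SidebarWorkspace[]): SidebarRow[] {
  const ids = new Set(workspaces.map((w) => w.id));
  const childrenOf = new Map<string, SidebarWorkspace[]>();
  const roots: SidebarWorkspace[] = [];

  for (const w of workspaces) {
    const parentId = w.parentWorkspaceId;
    if (parentId !== undefined && ids.has(parentId)) {
      const siblings = childrenOf.get(parentId) ?? [];
      siblings.push(w);
      childrenOf.set(parentId, siblings);
    } else {
      roots.push(w); // top-level workspace, or orphan whose parent is gone
    }
  }

  // Depth-first walk keeps every child directly below its parent.
  const rows: SidebarRow[] = [];
  const visit = (w: SidebarWorkspace, depth: number): void => {
    rows.push({ id: w.id, depth });
    for (const child of childrenOf.get(w.id) ?? []) visit(child, depth + 1);
  };
  for (const root of roots) visit(root, 0);
  return rows;
}
```

The input order of roots and siblings is preserved, so any existing sort can be applied before flattening.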
### 8) No user-facing launcher (agent-orchestrated only)
**Net +~0 LoC**
- Do **not** add slash commands / command palette actions for spawning
tasks.
- Tasks are launched exclusively via the model calling the `task` tool
from the parent workspace.
### 9) Tests
**~200 LoC tests (not counted in product LoC estimate)**
- Unit test: workspace tree flattening preserves parent→child adjacency.
- Unit/integration test: `task` tool spawns/enqueues a child agent task
and enforces `maxTaskNestingDepth`.
- Unit/integration test: queueing respects `maxParallelAgentTasks`
(extra tasks stay queued until a slot frees).
- Unit/integration test: `agent_report` posts report to parent, updates
waiting `task` tool output (restart-safe), and triggers cleanup (and
reminder path when missing).
- Unit test: toolPolicy merge guarantees presets can’t be overridden.
<details>
<summary>Follow-ups (explicitly out of scope for v1)</summary>
- More presets (Review, Writer). “Writer” likely needs
**non-auto-delete** so the branch/diff persists.
- `Task.create(kind: "bash")` tasks that wrap existing background bashes
(and optionally render under the parent like agent tasks).
- Safe “code search” tools (Glob/Grep) to avoid granting `bash` to
Explore.
- Deeper nesting UX (collapse/expand, depth cap visuals).
</details>
</details>
---
_Generated with `codex cli` • Model: `gpt-5.2` • Thinking: `xhigh`_
<!-- mux-attribution: model=gpt-5.2 thinking=xhigh -->
---------
Signed-off-by: Thomas Kosiewski <tk@coder.com>

1 parent 32a9d27 commit ca2367a
File tree
76 files changed, +7103 -85 lines changed
- .storybook/mocks
- src
- browser
- components
- ChatInput
- Settings
- sections
- contexts
- hooks
- stories
- utils
- messages
- ui
- cli
- common
- constants
- orpc
- schemas
- types
- utils
- ai
- thinking
- tools
- ui
- desktop
- node
- orpc
- runtime
- services
- tools
- tests/ipc