EntityProcess · christso · Jan 9, 2026 · Jan 9, 2026
diff --git a/openspec/changes/add-opencode-log-streaming/design.md b/openspec/changes/add-opencode-log-streaming/design.md
@@ -0,0 +1,117 @@
+## Context
+AgentV currently supports several “agentic” providers (e.g. Codex, Pi coding agent, VS Code subagent) that can execute multi-step work with tool calls.
+
+OpenCode is an agent runtime that exposes a local HTTP API plus Server-Sent Events (SSE) for streaming events. It also has a first-party TypeScript SDK (`@opencode-ai/sdk`) that can spawn a local `opencode serve` process and provides typed client methods.
+
+This change expands the existing OpenSpec proposal from “OpenCode stream logs” to a full OpenCode provider integration for AgentV.
+
+## Goals / Non-Goals
+
+Goals:
+- Add a new AgentV provider kind: `opencode`.
+- Support running OpenCode against a per-eval-case working directory (AgentV temp workspace) so the agent can read/write files.
+- Produce a `ProviderResponse.outputMessages` trace that captures:
+  - The final assistant message text.
+  - Tool calls (name + input + output) in a deterministic shape suitable for trace-based evaluators like `tool_trajectory`.
+- Provide optional per-run streaming log artifacts on disk and publish log paths so the CLI can show them early (Codex/Pi pattern).
+
+Non-Goals:
+- Full UI/interactive experiences (OpenCode TUI, rich streaming token output in the AgentV terminal).
+- Implementing every OpenCode event type as a first-class AgentV trace event.
+- Distributed / remote OpenCode deployments that require auth beyond local process execution.
+
+## Key Decisions
+
+- **Use OpenCode’s first-party SDK v2** (`@opencode-ai/sdk/v2`) rather than implementing a custom HTTP + SSE client.
+  - Rationale: typed API surface, server lifecycle helper, fewer protocol footguns.
+
+- **Primary completion signal:** use `client.session.prompt(...)` to run the request and treat its response as authoritative for the final assistant message and parts.
+  - Streaming SSE is used for logs and (optionally) richer incremental trace capture.
+
+- **Working directory isolation:** execute each eval case attempt in its own filesystem directory (AgentV temp workspace). The OpenCode client MUST include the directory context so OpenCode operates within that directory.
+  - Rationale: reproducibility, parallelism, and preventing cross-contamination between eval cases.
+
+## Provider Lifecycle
+
+### Initialization
+- Resolve target settings (binary/executable path, server config, model selection, permissions behavior, logging options).
+- Start a local OpenCode server if no `baseUrl` is configured.
+  - Prefer a per-process server instance (shared by provider invocations) to reduce spawn overhead.
+  - The provider MUST avoid port collisions under parallel workers (either choose an ephemeral port, or allocate from a safe range).
+
+### Per-eval invocation
+For each `ProviderRequest`:
+1. Create/resolve the eval-case work directory (temp workspace).
+2. Create or reuse an OpenCode `sessionID` scoped to that directory.
+3. If streaming logs are enabled, open the stream log file, subscribe via `client.event.subscribe({ directory })`, and write each received event as one JSONL line.
+4. Send the prompt using `client.session.prompt({ sessionID, directory, system, parts, model?, agent?, tools? })`.
+5. Build `ProviderResponse` from the returned `parts` (and optionally from gathered SSE events).
+6. Tear down the SSE subscription for this invocation; keep the server alive for other requests.
+
+### Shutdown
+- Ensure spawned server processes are terminated on completion or abort.
+
+## Prompt & Message Mapping
+
+### Inputs
+AgentV provides:
+- `question` (formatted question string)
+- optional `systemPrompt`
+- optional `guidelines` (unwrapped content for non-agent providers)
+- optional `guideline_files` / `input_files` (paths, often represented as `file://` links for agent providers)
+- optional `chatPrompt` (multi-message)
+
+Mapping approach:
+- Prefer using `chatPrompt` when present.
+  - Convert AgentV roles into OpenCode `system` + `parts`.
+  - Include the final user query as a `text` part.
+- For filesystem-capable agent providers (including OpenCode), prefer referencing guideline and attachment files as file links rather than embedding large inline content.
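A minimal sketch of the `chatPrompt` mapping described above. The `ChatMessage`, `TextPart`, and `OpenCodePrompt` shapes and the `toOpenCodePrompt` name are hypothetical stand-ins, not AgentV or SDK types, and labelling earlier assistant turns inline is an assumption of this sketch rather than a decision made in the design.

```typescript
// Hypothetical shapes: AgentV's real chatPrompt type and OpenCode's part types may differ.
type ChatRole = "system" | "user" | "assistant";
interface ChatMessage { role: ChatRole; content: string }
interface TextPart { type: "text"; text: string }
interface OpenCodePrompt { system?: string; parts: TextPart[] }

function toOpenCodePrompt(chatPrompt: ChatMessage[]): OpenCodePrompt {
  // All system messages collapse into the single `system` string.
  const systemText = chatPrompt
    .filter((m) => m.role === "system")
    .map((m) => m.content)
    .join("\n\n");

  // Remaining turns become text parts in order, so the final user query is the last part.
  // Assumption: earlier assistant turns are labelled inline because everything is sent as one prompt.
  const parts: TextPart[] = chatPrompt
    .filter((m) => m.role !== "system")
    .map((m) => ({
      type: "text" as const,
      text: m.role === "assistant" ? `[assistant] ${m.content}` : m.content,
    }));

  return systemText.length > 0 ? { system: systemText, parts } : { parts };
}
```

File references stay as `file://` links inside the text, so large guideline files are never inlined.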
+
+### Outputs
+OpenCode returns an assistant message with `parts` including:
+- `text` (assistant text)
+- `reasoning` (optional)
+- `tool` parts with `callID`, `tool`, and `state` (pending/running/completed/error)
+
+AgentV output mapping:
+- Construct a single final `OutputMessage` with:
+  - `role: "assistant"`
+  - `content: <concatenated assistant text parts>`
+  - `toolCalls: ToolCall[]` derived from `tool` parts:
+    - `id` = OpenCode `callID`
+    - `tool` = OpenCode `tool`
+    - `input` = `state.input` when present
+    - `output` = `state.output` when present (completed calls only)
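The output mapping above could look roughly like this. The part and state shapes are simplified, hypothetical versions of what the SDK returns; in particular the `status` field name on `state` is an assumption, as is joining text parts with no separator.

```typescript
// Hypothetical, simplified shapes; the SDK's generated part types are richer than this.
type ToolState = {
  status: "pending" | "running" | "completed" | "error"; // field name is an assumption
  input?: unknown;
  output?: unknown;
};
type ResponsePart =
  | { type: "text"; text: string }
  | { type: "reasoning"; text: string }
  | { type: "tool"; callID: string; tool: string; state: ToolState };

interface ToolCall { id: string; tool: string; input?: unknown; output?: unknown }
interface OutputMessage { role: "assistant"; content: string; toolCalls: ToolCall[] }

function toOutputMessage(parts: ResponsePart[]): OutputMessage {
  const textChunks: string[] = [];
  const toolCalls: ToolCall[] = [];
  for (const part of parts) {
    if (part.type === "text") {
      textChunks.push(part.text);
    } else if (part.type === "tool") {
      const call: ToolCall = { id: part.callID, tool: part.tool };
      if (part.state.input !== undefined) call.input = part.state.input;
      // Output is only copied for completed calls.
      if (part.state.status === "completed" && part.state.output !== undefined) {
        call.output = part.state.output;
      }
      toolCalls.push(call);
    }
    // Reasoning parts are intentionally dropped in this initial mapping.
  }
  return { role: "assistant", content: textChunks.join(""), toolCalls };
}
```

Tool calls keep the order in which the parts were returned, which is what a trajectory-style evaluator would compare against.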
+
+Optionally (future): emit separate `OutputMessage` entries for tool results, reasoning, or step boundaries. This is not required for initial tool-trajectory support.
+
+## Streaming Logs
+
+### Log content
+- Default format: JSONL where each line is a single OpenCode SSE event object.
+- MAY additionally include human-readable “summary” lines, but JSON objects MUST be preserved to keep tooling stable.
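To illustrate why mixing summary lines with JSON objects can stay tooling-safe, here is a small sketch of a writer and a tolerant reader. Both function names are hypothetical, and the `# summary` line format in the usage note is only an example, not a proposed convention.

```typescript
// One SSE event object in, one JSONL line out. JSON.stringify escapes newlines
// inside string values, so each event always stays on a single line.
function toJsonlLine(sseEvent: Record<string, unknown>): string {
  return JSON.stringify(sseEvent) + "\n";
}

// Reads a log back, skipping blank lines and any non-JSON summary lines.
function parseJsonl(logText: string): Record<string, unknown>[] {
  const events: Record<string, unknown>[] = [];
  for (const line of logText.split("\n")) {
    if (line.trim() === "") continue;
    try {
      const parsed: unknown = JSON.parse(line);
      if (typeof parsed === "object" && parsed !== null && !Array.isArray(parsed)) {
        events.push(parsed as Record<string, unknown>);
      }
    } catch {
      // Human-readable summary line: ignore it.
    }
  }
  return events;
}
```

A reader written this way recovers every event object even if a line such as `# summary: permission asked` is interleaved.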
+
+### Log path publication
+- When the provider selects a log file path, it publishes `{ filePath, targetName, evalCaseId?, attempt? }` to a process-local tracker.
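One possible shape for the process-local tracker, following the record/consume/subscribe split listed in the tasks. `createLogTracker` and its method names are hypothetical; the actual exports are not defined in this change.

```typescript
// Hypothetical tracker; names are illustrative, not AgentV's actual exports.
interface LogEntry { filePath: string; targetName: string; evalCaseId?: string; attempt?: number }
type LogListener = (entry: LogEntry) => void;

function createLogTracker() {
  const pending: LogEntry[] = [];
  let listeners: LogListener[] = [];
  return {
    // Called by the provider once it has chosen a log file path.
    record(entry: LogEntry): void {
      pending.push(entry);
      listeners.forEach((listener) => listener(entry));
    },
    // Drains everything recorded so far, e.g. entries published before a subscriber attached.
    consume(): LogEntry[] {
      return pending.splice(0, pending.length);
    },
    // Registers a listener and returns a function that removes it again.
    subscribe(listener: LogListener): () => void {
      listeners.push(listener);
      return () => {
        listeners = listeners.filter((l) => l !== listener);
      };
    },
  };
}
```

Keeping `consume` alongside `subscribe` lets the CLI pick up paths that were published before it started listening.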
+
+## Permissions
+
+OpenCode can emit `permission.asked` events (e.g., filesystem writes, command execution).
+
+Initial policy:
+- Provide a target option to auto-approve permissions (`once` or `always`) or reject.
+- Default SHOULD be conservative (reject) unless explicitly enabled.
+
+## Risks / Trade-offs
+- **Port management / concurrency:** shared server improves performance but requires careful port selection and isolation.
+- **Trace fidelity:** relying on final `parts` is deterministic but may omit some intermediate streaming deltas.
+- **Permission behavior:** auto-approval increases convenience but raises safety risk; default should remain restrictive.
+
+## Migration Plan
+- Non-breaking addition: new provider kind and target schema fields are additive.
+- Existing targets remain valid.
+
+## Open Questions
+- Should AgentV support connecting to an externally-running OpenCode server (`baseUrl`) in addition to spawning a local server?
+- Should OpenCode be treated as an `AGENT_PROVIDER_KIND` (filesystem access) by default?
+- Which OpenCode “tools” should be enabled/disabled by default when running evals?
diff --git a/openspec/changes/add-opencode-log-streaming/proposal.md b/openspec/changes/add-opencode-log-streaming/proposal.md
@@ -0,0 +1,38 @@
+# Change: Add OpenCode provider support (with stream log artifacts)
+
+## Why
+AgentV currently supports agentic providers that operate over a filesystem and emit tool calls (e.g. Codex, Pi coding agent, VS Code subagent). OpenCode is another popular agent runtime with a well-defined event model (SSE) and structured tool lifecycle.
+
+To evaluate agentic behavior (especially with deterministic evaluators like `tool_trajectory`), AgentV needs:
+- A first-class `opencode` provider kind that can run OpenCode in an isolated per-eval workspace.
+- A stable mapping from OpenCode tool parts into AgentV `outputMessages/toolCalls`.
+- Debug visibility during execution, ideally via per-run stream logs that the CLI can surface early (Codex/Pi pattern).
+
+## What Changes
+- Add a new provider kind: `opencode`.
+- Define required/optional target configuration for OpenCode in `targets.yaml`.
+- Define how the OpenCode provider constructs prompts (system + parts) and executes within a per-eval-case work directory.
+- Define the mapping from OpenCode message parts (especially `tool` parts) into AgentV `ProviderResponse.outputMessages` and `ToolCall` fields.
+- Add a standard mechanism for an OpenCode provider to write per-run “stream logs” to disk (under `.agentv/logs/opencode/` by default).
+- Add a lightweight “log tracker” so the `agentv eval` CLI can surface OpenCode log file paths immediately (same pattern as Codex/Pi).
+- Define the expected log content at a high level (raw event JSONL is the default) so tooling remains stable even if OpenCode’s internal event structure evolves.
+
+## Non-Goals
+- Rich streaming UX in the AgentV terminal (token-by-token output).
+- OpenCode TUI integration.
+- Advanced OpenCode orchestration features beyond single-request evaluation (e.g., long-lived interactive sessions shared across eval cases).
+- New CLI flags or UI features beyond listing log file paths.
+
+## Impact
+- Affected specs:
+  - `evaluation` (OpenCode provider behavior, output mapping, and logging expectations)
+  - `validation` (targets schema updates for `provider: opencode`)
+  - `eval-cli` (CLI surfacing of provider log file paths)
+- Affected code (planned follow-up implementation):
+  - Core: OpenCode provider implementation, target schema updates, log tracker + exports
+  - CLI: subscribe and display OpenCode log paths
+
+## Compatibility
+- Non-breaking. Existing targets and providers are unaffected.
+- Logging remains optional (providers may omit log streaming when disabled or when directories cannot be created).
+
diff --git a/openspec/changes/add-opencode-log-streaming/specs/eval-cli/spec.md b/openspec/changes/add-opencode-log-streaming/specs/eval-cli/spec.md
@@ -0,0 +1,14 @@
+## ADDED Requirements
+
+### Requirement: Surface OpenCode provider log paths
+
+The CLI SHALL surface OpenCode provider log paths when they become available.
+
+#### Scenario: Print OpenCode log path when discovered
+- **WHEN** an OpenCode provider publishes a new log entry `{ filePath, targetName, evalCaseId?, attempt? }`
+- **THEN** the CLI prints the log file path in a dedicated “OpenCode logs” section
+- **AND** does not print the same log path more than once
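A sketch of how the CLI side of this scenario might deduplicate. `createLogPathPrinter` and the `writeLine` callback are hypothetical, and the exact line layout is an assumption; only the "OpenCode logs" section title comes from the scenario.

```typescript
// Hypothetical helper: `writeLine` stands in for however the CLI writes a line of output.
function createLogPathPrinter(writeLine: (line: string) => void) {
  // Prototype-free object used as a string set, so any path is a safe key.
  const printed: Record<string, true> = Object.create(null);
  let headerShown = false;
  return (entry: { filePath: string; targetName: string }): void => {
    if (printed[entry.filePath]) return; // already shown once
    printed[entry.filePath] = true;
    if (!headerShown) {
      writeLine("OpenCode logs:");
      headerShown = true;
    }
    writeLine(`  ${entry.targetName}: ${entry.filePath}`);
  };
}
```

Because the printer only appends lines, it needs no cursor control and can interleave with per-eval progress output.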
+
+#### Scenario: Continue printing progress while logs are emitted
+- **WHEN** OpenCode logs are printed while eval cases are running
+- **THEN** the CLI continues to print per-eval progress lines without requiring interactive cursor control
diff --git a/openspec/changes/add-opencode-log-streaming/specs/evaluation/spec.md b/openspec/changes/add-opencode-log-streaming/specs/evaluation/spec.md
@@ -0,0 +1,78 @@
+## ADDED Requirements
+
+### Requirement: OpenCode provider execution
+
+The system SHALL support an OpenCode-backed provider when a target is configured with `provider: opencode`.
+
+#### Scenario: Execute an eval case with OpenCode
+- **WHEN** a target is configured with `provider: opencode`
+- **AND** an eval case is executed for that target
+- **THEN** the system invokes OpenCode to generate an assistant response
+- **AND** runs OpenCode within an isolated per-eval-case working directory
+- **AND** returns a `ProviderResponse` with `outputMessages` populated
+
+#### Scenario: Provider fails cleanly when OpenCode is unavailable
+- **WHEN** an OpenCode target is selected
+- **AND** the OpenCode runtime cannot be started or reached (missing executable, failed server startup, unreachable base URL)
+- **THEN** the eval case attempt fails with an actionable error message
+- **AND** other eval cases continue when running in parallel
+
+### Requirement: OpenCode provider log streaming artifacts
+
+When an OpenCode-based provider run is executed, the system SHALL support writing a per-run stream log file and surfacing its path for debugging.
+
+#### Scenario: Provider creates an OpenCode stream log file
+- **WHEN** a provider run begins for an OpenCode-backed target
+- **THEN** the provider writes a log file under `.agentv/logs/opencode/` by default (or a configured override)
+- **AND** the provider appends progress entries as the agent executes
+
+#### Scenario: Provider disables OpenCode stream logging
+- **WHEN** OpenCode stream logging is disabled via configuration or environment
+- **THEN** the provider does not create a log file
+- **AND** evaluation continues normally
+
+#### Scenario: Provider cannot create the OpenCode log directory
+- **WHEN** the provider cannot create the configured log directory (permissions, invalid path)
+- **THEN** the provider continues without stream logs
+- **AND** emits a warning in verbose mode only
+
+### Requirement: OpenCode log path publication
+
+The system SHALL provide a mechanism to publish OpenCode log file paths so the CLI can present them to the user as soon as they are created.
+
+#### Scenario: Publish OpenCode log path at run start
+- **WHEN** the provider decides on a log file path for an OpenCode run
+- **THEN** it publishes `{ filePath, targetName, evalCaseId?, attempt? }` to a process-local log tracker
+- **AND** downstream consumers MAY subscribe to this tracker to display the log path
+
+### Requirement: OpenCode tool-call trace mapping
+
+The OpenCode provider SHALL map OpenCode tool lifecycle parts into AgentV tool calls so deterministic evaluators can operate on the trace.
+
+#### Scenario: Tool parts become toolCalls
+- **WHEN** OpenCode returns a response containing one or more `tool` parts
+- **THEN** the provider emits `ProviderResponse.outputMessages` containing `toolCalls`
+- **AND** each tool call includes `tool` name and `input` arguments when available
+- **AND** completed tool calls include `output` when available
+- **AND** tool call identifiers are stable across retries within an attempt when OpenCode provides them
+
+#### Scenario: Tool error parts are surfaced
+- **WHEN** OpenCode returns a `tool` part with error state
+- **THEN** the provider includes the tool call in `toolCalls`
+- **AND** includes the error information in a provider-specific metadata field or in `output` with a structured error payload
+
+### Requirement: OpenCode permission handling
+
+The OpenCode provider SHALL handle OpenCode permission requests deterministically based on target configuration.
+
+#### Scenario: Default permission policy is conservative
+- **WHEN** OpenCode emits a permission request during an eval case
+- **AND** the target does not explicitly enable auto-approval
+- **THEN** the provider rejects the request
+- **AND** the eval attempt fails with a clear error describing the blocked permission
+
+#### Scenario: Auto-approve permissions when configured
+- **WHEN** OpenCode emits a permission request during an eval case
+- **AND** the target is configured to auto-approve permissions
+- **THEN** the provider approves the request according to the configured policy (e.g., once or always)
+- **AND** execution continues normally
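The two permission scenarios reduce to a small decision function. This is a sketch only: `AutoApprovePolicy`, `PermissionDecision`, `decidePermission`, and the error wording are hypothetical, and how the decision is sent back to OpenCode is left out.

```typescript
// Hypothetical option values; anything other than "once" or "always" means reject.
type AutoApprovePolicy = "reject" | "once" | "always";

type PermissionDecision =
  | { action: "approve"; scope: "once" | "always" }
  | { action: "reject"; error: string };

function decidePermission(
  policy: AutoApprovePolicy | undefined,
  permissionTitle: string,
): PermissionDecision {
  if (policy === "once" || policy === "always") {
    return { action: "approve", scope: policy };
  }
  // Unset or "reject": conservative default, with an error that names the blocked permission.
  return { action: "reject", error: `OpenCode permission blocked: ${permissionTitle}` };
}
```

Making the function depend only on target configuration and the request keeps the outcome deterministic across runs.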
diff --git a/openspec/changes/add-opencode-log-streaming/specs/validation/spec.md b/openspec/changes/add-opencode-log-streaming/specs/validation/spec.md
@@ -0,0 +1,20 @@
+## ADDED Requirements
+
+### Requirement: Validate OpenCode targets
+
+The system SHALL validate OpenCode provider targets in `targets.yaml` using Zod schemas, rejecting unknown properties and accepting both snake_case and camelCase forms.
+
+#### Scenario: Accept a valid OpenCode target
+- **WHEN** a targets file contains a target with `provider: opencode`
+- **THEN** the configuration is accepted
+- **AND** the resolved config normalizes to camelCase
+
+#### Scenario: Reject unknown OpenCode target properties
+- **WHEN** an OpenCode target contains an unrecognized property (e.g., `streamlog_dir` instead of `stream_log_dir`)
+- **THEN** validation fails with an error identifying the unknown property path
+
+#### Scenario: Accept snake_case and camelCase equivalence for OpenCode settings
+- **WHEN** an OpenCode target uses `stream_logs` (snake_case)
+- **OR** uses `streamLogs` (camelCase)
+- **THEN** both are accepted as equivalent
+- **AND** the resolved config normalizes to `streamLogs`
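The three scenarios can be illustrated without Zod. The following is a dependency-free stand-in for the schema, not the Zod implementation the requirement calls for. Only `provider`, `stream_logs`/`streamLogs`, and `stream_log_dir` appear in the spec text; the camelCase `streamLogDir` form, the value types (boolean and string), and the function name are assumptions, and a real target would accept more fields.

```typescript
// Dependency-free stand-in for the Zod schema; value types are assumptions.
interface OpenCodeTargetConfig { provider: "opencode"; streamLogs?: boolean; streamLogDir?: string }

// Maps every accepted spelling to its canonical camelCase key.
const KEY_ALIASES: Record<string, "provider" | "streamLogs" | "streamLogDir"> = {
  provider: "provider",
  stream_logs: "streamLogs",
  streamLogs: "streamLogs",
  stream_log_dir: "streamLogDir",
  streamLogDir: "streamLogDir",
};

function normalizeOpenCodeTarget(raw: Record<string, unknown>): OpenCodeTargetConfig {
  const resolved: OpenCodeTargetConfig = { provider: "opencode" };
  for (const key of Object.keys(raw)) {
    const canonical = Object.prototype.hasOwnProperty.call(KEY_ALIASES, key)
      ? KEY_ALIASES[key]
      : undefined;
    const value = raw[key];
    // Unknown keys are rejected rather than silently dropped.
    if (canonical === undefined) throw new Error(`Unknown property: ${key}`);
    if (canonical === "provider") {
      if (value !== "opencode") throw new Error("provider must be opencode");
    } else if (canonical === "streamLogs") {
      if (typeof value !== "boolean") throw new Error(`${key} must be a boolean`);
      resolved.streamLogs = value;
    } else {
      if (typeof value !== "string") throw new Error(`${key} must be a string`);
      resolved.streamLogDir = value;
    }
  }
  return resolved;
}
```

With this shape, a typo such as `streamlog_dir` fails loudly with the offending key in the message instead of being ignored.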
diff --git a/openspec/changes/add-opencode-log-streaming/tasks.md b/openspec/changes/add-opencode-log-streaming/tasks.md
@@ -0,0 +1,18 @@
+## 1. Implementation
+- [ ] 1.1 Add new provider kind `opencode` (core provider registry + aliases)
+- [ ] 1.2 Extend targets schema to support `provider: opencode` and validate settings
+- [ ] 1.3 Implement OpenCode provider invocation (server lifecycle, per-eval-case directory, prompt execution)
+- [ ] 1.4 Map OpenCode `tool` parts into AgentV `outputMessages/toolCalls` for trace-based evaluators
+- [ ] 1.5 Add OpenCode stream log writer (JSONL) and log path tracker (record/consume/subscribe)
+- [ ] 1.6 Export OpenCode log tracker functions from provider index
+- [ ] 1.7 Update `agentv eval` CLI to subscribe and print OpenCode log paths (no duplicates)
+
+## 2. Validation
+- [ ] 2.1 Run `openspec validate add-opencode-log-streaming --strict`
+- [ ] 2.2 Add/update unit tests for:
+	- [ ] targets schema parsing for `opencode` targets
+	- [ ] tool-call mapping from OpenCode parts → AgentV `ToolCall`
+	- [ ] log tracker dedupe behavior (CLI subscriber)
+
+## 3. Documentation
+- [ ] 3.1 Update any relevant skill/docs (if the project uses them for provider setup)