From 03b29f5e5a6d7887abe83cc20d382c73bd0052c8 Mon Sep 17 00:00:00 2001
From: Christopher Tso
Date: Wed, 7 Jan 2026 15:53:40 +1100
Subject: [PATCH] openspec: add copilot-cli provider proposal

---
 .../add-copilot-cli-provider/design.md | 87 +++++++++++++++++++
 .../add-copilot-cli-provider/proposal.md | 36 ++++++++
 .../specs/evaluation/spec.md | 12 +++
 .../specs/validation/spec.md | 14 +++
 .../changes/add-copilot-cli-provider/tasks.md | 24 +++++
 5 files changed, 173 insertions(+)
 create mode 100644 openspec/changes/add-copilot-cli-provider/design.md
 create mode 100644 openspec/changes/add-copilot-cli-provider/proposal.md
 create mode 100644 openspec/changes/add-copilot-cli-provider/specs/evaluation/spec.md
 create mode 100644 openspec/changes/add-copilot-cli-provider/specs/validation/spec.md
 create mode 100644 openspec/changes/add-copilot-cli-provider/tasks.md

diff --git a/openspec/changes/add-copilot-cli-provider/design.md b/openspec/changes/add-copilot-cli-provider/design.md
new file mode 100644
index 00000000..136d80f8
--- /dev/null
+++ b/openspec/changes/add-copilot-cli-provider/design.md
@@ -0,0 +1,87 @@
+## Context
+
+AgentV supports multiple provider kinds:
+- Cloud LLM providers (Azure OpenAI, Anthropic, Gemini)
+- Agent-style providers that operate on a workspace (Codex CLI, VS Code Copilot, Claude Code, Pi)
+
+GitHub Copilot provides a CLI package (`@github/copilot`) that can be invoked via `npx` and interacted with through stdin/stdout. AgentV can adopt this pattern for evaluation runs.
+
+## Goals
+- Add a built-in provider kind that runs GitHub Copilot CLI (`@github/copilot`) as an external process.
+- Keep configuration minimal and consistent with existing CLI-style providers (especially `codex`).
+- Ensure deterministic capture of the “candidate answer” with good error messages and artifacts.
+
+## Proposed Provider Identity
+- Canonical kind: `copilot-cli`
+- Accepted aliases (to reduce user friction): `copilot`, `github-copilot`
+
+Rationale: `copilot` alone is ambiguous with the VS Code Copilot provider; `copilot-cli` makes intent explicit.
+
+## Invocation Strategy
+
+### Base command
+Default to invoking Copilot via npm:
+- `npx -y @github/copilot@`
+
+Rationale:
+- Avoid requiring a global install.
+- Match vibe-kanban’s approach.
+- Pin a version to reduce behavior drift across runs.
+
+### Process I/O
+- Write the rendered prompt to stdin, then close stdin.
+- Capture stdout/stderr.
+
+### Log directory
+- Pass `--log-dir ` and `--log-level debug` when supported.
+- Record the log directory path in the ProviderResponse metadata for debugging.
+
+### Timeout & cancellation
+- Support a target-configured timeout (seconds or ms consistent with AgentV conventions).
+- Abort via `AbortSignal` if provided by the orchestrator.
+
+## Prompt Construction
+Copilot CLI runs in a workspace directory, so the provider should follow the same “agent provider preread” pattern used by `vscode` and `codex`:
+- Include a preread section that links guideline and attachment files via `file://` URLs.
+- Include the user query.
+
+## Response Extraction
+Copilot CLI’s stdout is expected to contain a mixture of progress text and the final assistant message.
+
+Proposed minimal extraction algorithm:
+- Strip ANSI escape sequences.
+- Trim surrounding whitespace.
+- Treat the remaining stdout content as the candidate answer.
+- Preserve full stdout/stderr as artifacts on failures.
+
+If Copilot CLI later provides a stable, documented structured output mode, AgentV MAY add opt-in support in a future change.
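The minimal extraction algorithm under Response Extraction can be sketched in TypeScript as follows. This is an illustrative sketch under stated assumptions, not AgentV's implementation: the names `stripAnsi` and `extractCandidateAnswer` and the regular expression are hypothetical, and treating empty output as an error is an assumption the proposal does not specify.

```typescript
// Hypothetical helpers for illustration; these names do not come from the AgentV codebase.

// Matches CSI sequences (colors, cursor movement) and OSC sequences (e.g. window titles).
// Assumption: these two families cover the escapes Copilot CLI emits.
const ANSI_PATTERN =
  /\u001b\[[0-?]*[ -\/]*[@-~]|\u001b\][^\u0007\u001b]*(?:\u0007|\u001b\\)/g;

// Step 1: strip ANSI escape sequences.
export function stripAnsi(text: string): string {
  return text.replace(ANSI_PATTERN, "");
}

// Steps 2 and 3: trim surrounding whitespace and treat the rest as the candidate answer.
export function extractCandidateAnswer(stdout: string): string {
  const answer = stripAnsi(stdout).trim();
  if (answer.length === 0) {
    // Assumption: empty output is reported as a failure so that the caller can
    // preserve the raw stdout/stderr as artifacts.
    throw new Error("Copilot CLI produced no output on stdout");
  }
  return answer;
}
```

For example, `extractCandidateAnswer("\u001b[32mdone\u001b[0m\n")` returns `"done"`. Progress text that Copilot CLI writes to stdout is not filtered by this sketch, which matches the proposal's decision to treat all remaining stdout as the answer.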
+
+## Target Configuration Surface
+Keep this comparable to `codex`:
+- `provider: copilot-cli`
+- `settings.executable` (optional): defaults to `npx`
+- `settings.args` (optional): appended args; default includes `-y @github/copilot@` and flags
+- `settings.cwd` (optional)
+- `settings.timeoutSeconds` (optional)
+- `settings.env` (optional)
+- `settings.model` (optional)
+
+Avoid overfitting to every Copilot CLI flag initially; allow passthrough args for advanced use.
+
+## Security & Safety Notes
+- Like other agent providers, Copilot CLI can read local files from the workspace.
+- Any “allow all tools” behavior (if exposed) should be opt-in and clearly documented.
+- Prefer defaulting to safer settings, consistent with existing providers.
+
+## Testing Strategy (implementation stage)
+- Unit tests for:
+  - command argument construction
+  - stdout parsing/extraction
+  - timeout handling
+- Integration-style tests (mock runner) that simulate Copilot CLI stdout/stderr.
+
+## Alternatives Considered
+- Use `gh copilot`:
+  - Rejected: requested explicitly to use `@github/copilot` like vibe-kanban.
+- Implement as `cli` provider template:
+  - Rejected: would push complexity to users and lose built-in prompt construction and artifacts.
diff --git a/openspec/changes/add-copilot-cli-provider/proposal.md b/openspec/changes/add-copilot-cli-provider/proposal.md
new file mode 100644
index 00000000..c0707fe8
--- /dev/null
+++ b/openspec/changes/add-copilot-cli-provider/proposal.md
@@ -0,0 +1,36 @@
+# Change: Add GitHub Copilot CLI provider
+
+## Why
+AgentV currently supports running agent-style evaluations via `provider: vscode` (VS Code) and `provider: codex` (Codex CLI), but it does not support the GitHub Copilot CLI package (`@github/copilot`). Teams that standardize on Copilot CLI (often via `npx -y @github/copilot`) cannot evaluate the same prompts/tasks in AgentV without custom wrappers.
+
+This change adds a first-class `copilot-cli` provider so AgentV can invoke Copilot CLI directly and capture responses for evaluation.
+
+## What Changes
+- Add a new target provider kind: `copilot-cli` (GitHub Copilot CLI via `@github/copilot`).
+- Add target configuration fields for Copilot CLI execution (command/executable, args, model, timeout, cwd, env).
+- Implement provider execution by spawning the Copilot CLI process, piping a constructed prompt to stdin, and capturing the final assistant response from stdout.
+- Persist provider artifacts (stdout/stderr and optional log-dir files) for debugging on failures.
+- Update documentation/templates so `agentv init` guidance includes Copilot CLI targets.
+
+## Non-Goals
+- Do not add `gh copilot` (GitHub CLI subcommand) support in this change.
+- Do not add interactive “resume session” UX; evaluations run as independent invocations.
+- Do not introduce a new plugin system; this remains a built-in provider like `codex`/`vscode`.
+
+## Impact
+- Affected specs:
+  - `evaluation` (new provider invocation behavior)
+  - `validation` (targets schema/validation for the new provider)
+- Affected code (implementation stage):
+  - `packages/core/src/evaluation/providers/*` (new provider + provider registry)
+  - `packages/core/src/evaluation/validation/targets-validator.ts`
+  - `apps/cli` docs/templates (provider list + examples)
+
+## Compatibility
+- Backwards compatible: existing targets continue to work unchanged.
+- `provider: copilot-cli` is additive.
+
+## Decisions
+- Canonical provider kind: `copilot-cli`.
+- Accepted provider aliases: `copilot` and `github-copilot`.
+- Output contract: unless/until Copilot CLI exposes a stable machine-readable mode that AgentV supports, the provider treats Copilot CLI stdout as the candidate answer after stripping ANSI escapes and trimming surrounding whitespace.
diff --git a/openspec/changes/add-copilot-cli-provider/specs/evaluation/spec.md b/openspec/changes/add-copilot-cli-provider/specs/evaluation/spec.md
new file mode 100644
index 00000000..9e3e7e90
--- /dev/null
+++ b/openspec/changes/add-copilot-cli-provider/specs/evaluation/spec.md
@@ -0,0 +1,12 @@
+## MODIFIED Requirements
+
+### Requirement: Provider Integration
+The system SHALL integrate with supported providers using target configuration and optional retry settings.
+
+#### Scenario: GitHub Copilot CLI provider
+- **WHEN** a target uses `provider: copilot-cli` (or an accepted alias)
+- **THEN** the system ensures the Copilot CLI launcher is available (defaulting to `npx` when not explicitly configured)
+- **AND** builds a preread prompt document that links guideline and attachment files via `file://` URLs and includes the user query
+- **AND** runs GitHub Copilot CLI via `@github/copilot` with a pinned version by default (configurable), piping the prompt via stdin
+- **AND** captures stdout/stderr and extracts a single candidate answer text from the final assistant output
+- **AND** on failure, the error includes exit code/timeout context and preserves stdout/stderr and any log artifacts for debugging
diff --git a/openspec/changes/add-copilot-cli-provider/specs/validation/spec.md b/openspec/changes/add-copilot-cli-provider/specs/validation/spec.md
new file mode 100644
index 00000000..2f632375
--- /dev/null
+++ b/openspec/changes/add-copilot-cli-provider/specs/validation/spec.md
@@ -0,0 +1,14 @@
+## MODIFIED Requirements
+
+### Requirement: Targets File Schema Validation
+The system SHALL validate target configuration using Zod schemas that serve as both runtime validators and TypeScript type sources.
+
+#### Scenario: Unknown Copilot CLI provider property rejected
+- **WHEN** a targets file contains a Copilot CLI target with an unrecognized property
+- **THEN** the system SHALL reject the configuration with an error identifying the unknown property
+
+#### Scenario: Copilot CLI provider accepts snake_case and camelCase settings
+- **WHEN** a targets file uses `provider: copilot-cli` (or an accepted alias)
+- **AND** configures supported settings using either snake_case or camelCase
+- **THEN** validation succeeds
+- **AND** the resolved config normalizes to camelCase
diff --git a/openspec/changes/add-copilot-cli-provider/tasks.md b/openspec/changes/add-copilot-cli-provider/tasks.md
new file mode 100644
index 00000000..04ee61f3
--- /dev/null
+++ b/openspec/changes/add-copilot-cli-provider/tasks.md
@@ -0,0 +1,24 @@
+## 1. Provider + Targets
+- [ ] 1.1 Add `copilot-cli` to `ProviderKind`, `KNOWN_PROVIDERS`, and `AGENT_PROVIDER_KINDS` (and decide aliases).
+- [ ] 1.2 Extend target parsing to recognize `provider: copilot-cli` (and chosen aliases) and resolve a typed Copilot config.
+- [ ] 1.3 Extend `targets-validator` to accept the Copilot settings keys and reject unknown properties with actionable errors.
+
+## 2. Execution
+- [ ] 2.1 Implement `CopilotCliProvider` (mirroring patterns from `CodexProvider`): spawn process, write prompt to stdin, capture stdout/stderr, enforce timeout.
+- [ ] 2.2 Implement prompt preread rendering consistent with other agent providers (file:// links for guidelines and attachments).
+- [ ] 2.3 Implement robust stdout parsing to extract a single candidate answer; preserve raw artifacts on errors.
+- [ ] 2.4 Register provider in provider factory/registry.
+
+## 3. Docs + Templates
+- [ ] 3.1 Update CLI docs to list `copilot-cli` as a supported provider and add a minimal `targets.yaml` example.
+- [ ] 3.2 Update `apps/cli/src/templates/.claude/skills/agentv-eval-builder/` references so `agentv init` users get Copilot CLI guidance.
+
+## 4. Tests
+- [ ] 4.1 Add unit tests for config resolution and argument rendering.
+- [ ] 4.2 Add provider tests using a mocked runner (no real Copilot CLI dependency) for success, invalid output, and timeout.
+
+## 5. Validation
+- [ ] 5.1 Run `bun run build`, `bun run typecheck`, `bun run lint`, `bun test`.
+
+## 6. Release hygiene
+- [ ] 6.1 Add a changeset if user-visible behavior changes should ship in the next release.