Merged
18 changes: 13 additions & 5 deletions README.md
@@ -212,11 +212,19 @@ flowchart TB
%% MODEL LAYER
%% ═══════════════════════════════════════════════════════════════════════
subgraph Models["Model Layer (VLMs)"]
-direction LR
-CLAUDE["Claude"]
-GPT["GPT-4o"]
-GEMINI["Gemini"]
-QWEN["Qwen-VL"]
+direction TB
+subgraph APIModels["API Models"]
+direction LR
+CLAUDE["Claude"]
+GPT["GPT-4o"]
+GEMINI["Gemini"]
+end
+subgraph OpenSource["Open Source / Fine-tuned"]
+direction LR
+QWEN3["Qwen3-VL"]
+UITARS["UI-TARS"]
+OPENCUA["OpenCUA"]
+end
end

%% ═══════════════════════════════════════════════════════════════════════
1,609 changes: 741 additions & 868 deletions docs/architecture-evolution.md

Large diffs are not rendered by default.

40 changes: 20 additions & 20 deletions docs/architecture.md
@@ -79,26 +79,26 @@ flowchart TB

```mermaid
flowchart LR
-subgraph Record["1. Record"]
-A[User Demo] --> B[Capture Session]
-B --> C[Screenshots + Events]
+subgraph Demonstrate["1. Demonstrate"]
+A[Human Trajectory] --> B[Capture Session]
+B --> C[Observations + Actions]
end

subgraph Store["2. Store"]
C --> D[JSON/Parquet Files]
-D --> E[Demo Library]
+D --> E[Demonstration Library]
end

-subgraph Train["3. Train"]
-E --> F[Data Loading]
-F --> G[Model Training]
+subgraph Learn["3. Learn"]
+E --> F[Trajectory Abstraction]
+F --> G[Policy Learning]
G --> H[Checkpoint]
end

-subgraph Deploy["4. Deploy"]
-H --> I[Agent Policy]
+subgraph Execute["4. Execute"]
+H --> I[Trained Policy]
I --> J[Inference]
-J --> K[Action Replay]
+J --> K[Agent Deployment]
end

subgraph Evaluate["5. Evaluate"]
@@ -164,17 +164,17 @@ graph TD

| Package | Responsibility | Key Exports |
|---------|---------------|-------------|
-| **openadapt-capture** | GUI recording, event capture, storage | `CaptureSession`, `Recorder`, `Action` |
-| **openadapt-ml** | Model training, inference, adapters | `QwenVLAdapter`, `Trainer`, `AgentPolicy` |
+| **openadapt-capture** | Demonstration collection, observation-action capture, storage | `CaptureSession`, `Recorder`, `Action` |
+| **openadapt-ml** | Policy learning, training, inference | `QwenVLAdapter`, `Trainer`, `AgentPolicy` |
| **openadapt-evals** | Benchmark evaluation, metrics | `ApiAgent`, `BenchmarkAdapter`, `evaluate_agent_on_benchmark` |
-| **openadapt-viewer** | HTML visualization, replay viewer | `PageBuilder`, `HTMLBuilder` |
+| **openadapt-viewer** | Trajectory visualization | `PageBuilder`, `HTMLBuilder` |

### Optional Packages

| Package | Responsibility | Use Case |
|---------|---------------|----------|
-| **openadapt-grounding** | UI element localization | Improved click accuracy with element detection |
-| **openadapt-retrieval** | Multimodal demo search | Find similar demonstrations for few-shot prompting |
+| **openadapt-grounding** | UI element grounding | Improved action accuracy with element detection |
+| **openadapt-retrieval** | Multimodal trajectory search | Find similar demonstrations for few-shot policy learning |
| **openadapt-privacy** | PII/PHI scrubbing | Redact sensitive data before storage/training |

## Evaluation Loop
@@ -275,14 +275,14 @@ graph LR
pip install openadapt

# Individual packages
-pip install openadapt[capture] # GUI capture/recording
-pip install openadapt[ml] # ML training and inference
+pip install openadapt[capture] # Demonstration collection
+pip install openadapt[ml] # Policy learning and inference
pip install openadapt[evals] # Benchmark evaluation
-pip install openadapt[viewer] # HTML visualization
+pip install openadapt[viewer] # Trajectory visualization

# Optional packages
-pip install openadapt[grounding] # UI element localization
-pip install openadapt[retrieval] # Demo search/retrieval
+pip install openadapt[grounding] # UI element grounding
+pip install openadapt[retrieval] # Trajectory retrieval
pip install openadapt[privacy] # PII/PHI scrubbing

# Bundles
Binary file modified docs/assets/architecture-diagram.png
34 changes: 17 additions & 17 deletions docs/cli.md
@@ -42,11 +42,11 @@ This verifies:

## Capture Commands

-Commands for recording user demonstrations.
+Commands for collecting human demonstrations.

### capture start

-Start a new recording session.
+Start a new demonstration collection session.

```bash
openadapt capture start --name <name> [options]
@@ -64,25 +64,25 @@ openadapt capture start --name <name> [options]
**Examples:**

```bash
-# Basic recording
+# Basic demonstration collection
openadapt capture start --name login-task

-# Recording without screenshots
+# Demonstration collection without screenshots
openadapt capture start --name audio-task --no-screenshots

-# Recording with slower screenshot interval
+# Demonstration collection with slower screenshot interval
openadapt capture start --name slow-task --interval 1.0
```

### capture stop

-Stop the current recording.
+Stop the current demonstration collection.

```bash
openadapt capture stop
```

-Alternatively, press `Ctrl+C` in the recording terminal.
+Alternatively, press `Ctrl+C` in the capture terminal.

### capture list

@@ -103,7 +103,7 @@ form-fill 89 5m 42s 2026-01-14

### capture view

-Open the viewer for a capture.
+Open the trajectory viewer for a demonstration.

```bash
openadapt capture view <name> [options]
@@ -113,13 +113,13 @@ openadapt capture view <name> [options]

| Argument | Required | Description |
|----------|----------|-------------|
-| `<name>` | Yes | Name of the capture to view |
+| `<name>` | Yes | Name of the demonstration to view |
| `--port` | No | Server port (default: 8080) |
| `--no-browser` | No | Don't open browser automatically |
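
As a minimal sketch of how the arguments in the table above combine (not taken from the diff itself): the demonstration name `login-task` is borrowed from the `capture start` examples and is only a placeholder, and port 9000 is an arbitrary choice. The snippet uses only `--port` and `--no-browser` as documented above, and it falls back to printing the command if the `openadapt` CLI is not on `PATH`.

```bash
# Hypothetical demonstration name; substitute one shown by `openadapt capture list`.
name="login-task"

# Build the command from the arguments documented in the table above.
set -- openadapt capture view "$name" --port 9000 --no-browser

if command -v openadapt >/dev/null 2>&1; then
    "$@"                   # run the viewer command with the chosen options
else
    echo "would run: $*"   # CLI not installed: only show the command
fi
```

Building the argument list with `set --` keeps the name safely quoted even if it contains spaces.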

### capture delete

-Delete a capture.
+Delete a demonstration.

```bash
openadapt capture delete <name>
@@ -129,11 +129,11 @@ openadapt capture delete <name>

## Train Commands

-Commands for training ML models.
+Commands for policy learning from demonstrations.

### train start

-Start training a model on a capture.
+Start policy learning from a demonstration.

```bash
openadapt train start --capture <name> --model <model> [options]
@@ -143,7 +143,7 @@ openadapt train start --capture <name> --model <model> [options]

| Argument | Required | Description |
|----------|----------|-------------|
-| `--capture` | Yes | Name of the capture to train on |
+| `--capture` | Yes | Name of the demonstration to train on |
| `--model` | Yes | Model architecture |
| `--epochs` | No | Number of training epochs (default: 10) |
| `--batch-size` | No | Batch size (default: 4) |
@@ -159,10 +159,10 @@ openadapt train start --capture <name> --model <model> [options]
**Examples:**

```bash
-# Basic training
+# Basic policy learning
openadapt train start --capture login-task --model qwen3vl-2b

-# Training with custom parameters
+# Policy learning with custom parameters
openadapt train start \
--capture login-task \
--model qwen3vl-7b \
@@ -173,7 +173,7 @@ openadapt train start \

### train status

-Check training progress.
+Check policy learning progress.

```bash
openadapt train status
@@ -191,7 +191,7 @@ ETA: 15 minutes

### train stop

-Stop the current training.
+Stop the current policy learning.

```bash
openadapt train stop