37 changes: 0 additions & 37 deletions experiments/osworld_docker_test.py

This file was deleted.

@@ -31,7 +31,7 @@ The main entry point `experiments/run_osworld.py` is currently configured with h
2. **Environment Variables:**
- `AGENTLAB_DEBUG=1`: Automatically runs the debug subset (7 tasks from `osworld_debug_task_ids.json`)

### Running OSWorld Tasks
### Task subsets

We provide different subsets of tasks:

@@ -42,10 +42,28 @@ We provide different subsets of tasks:
### Example Commands

```bash
# Run with default debug subset (7 tasks)
# Run with default debug subset using sequential execution in VMware VM
python experiments/run_osworld.py
```

### Parallel Execution with Docker
To run OSWorld in parallel using Docker, first make sure Docker is installed and configured.
For installation, follow the [Docker setup](https://github.com/xlang-ai/OSWorld?tab=readme-ov-file#docker-server-with-kvm-support-for-better-performance) section of the OSWorld README.
Ensure that your Docker installation supports KVM, as OSWorld requires it for running VMs.
We also recommend pulling the latest Docker image for OSWorld before running the benchmark:

```bash
docker pull happysixd/osworld-docker
```
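Since KVM support is required, a quick pre-flight check can save a failed run. The sketch below is not part of the repository; it assumes a Linux host where KVM is exposed through the standard `/dev/kvm` device node, and the helper name `kvm_available` is hypothetical. It only checks the host side, not whether the container is started with access to the device.

```python
import os


def kvm_available(device="/dev/kvm"):
    """Return True if the KVM device node exists and is readable and writable.

    Assumption: a Linux host that exposes KVM at /dev/kvm. This says nothing
    about whether Docker is configured to pass the device to containers.
    """
    return os.path.exists(device) and os.access(device, os.R_OK | os.W_OK)


if __name__ == "__main__":
    if kvm_available():
        print("KVM device found and accessible")
    else:
        print("KVM device missing or not accessible to this user")
```

If the check fails, the usual causes are virtualization being disabled in firmware or the current user lacking permission on the device.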

After setting up Docker, you can change the `use_vmware` parameter in the script to `False` and run:

```bash
python experiments/run_osworld.py
```
You can control the number of parallel jobs by setting the `n_jobs` parameter in the script; the default is 4.
We recommend setting `n_jobs` to `your_number_of_cpu_cores - 2` to leave some resources for the host system and the benchmark itself.
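As an illustration of the recommendation above, the value could be derived from the host's core count rather than hard-coded. This is a minimal sketch, not code from `run_osworld.py`; the helper name `recommended_n_jobs` and its `reserved` parameter are hypothetical, and the result is floored at 1 so small machines still get one worker.

```python
import os


def recommended_n_jobs(cpu_cores=None, reserved=2):
    """Return cpu_cores - reserved, but never less than 1.

    cpu_cores defaults to os.cpu_count(), which may return None on some
    platforms; in that case a single core is assumed.
    """
    if cpu_cores is None:
        cpu_cores = os.cpu_count() or 1
    return max(1, cpu_cores - reserved)


if __name__ == "__main__":
    # Print a suggested value to copy into the script's n_jobs parameter.
    print(recommended_n_jobs())
```

On an 8-core host this suggests 6 jobs, leaving two cores for the host system and the benchmark process.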


### Configuration Notes
