154 commits
5e371e5
Init template for Overcooked environment implementation
mmbajo Aug 11, 2025
c93efef
Refactor Overcooked environment for single-agent gameplay
mmbajo Aug 12, 2025
55d0987
Update .gitignore to include dsym files
mmbajo Sep 10, 2025
4cd598c
Refactor Overcooked environment dimensions and grid layout
mmbajo Sep 10, 2025
319e514
Fix position validation in Overcooked environment
mmbajo Sep 10, 2025
8b16850
Add interaction handling in Overcooked environment
mmbajo Sep 10, 2025
72641d6
Add item management and agent color rendering in Overcooked environment
mmbajo Sep 11, 2025
3eefe9f
Add new assets for Overcooked environment
mmbajo Sep 11, 2025
e2d3138
Enhance Overcooked environment with detailed texture support
mmbajo Sep 11, 2025
c5a93b8
Update chef sprite textures in Overcooked environment
mmbajo Sep 11, 2025
9fa3cfd
Edit the rendering logic for ingredient box. We only use Onions for now.
mmbajo Sep 11, 2025
816b92b
Add cooking mechanics to Overcooked environment
mmbajo Sep 11, 2025
5f694b7
Enhance Overcooked gameplay with plated soup mechanics
mmbajo Sep 11, 2025
55a892c
Add Overcooked configuration file
mmbajo Sep 12, 2025
190fd9d
Update Overcooked configuration to adjust agent settings
mmbajo Sep 12, 2025
b17af77
Enhance Overcooked environment for multi-agent gameplay
mmbajo Sep 12, 2025
1f4443b
Refactor observation handling in Overcooked environment
mmbajo Sep 12, 2025
cd66213
Refactor observation structure in Overcooked environment
mmbajo Sep 12, 2025
c796718
Add dish evaluation and reward system in Overcooked environment
mmbajo Sep 12, 2025
8faebff
Update Overcooked environment configuration for gameplay balance
mmbajo Sep 12, 2025
3c7f9cb
Add dish serving evaluation and enhance rendering in Overcooked envir…
mmbajo Sep 12, 2025
7e9ca00
Add Overcooked game type to environment configuration
mmbajo Sep 13, 2025
40f01d9
Implement neural network support in Overcooked environment
mmbajo Sep 13, 2025
3849caf
Add user-defined statistics tracking in Overcooked environment
mmbajo Sep 13, 2025
18176f6
Refactor dish evaluation logic in Overcooked environment
mmbajo Sep 13, 2025
d2f7ff7
Refactor dish evaluation logic in Overcooked environment
mmbajo Sep 13, 2025
548fb50
Enhance observation structure in Overcooked environment
mmbajo Sep 14, 2025
cfc3638
Refactor observation computation in Overcooked environment
mmbajo Sep 16, 2025
5bdc016
Initialize episode counter and update performance metrics in Overcook…
mmbajo Sep 16, 2025
58f9ac7
Update observation vector size in Overcooked environment
mmbajo Sep 16, 2025
1457f65
Attempt fix for proper logging of user stats
mmbajo Sep 16, 2025
6acfb5d
Reset log fields in Overcooked environment to ensure accurate trackin…
mmbajo Sep 18, 2025
1ae9cde
Add sweep configuration for training parameters in Overcooked environ…
mmbajo Sep 18, 2025
82857d8
Enhance item handling in Overcooked environment
mmbajo Sep 19, 2025
015f1b9
Fix item drop condition in Overcooked environment
mmbajo Sep 19, 2025
675776f
Include rewards in obs for better credit assignment in Overcooked env.
mmbajo Sep 19, 2025
0e28c2d
Enhance observation vector in Overcooked environment
mmbajo Sep 20, 2025
d2b6eca
Refactor distance calculations and observation handling in Overcooked…
mmbajo Sep 20, 2025
6bee4b3
Update coordinate types and frame rate in Overcooked environment
mmbajo Sep 20, 2025
55a22b8
Add wall texture and update grid representation in Overcooked environ…
mmbajo Sep 20, 2025
6e06f2f
Refactor proximity feature calculations in Overcooked environment
mmbajo Sep 20, 2025
f534a35
Refactor item type definitions and enhance observation panel in Overc…
mmbajo Sep 20, 2025
4a0413b
Update observation size and enhance position calculation in Overcooke…
mmbajo Sep 20, 2025
c785ee9
Refactor absolute position calculation in compute_observations function
mmbajo Sep 20, 2025
13a4861
Refactor and clean up Overcooked.h file
mmbajo Sep 22, 2025
c63f6de
Update reward system and function signatures in Overcooked environment
mmbajo Sep 22, 2025
4df28fd
Merge branch '3.0' into roze-overcooked-dev
mmbajo Sep 22, 2025
0107e7b
Update training parameters in Overcooked configuration
mmbajo Sep 22, 2025
60d47cb
This config gets over 0.5 explained variance!
mmbajo Sep 22, 2025
324689e
Update Overcooked environment for single agent gameplay
mmbajo Sep 26, 2025
30f44fb
Test 1 agent config to verify learning - still cant learn fully
mmbajo Sep 26, 2025
7cb4dbc
Remove teammate mirroring since its redundant - we put everything int…
mmbajo Oct 7, 2025
24b4fdb
Add TODO comments for ingredient handling in Overcooked environment
mmbajo Oct 9, 2025
896912a
Add TODO comment to generalize reward handling in evaluate_dish_serve…
mmbajo Oct 9, 2025
3b08355
Remove debug observation printing and unused debug flag from Overcook…
mmbajo Oct 9, 2025
e15e469
Add README for Overcooked environment -> mainly describes reward and …
mmbajo Oct 9, 2025
3435db1
Update training parameters
mmbajo Oct 9, 2025
6add1dc
Update readme
mmbajo Oct 9, 2025
d2fe6c0
Bugfix: Update ingredient limits and observation logic in Overcooked …
mmbajo Oct 10, 2025
94045ff
Refactor reward system in Overcooked environment
mmbajo Oct 10, 2025
3dd554e
Refactor cooking state management in Overcooked environment
mmbajo Oct 10, 2025
48a7ce7
Fix dish serving logic in Overcooked environment
mmbajo Oct 10, 2025
70c717b
Fix wall detection logic in Overcooked environment
mmbajo Oct 11, 2025
e8a112a
Merge remote-tracking branch 'upstream/3.0' into roze-overcooked-dev
mmbajo Dec 6, 2025
038121a
Add reward for ingredient picked in Overcooked environment
mmbajo Dec 6, 2025
6983c2a
Working commit! Update Overcooked environment configuration and logging
mmbajo Dec 26, 2025
4d3c1b4
Refactor Overcooked environment setup and configuration
mmbajo Dec 26, 2025
c296183
Update Overcooked weights binary file
mmbajo Dec 26, 2025
bd1370f
Update path for Overcooked weights binary file location
mmbajo Dec 26, 2025
2d45c82
Update Overcooked weights binary file to new version
mmbajo Dec 27, 2025
9dea55b
Increase target FPS in Overcooked rendering
mmbajo Dec 27, 2025
b79ab2e
Add Overcooked types header file
mmbajo Dec 28, 2025
61541c4
Add Overcooked items management functions
mmbajo Dec 28, 2025
2501853
Add Overcooked observations header
mmbajo Dec 28, 2025
55a12ec
Add Overcooked game logic header
mmbajo Dec 28, 2025
b78556c
Add Overcooked rendering functionality
mmbajo Dec 28, 2025
72631d5
Add include guards to Overcooked header
mmbajo Dec 28, 2025
78f643a
Update Overcooked README to reflect changes in observation and action…
mmbajo Dec 28, 2025
dce3fcd
Add StaticCache struct to overcooked_types.h for optimized tile posit…
mmbajo Dec 28, 2025
eb05d34
Add cache field to Overcooked struct for efficient tile position lookup
mmbajo Dec 28, 2025
2c5a03f
Speed-up attempt by caching. Add static cache initialization in Overc…
mmbajo Dec 28, 2025
43425ec
Speed-up attempt by caching. Add cached tile proximity computation to…
mmbajo Dec 28, 2025
3830ccc
Breaking change! Refactor observation computation in Overcooked to ut…
mmbajo Dec 28, 2025
f722014
Breaking change! Caching! Refactor observation computation in Overcoo…
mmbajo Dec 28, 2025
a26bca9
Breaking Change! Cache SERVING_AREA. Refactor compute_observations to…
mmbajo Dec 28, 2025
4067579
Breaking change! Cache STOVE position.
mmbajo Dec 28, 2025
776ef56
Breaking change! Cached Pot position!
mmbajo Dec 28, 2025
331242f
Refactor find_nearest_empty_counter to use cached counter positions f…
mmbajo Dec 28, 2025
7b0b91c
Refactor proximity feature computation for plated soup in Overcooked
mmbajo Dec 28, 2025
75ed4f6
Add pot_index_grid to Overcooked struct for mapping grid cells to pot…
mmbajo Dec 28, 2025
7bf5eda
Add pot index initialization to Overcooked environment setup
mmbajo Dec 28, 2025
a5c606c
Add fast pot lookup function to Overcooked for improved efficiency
mmbajo Dec 28, 2025
abf7749
Optimize pot retrieval in Overcooked by replacing get_pot_at with get…
mmbajo Dec 28, 2025
4216b16
Optimize pot retrieval in compute_observations by replacing get_pot_a…
mmbajo Dec 28, 2025
774c80a
Remove deprecated get_pot_at function from Overcooked, streamlining p…
mmbajo Dec 28, 2025
af52b6b
Free allocated memory for pot_index_grid in c_close function to preve…
mmbajo Dec 28, 2025
7edf4d2
Refactor pot retrieval in Overcooked to use get_pot_at function for c…
mmbajo Dec 28, 2025
3690f78
Refactor for speed-up. Add item grid initialization to Overcooked env…
mmbajo Dec 29, 2025
1b286aa
Add fast item lookup and reset functionality in Overcooked
mmbajo Dec 29, 2025
a47368f
Enhance item management in Overcooked with optimized add and remove f…
mmbajo Dec 29, 2025
f98969e
Remove deprecated get_item_at function from Overcooked, streamlining …
mmbajo Dec 29, 2025
4b0659b
Rename get_item_at_fast to get_item_at for clarity in item retrieval …
mmbajo Dec 29, 2025
b81903d
Optimize observation clearing in Overcooked by replacing the manual l…
mmbajo Dec 29, 2025
483d0aa
Refactor find_nearest_plated_soup function in Overcooked to return th…
mmbajo Dec 29, 2025
c5e37da
Add agent_position_mask to Overcooked types for agent presence tracking
mmbajo Dec 29, 2025
db93386
Add agent position management functions in Overcooked
mmbajo Dec 29, 2025
fea5e72
Initialize agent positions during environment reset in Overcooked
mmbajo Dec 29, 2025
dffedd6
Enhance agent movement management in Overcooked
mmbajo Dec 29, 2025
709ed5b
Refactor position validation logic in Overcooked
mmbajo Dec 29, 2025
2542385
Refactor agent structure in Overcooked for improved clarity
mmbajo Dec 29, 2025
9eea34b
Enhance alignment of Client struct in Overcooked
mmbajo Dec 29, 2025
848667b
Update Item struct alignment and data types in Overcooked
mmbajo Dec 29, 2025
76e999f
Update README.md for Overcooked environment
mmbajo Dec 29, 2025
b64852f
Add ASYMMETRIC_ADVANTAGES constant to Overcooked types
mmbajo Dec 29, 2025
4df205b
Add layout definitions and spawn position management in Overcooked types
mmbajo Dec 29, 2025
f1d5773
Add layout utility functions in Overcooked types
mmbajo Dec 29, 2025
d3ca897
Refactor grid parsing in Overcooked to utilize layout information
mmbajo Dec 29, 2025
165a330
Update agent spawn logic in Overcooked to utilize layout information
mmbajo Dec 29, 2025
aff7e43
Refactor agent respawn logic in Overcooked to utilize layout spawn po…
mmbajo Dec 29, 2025
9bc2ef6
Initialize Overcooked environment dimensions based on layout information
mmbajo Dec 29, 2025
fee30b1
Enhance Overcooked main function to support dynamic layout selection
mmbajo Dec 29, 2025
0576981
Update Overcooked environment initialization to set layout_id
mmbajo Dec 29, 2025
2751bfa
Refactor Overcooked environment initialization to support layout sele…
mmbajo Dec 29, 2025
2cae358
Update Overcooked configuration to include layout parameter and enhan…
mmbajo Dec 29, 2025
967da8c
Implement dynamic weight selection for Overcooked based on layout
mmbajo Dec 29, 2025
dfe8d4c
Update agent position mask type in Overcooked to uint64_t for improve…
mmbajo Dec 29, 2025
ba5231a
Add FORCED_COORDINATION layout
mmbajo Dec 29, 2025
06de006
Add forced coordination layout to Overcooked types
mmbajo Dec 29, 2025
ec47f7e
Add weight file and configuration for forced coordination layout in O…
mmbajo Dec 29, 2025
f3b1ed4
Add function to find nearest item by type in Overcooked
mmbajo Dec 29, 2025
7d4a563
Update observation dimensions in Overcooked to 43
mmbajo Dec 29, 2025
6ca2a59
Add plate picked reward to Overcooked environment
mmbajo Dec 30, 2025
ac2e049
Update Overcooked configuration for forced coordination layout
mmbajo Dec 30, 2025
c3a2d5d
Update weights size for forced coordination layout in Overcooked
mmbajo Dec 30, 2025
b2f9b1b
Fix weights size for Overcooked configuration
mmbajo Dec 30, 2025
14c420e
Update weights size for asymmetric advantages layout in Overcooked
mmbajo Dec 30, 2025
eab5bc8
Add COORDINATION_RING layout to Overcooked types
mmbajo Dec 30, 2025
f993a24
Add coordination ring layout to Overcooked configuration
mmbajo Dec 30, 2025
96cfd07
Add new layouts to Overcooked README
mmbajo Dec 30, 2025
ff89120
Update weights file for coordination ring layout in Overcooked
mmbajo Dec 30, 2025
d4182b2
Add counter circuit layout to Overcooked configuration
mmbajo Dec 30, 2025
505ae2b
Add binary weights file for Overcooked layout
mmbajo Dec 30, 2025
365c01c
Fix observation display in Overcooked rendering
mmbajo Dec 30, 2025
4f57557
Fix logging of episode_return in Overcooked
mmbajo Dec 31, 2025
df0d720
Enhance reward calculation in dish serving logic for Overcooked
mmbajo Jan 1, 2026
698145c
Update binary weights file for Overcooked
mmbajo Jan 1, 2026
10f20da
Update COUNTER_CIRCUIT binary weights file for Overcooked
mmbajo Jan 3, 2026
efd033b
Update FORCED_COORDINATION binary weights file for Overcooked
mmbajo Jan 3, 2026
63f7480
Update COORDINATION_RING binary weights file for Overcooked
mmbajo Jan 3, 2026
f71b9cb
Update Overcooked configuration
mmbajo Jan 3, 2026
794df59
Enhance ingredient rendering in Overcooked
mmbajo Jan 3, 2026
fa9112a
Refactor cooking texture logic in Overcooked rendering
mmbajo Jan 3, 2026
398d89e
Refactor Overcooked rendering logic
mmbajo Jan 3, 2026
105275d
Remove unused agent rendering logic in Overcooked
mmbajo Jan 3, 2026
3 changes: 3 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -162,3 +162,6 @@ pufferlib/ocean/impulse_wars/*-release/
pufferlib/ocean/impulse_wars/debug-*/
pufferlib/ocean/impulse_wars/release-*/
pufferlib/ocean/impulse_wars/benchmark/

# dsym files
*.dSYM/
60 changes: 60 additions & 0 deletions pufferlib/config/ocean/overcooked.ini
@@ -0,0 +1,60 @@
[base]
package = ocean
env_name = puffer_overcooked
policy_name = Policy
rnn_name = Recurrent

[env]
num_envs = 4096
num_agents = 2
layout = cramped_room
reward_dish_served_whole_team = 1.0
reward_dish_served_agent = 0.0
reward_pot_started = 0.15
reward_ingredient_added = 0.15
reward_ingredient_picked = 0.05
reward_plate_picked = 0.05
reward_soup_plated = 0.20
reward_wrong_dish_served = 0.0
reward_step_penalty = 0.0

[train]
total_timesteps = 100_000_000
learning_rate = 0.01
minibatch_size = 32768
gamma = 0.99
ent_coef = 0.02
gae_lambda = 0.97
clip_coef = 0.15
anneal_lr = True

[sweep]
method = Protein
metric = n
goal = maximize
downsample = 1

[sweep.train.learning_rate]
type = log_normal
min = 0.0001
max = 0.01

[sweep.train.ent_coef]
type = log_normal
min = 0.01
max = 0.30

[sweep.train.clip_coef]
type = log_normal
min = 0.05
max = 0.30

[sweep.train.gamma]
type = logit_normal
min = 0.90
max = 0.999

[sweep.train.gae_lambda]
type = logit_normal
min = 0.90
max = 0.999
1 change: 1 addition & 0 deletions pufferlib/ocean/environment.py
@@ -156,6 +156,7 @@ def make_multiagent(buf=None, **kwargs):
'checkers': 'Checkers',
'asteroids': 'Asteroids',
'whisker_racer': 'WhiskerRacer',
'overcooked': 'Overcooked',
'onestateworld': 'World',
'onlyfish': 'OnlyFish',
'chain_mdp': 'Chain',
247 changes: 247 additions & 0 deletions pufferlib/ocean/overcooked/README.md
@@ -0,0 +1,247 @@
# Overcooked Environment

A multi-agent cooking coordination environment where agents cooperate to prepare and serve onion soup. Based on the popular Overcooked video game, this environment tests agents' ability to coordinate, divide labor, and work together efficiently.

## File Structure

```
overcooked/
├── overcooked.h # Main entry point (init, reset, step, close)
├── overcooked_types.h # Constants, enums, and struct definitions
├── overcooked_items.h # Item and cooking pot management
├── overcooked_obs.h # Observation computation
├── overcooked_logic.h # Game logic (interaction, movement, cooking)
├── overcooked_render.h # Rendering and texture management
├── binding.c # Python bindings
└── overcooked.py # Python environment wrapper
```

## Observation Space

**39-dimensional vector per agent** — *see [compute_observations](overcooked_obs.h#L81)*

### Player Features (34 dims)
- **Orientation** (4): One-hot encoding of facing direction — [overcooked_obs.h:101-103](overcooked_obs.h#L101-L103)
- **Held Object** (4): One-hot encoding (onion, plated_soup, plate, empty) — [overcooked_obs.h:105-116](overcooked_obs.h#L105-L116)
- **Proximity Features** (12): Normalized (dx, dy) to nearest — [overcooked_obs.h:118-167](overcooked_obs.h#L118-L167):
  - Onion source (ingredient box)
  - Dish source (plate box)
  - Plated soup on counter
  - Serving area
  - Empty counter
  - Pot (stove)
- **Nearest Soup Ingredients** (2): Onion/tomato counts in nearest plated soup or held soup (normalized) — [overcooked_obs.h:169-179](overcooked_obs.h#L169-L179)
- **Pot Soup Ingredients** (2): Onion/tomato counts in nearest pot (normalized) — [overcooked_obs.h:181-202](overcooked_obs.h#L181-L202)
- **Pot Existence** (1): Binary flag for reachable pot — [overcooked_obs.h:205](overcooked_obs.h#L205)
- **Pot State** (4): Binary flags (empty, full, cooking, ready) — [overcooked_obs.h:207-215](overcooked_obs.h#L207-L215)
- **Cooking Time** (1): Remaining cook time (normalized) — [overcooked_obs.h:217-223](overcooked_obs.h#L217-L223)
- **Wall Detection** (4): Binary flags for walls/obstacles (up, down, left, right) — [overcooked_obs.h:225-235](overcooked_obs.h#L225-L235)
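
As a rough illustration of the proximity features, the sketch below computes a normalized (dx, dy) offset to the nearest tile of a given kind. The function name, the Manhattan-distance tie-breaking, and the normalization by grid width and height are assumptions made for this example; the actual computation lives in `overcooked_obs.h` and may differ.

```python
def nearest_offset(agent, targets, width, height):
    """Return a normalized (dx, dy) from `agent` to the closest target.

    Assumptions (not confirmed by the source): distance is Manhattan,
    offsets are divided by the grid width/height, and (0.0, 0.0) is
    returned when no target of that kind exists.
    """
    if not targets:
        return 0.0, 0.0
    ax, ay = agent
    tx, ty = min(targets, key=lambda t: abs(t[0] - ax) + abs(t[1] - ay))
    return (tx - ax) / width, (ty - ay) / height


# Agent at (1, 2) on a 5x5 grid with ingredient boxes at (0, 1) and (4, 1).
offset = nearest_offset((1, 2), [(0, 1), (4, 1)], width=5, height=5)
```

Six such pairs (one per target type listed above) would account for the 12 proximity dimensions.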

### Spatial Features (4 dims)
- **Teammate Relative Position** (2): Normalized (dx, dy) to other agent — [overcooked_obs.h:237-248](overcooked_obs.h#L237-L248)
- **Absolute Position** (2): Normalized (x, y) coordinates — [overcooked_obs.h:250-252](overcooked_obs.h#L250-L252)

### Context (1 dim)
- **Reward** (1): Current step reward — [overcooked_obs.h:255](overcooked_obs.h#L255)
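
The group sizes listed above can be checked with a short sketch that lays the features out in a flat vector. The group names are illustrative labels rather than identifiers from `overcooked_obs.h`, and the ordering is assumed to follow the order of the list; only the sizes come from this README.

```python
# Illustrative only: names are hypothetical labels, sizes come from the
# feature list above, and the ordering is assumed to match that list.
OBS_GROUPS = [
    ("orientation", 4),
    ("held_object", 4),
    ("proximity", 12),               # 6 targets x (dx, dy)
    ("nearest_soup_ingredients", 2),
    ("pot_soup_ingredients", 2),
    ("pot_exists", 1),
    ("pot_state", 4),
    ("cooking_time", 1),
    ("walls", 4),
    ("teammate_rel_pos", 2),
    ("abs_pos", 2),
    ("reward", 1),
]


def obs_slices(groups):
    """Map each group name to its slice in the flat observation vector."""
    slices, start = {}, 0
    for name, size in groups:
        slices[name] = slice(start, start + size)
        start += size
    return slices, start


SLICES, OBS_SIZE = obs_slices(OBS_GROUPS)
```

With these sizes, `OBS_SIZE` comes out to 39 (34 player, 4 spatial, 1 context), matching the totals stated in this section.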

## Action Space

**6 discrete actions** — *see [c_step](overcooked.h#L77)*
- 0: No-op — [ACTION_NOOP](overcooked_types.h#L43)
- 1: Move up — [ACTION_UP](overcooked_types.h#L44)
- 2: Move down — [ACTION_DOWN](overcooked_types.h#L45)
- 3: Move left — [ACTION_LEFT](overcooked_types.h#L46)
- 4: Move right — [ACTION_RIGHT](overcooked_types.h#L47)
- 5: Interact (pick up/place items, use equipment) — [ACTION_INTERACT](overcooked_types.h#L48)
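
For reference when writing a policy or a scripted agent, the action indices can be mirrored on the Python side. This is a minimal sketch: the `Action` enum and `target_cell` helper are hypothetical, and the movement deltas assume that y grows downward (screen coordinates), which the README does not state.

```python
from enum import IntEnum


class Action(IntEnum):
    """Action indices as listed above (mirrors the ACTION_* constants)."""
    NOOP = 0
    UP = 1
    DOWN = 2
    LEFT = 3
    RIGHT = 4
    INTERACT = 5


# Assumption: y increases downward, so "up" means y - 1.
MOVE_DELTAS = {
    Action.UP: (0, -1),
    Action.DOWN: (0, 1),
    Action.LEFT: (-1, 0),
    Action.RIGHT: (1, 0),
}


def target_cell(x, y, action):
    """Cell the agent would try to enter; unchanged for NOOP and INTERACT."""
    dx, dy = MOVE_DELTAS.get(Action(action), (0, 0))
    return x + dx, y + dy
```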

## Reward System

*See [evaluate_dish_served](overcooked_logic.h#L229) and [handle_interaction](overcooked_logic.h#L106)*

### Main Rewards
- **Correct dish served** (3 onions): +1.0 (shared), +0.0 (server bonus) — [overcooked_logic.h:237-241](overcooked_logic.h#L237-L241)
- **Wrong dish served** (incorrect recipe): +0.0 (shared) — [overcooked_logic.h:252-258](overcooked_logic.h#L252-L258)
- **Step penalty**: 0.0 — [overcooked.h:80](overcooked.h#L80)

### Intermediate Rewards
- **Pick up ingredient**: +0.05 — [overcooked_logic.h:221](overcooked_logic.h#L221)
- **Pick up plate**: +0.05 (`reward_plate_picked` in `config/ocean/overcooked.ini`)
- **Add onion to pot**: +0.15 — [overcooked_logic.h:133](overcooked_logic.h#L133)
- **Start cooking** (3 onions in pot): +0.15 — [overcooked_logic.h:145-147](overcooked_logic.h#L145-L147)
- **Plate cooked soup**: +0.20 — [overcooked_logic.h:159](overcooked_logic.h#L159)
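
To get a feel for how the shaping terms compare with the main reward, the sketch below adds up the rewards for one complete 3-onion soup. The values come from the bullets above plus `reward_plate_picked` from `config/ocean/overcooked.ini`. It assumes each of the three onions earns both the pick-up and the add-to-pot reward, and it counts the shared serving reward once; how rewards are actually split across agents is not covered here.

```python
# Reward values as given in this README and in config/ocean/overcooked.ini.
REWARDS = {
    "ingredient_picked": 0.05,
    "ingredient_added": 0.15,
    "pot_started": 0.15,
    "plate_picked": 0.05,
    "soup_plated": 0.20,
    "dish_served_whole_team": 1.0,
}
ONIONS_PER_SOUP = 3


def shaped_return_per_soup(rewards=REWARDS, onions=ONIONS_PER_SOUP):
    """Sum of rewards for one full cook-and-serve cycle (assumptions above)."""
    per_onion = rewards["ingredient_picked"] + rewards["ingredient_added"]
    return (
        onions * per_onion
        + rewards["pot_started"]
        + rewards["plate_picked"]
        + rewards["soup_plated"]
        + rewards["dish_served_whole_team"]
    )


total = shaped_return_per_soup()
```

Under these assumptions the intermediate rewards sum to roughly the same magnitude as the serving reward itself, which is worth keeping in mind when tuning the shaping terms.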

## Recipe

The correct recipe requires **exactly 3 onions** in the soup. Agents must:
1. Pick up onions from ingredient boxes
2. Add 3 onions to a pot
3. Start cooking (interact with pot when empty-handed)
4. Wait for soup to cook (20 steps)
5. Pick up a plate from plate box
6. Plate the cooked soup (interact with pot while holding plate)
7. Deliver plated soup to serving area
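
The pot's part of these steps can be sketched as a small state machine. The `Pot` class below is hypothetical and written in Python for readability; the real logic is in the C headers. The intermediate `"filling"` state is an assumption added for the example, while `COOKING_TIME` and `MAX_INGREDIENTS` use the values from Game Constants.

```python
COOKING_TIME = 20     # steps, per Game Constants
MAX_INGREDIENTS = 3   # onions per pot


class Pot:
    """Hypothetical pot model: empty -> filling -> full -> cooking -> ready."""

    def __init__(self):
        self.onions = 0
        self.state = "empty"
        self.time_left = 0

    def add_onion(self):
        """Add one onion; returns False if the pot cannot accept it."""
        if self.state not in ("empty", "filling"):
            return False
        self.onions += 1
        self.state = "full" if self.onions == MAX_INGREDIENTS else "filling"
        return True

    def start(self):
        """Begin cooking; only valid once the pot holds 3 onions."""
        if self.state != "full":
            return False
        self.state, self.time_left = "cooking", COOKING_TIME
        return True

    def tick(self):
        """Advance one environment step."""
        if self.state == "cooking":
            self.time_left -= 1
            if self.time_left == 0:
                self.state = "ready"

    def plate(self):
        """Take the cooked soup out, leaving an empty pot."""
        if self.state != "ready":
            return False
        self.__init__()
        return True
```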

## Configuration

*See [Overcooked class](overcooked.py#L14)*

```python
env = Overcooked(
    num_envs=1,                # Number of parallel environments
    layout="cramped_room",     # Layout name (see Available Layouts)
    num_agents=2,              # Agents per environment
    render_mode=None,          # Set to enable rendering
    log_interval=128,          # Steps between log aggregation
    grid_size=32,              # Render tile size in pixels

    # Reward configuration (from config/ocean/overcooked.ini)
    reward_dish_served_whole_team=1.0,  # Shared reward for correct dish
    reward_dish_served_agent=0.0,       # Bonus for serving agent
    reward_pot_started=0.15,            # Starting correct recipe
    reward_ingredient_added=0.15,       # Adding onion to pot
    reward_ingredient_picked=0.05,      # Picking up ingredient
    reward_plate_picked=0.05,           # Picking up plate
    reward_soup_plated=0.20,            # Plating cooked soup
    reward_wrong_dish_served=0.0,       # Serving incorrect dish
    reward_step_penalty=0.0,            # Per-step penalty
)
```

## Game Constants

- **Cooking time**: 20 steps — [COOKING_TIME](overcooked_types.h#L39)
- **Max ingredients per pot**: 3 — [MAX_INGREDIENTS](overcooked_types.h#L40)
- **Max episode steps**: 400 (default)
- **Max dynamic items**: 20 — [overcooked.h:19](overcooked.h#L19)

## Available Layouts

*See [LAYOUTS](overcooked_types.h#L244-L259)*

### cramped_room (5x5)

```
+---+---+---+---+---+
| W | C | P | C | W | W = Wall
+---+---+---+---+---+ C = Counter
| I | | | | I | P = Pot (Stove)
+---+---+---+---+---+ I = Ingredient Box (Onions)
| C | | | | C | D = Dish/Plate Box
+---+---+---+---+---+ S = Serving Area
| C | | | | C |
+---+---+---+---+---+
| W | D | C | S | W |
+---+---+---+---+---+
```
Spawns: (1,2) and (3,2)
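
The diagram and spawn coordinates can be cross-checked with a small sketch. The string encoding below is a hypothetical transcription of the diagram (one character per tile, space for floor), not the format used by `LAYOUTS` in `overcooked_types.h`. Coordinates are read as (x, y) with x as the column and y as the row, counted from the top-left.

```python
# Hypothetical encoding of the cramped_room diagram above.
CRAMPED_ROOM = [
    "WCPCW",
    "I   I",
    "C   C",
    "C   C",
    "WDCSW",
]
SPAWNS = [(1, 2), (3, 2)]


def find_tiles(grid, tile):
    """Return (x, y) positions of every cell containing `tile`."""
    return [
        (x, y)
        for y, row in enumerate(grid)
        for x, ch in enumerate(row)
        if ch == tile
    ]


def spawns_are_walkable(grid, spawns):
    """True if every spawn position lands on a floor (space) cell."""
    return all(grid[y][x] == " " for x, y in spawns)
```

For this layout, `find_tiles(CRAMPED_ROOM, "P")` gives the single pot at (2, 0), and both spawn positions fall on floor cells.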

### asymmetric_advantages (9x5)

```
+---+---+---+---+---+---+---+---+---+
| W | C | W | W | W | W | W | C | W |
+---+---+---+---+---+---+---+---+---+
| I | | C | S | W | I | C | | S |
+---+---+---+---+---+---+---+---+---+
| C | | | | P | | | | C |
+---+---+---+---+---+---+---+---+---+
| C | | | | P | | | | C |
+---+---+---+---+---+---+---+---+---+
| W | C | C | D | W | D | C | C | W |
+---+---+---+---+---+---+---+---+---+
```
Spawns: (1,2) and (7,2)

### forced_coordination (5x5)

```
+---+---+---+---+---+
| W | C | W | P | W | W = Wall
+---+---+---+---+---+ C = Counter
| I | | C | | P | P = Pot (Stove)
+---+---+---+---+---+ I = Ingredient Box (Onions)
| I | | C | | C | D = Dish/Plate Box
+---+---+---+---+---+ S = Serving Area
| D | | C | | C |
+---+---+---+---+---+
| W | C | W | S | W |
+---+---+---+---+---+
```
Spawns: (1,2) and (3,2)

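A challenging layout in which a center column of counters (with walls at the top and bottom) divides the kitchen. As drawn, neither agent can cross to the other side: the left side holds the ingredient and plate boxes, the right side holds the pots and serving area, so agents must hand items across the center counters.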

### coordination_ring (5x5)

```
+---+---+---+---+---+
| W | C | C | P | W | W = Wall
+---+---+---+---+---+ C = Counter
| C | | | | P | P = Pot (Stove)
+---+---+---+---+---+ I = Ingredient Box (Onions)
| D | | C | | C | D = Dish/Plate Box
+---+---+---+---+---+ S = Serving Area
| I | | | | C |
+---+---+---+---+---+
| W | I | S | C | W |
+---+---+---+---+---+
```
Spawns: (1,2) and (3,2)

Ring-shaped layout with a center counter obstacle. Agents must navigate around the center to coordinate ingredient pickup and soup delivery.

### counter_circuit (8x5)

```
+---+---+---+---+---+---+---+---+
| W | C | C | P | P | C | C | W |
+---+---+---+---+---+---+---+---+
| C | | | | | | | C |
+---+---+---+---+---+---+---+---+
| D | | C | C | C | C | | S |
+---+---+---+---+---+---+---+---+
| C | | | | | | | C |
+---+---+---+---+---+---+---+---+
| W | C | C | I | I | C | C | W |
+---+---+---+---+---+---+---+---+
```
Spawns: (1,1) and (6,3)

Circuit-shaped layout with a center counter island. Agents must coordinate around the obstacle to efficiently transport ingredients and serve dishes. Features dual pots and dual ingredient boxes for parallel cooking.

## Logging Metrics

*See [Log struct](overcooked_types.h#L65-L78)*

| Metric | Description |
|--------|-------------|
| perf | Normalized performance (correct dishes served) |
| score | Raw score (correct dishes served) |
| episode_return | Sum of rewards over episode |
| episode_length | Number of steps in episode |
| dishes_served | Total dishes served (correct + wrong) |
| correct_dishes | Number of 3-onion dishes served |
| wrong_dishes | Number of incorrect dishes served |
| ingredients_picked | Total ingredients picked up |
| pots_started | Number of cooking sessions started |
| items_dropped | Number of items placed on counters |
| agent_collisions | Number of agent collision attempts |

## Agent Reset Mechanism

If an agent goes 512 steps without receiving a reward, it is automatically reset to its starting position with no held item. This prevents agents from getting stuck — [c_step](overcooked.h#L114-L133)
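
The rule can be sketched as a per-agent counter. The `StuckTracker` class is hypothetical; treating any nonzero reward as "receiving a reward" and resetting on exactly the 512th rewardless step are both assumptions, so check `c_step` for the precise condition.

```python
NO_REWARD_RESET_STEPS = 512


class StuckTracker:
    """Hypothetical per-agent tracker for the no-reward reset rule."""

    def __init__(self, spawn):
        self.spawn = spawn
        self.steps_without_reward = 0

    def update(self, pos, held_item, reward):
        """Return the agent's (pos, held_item) after applying the rule."""
        if reward != 0.0:            # assumption: any nonzero reward counts
            self.steps_without_reward = 0
            return pos, held_item
        self.steps_without_reward += 1
        if self.steps_without_reward >= NO_REWARD_RESET_STEPS:
            self.steps_without_reward = 0
            return self.spawn, None  # back to spawn, hands emptied
        return pos, held_item
```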

## Building

```bash
# Build the environment
python setup.py build_overcooked --inplace

# Run standalone test
python pufferlib/ocean/overcooked/overcooked.py

# Run standalone demo with specific layout
./overcooked cramped_room
./overcooked asymmetric_advantages
./overcooked forced_coordination
./overcooked coordination_ring
./overcooked counter_circuit
```
37 changes: 37 additions & 0 deletions pufferlib/ocean/overcooked/binding.c
@@ -0,0 +1,37 @@
#include "overcooked.h"

#define Env Overcooked
#include "../env_binding.h"

static int my_init(Env* env, PyObject* args, PyObject* kwargs) {
    env->layout_id = (LayoutType)unpack(kwargs, "layout");
    env->num_agents = unpack(kwargs, "num_agents");
    env->grid_size = unpack(kwargs, "grid_size");
    env->observation_size = unpack(kwargs, "observation_size");
    env->rewards_config.dish_served_whole_team = unpack(kwargs, "reward_dish_served_whole_team");
    env->rewards_config.dish_served_agent = unpack(kwargs, "reward_dish_served_agent");
    env->rewards_config.pot_started = unpack(kwargs, "reward_pot_started");
    env->rewards_config.ingredient_added = unpack(kwargs, "reward_ingredient_added");
    env->rewards_config.ingredient_picked = unpack(kwargs, "reward_ingredient_picked");
    env->rewards_config.plate_picked = unpack(kwargs, "reward_plate_picked");
    env->rewards_config.soup_plated = unpack(kwargs, "reward_soup_plated");
    env->rewards_config.wrong_dish_served = unpack(kwargs, "reward_wrong_dish_served");
    env->rewards_config.step_penalty = unpack(kwargs, "reward_step_penalty");
    init(env);
    return 0;
}

static int my_log(PyObject* dict, Log* log) {
    assign_to_dict(dict, "perf", log->perf);
    assign_to_dict(dict, "score", log->score);
    assign_to_dict(dict, "episode_return", log->episode_return);
    assign_to_dict(dict, "episode_length", log->episode_length);
    assign_to_dict(dict, "dishes_served", log->dishes_served);
    assign_to_dict(dict, "correct_dishes", log->correct_dishes);
    assign_to_dict(dict, "wrong_dishes", log->wrong_dishes);
    assign_to_dict(dict, "ingredients_picked", log->ingredients_picked);
    assign_to_dict(dict, "pots_started", log->pots_started);
    assign_to_dict(dict, "items_dropped", log->items_dropped);
    assign_to_dict(dict, "agent_collisions", log->agent_collisions);
    return 0;
}