[Question] MiniGrid Training Problem #500

@skyler-sky

❓ Question

I installed stable-baselines3 and rl-baselines3-zoo, then installed minigrid (3.0.0) with pip install minigrid.
I ran training with python train.py --algo ppo --env MiniGrid-Empty-Random-5x5-v0 --eval-freq 10000 --eval-episodes 10 --n-eval-envs 1
The config is:

MiniGrid-Empty-Random-5x5-v0: &minigrid-defaults
  env_wrapper: minigrid.wrappers.FlatObsWrapper # See GH/1320#issuecomment-1421108191
  normalize: true
  n_envs: 8 # number of environment copies running in parallel
  n_timesteps: !!float 1e5
  policy: 'MlpPolicy'
  n_steps: 128 # batch size is n_steps * n_env
  batch_size: 64 # Number of training minibatches per update
  gae_lambda: 0.95 # Factor for trade-off of bias vs variance for Generalized Advantage Estimator
  gamma: 0.99
  n_epochs: 10 # Number of epoch when optimizing the surrogate
  ent_coef: 0.0 # Entropy coefficient for the loss calculation
  learning_rate: 2.5e-4 # The learning rate, it can be a function
  clip_range: 0.2 # Clipping parameter, it can be a function
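
For reference, the arithmetic implied by these values can be sketched as follows. This is only a rough illustration, assuming the usual stable-baselines3 PPO behaviour: each update collects n_steps * n_envs transitions, batch_size is the minibatch size (not the number of minibatches, despite the inline comment), and training stops at the first rollout boundary at or past n_timesteps. The variable names below are mine, not part of the zoo.

```python
# Values copied from the config above.
n_envs = 8
n_steps = 128
batch_size = 64
n_epochs = 10
n_timesteps = int(1e5)

# Transitions collected before each PPO update.
rollout_size = n_steps * n_envs

# Assumption: batch_size is the minibatch size, so each epoch
# splits the rollout into this many minibatches.
minibatches_per_epoch = rollout_size // batch_size

# Gradient steps taken per update across all epochs.
gradient_steps_per_update = minibatches_per_epoch * n_epochs

# Assumption: training only stops at a rollout boundary,
# so the number of updates is rounded up.
n_updates = -(-n_timesteps // rollout_size)

print(f"rollout size:              {rollout_size}")
print(f"minibatches per epoch:     {minibatches_per_epoch}")
print(f"gradient steps per update: {gradient_steps_per_update}")
print(f"policy updates in total:   {n_updates}")
```

Under these assumptions the run performs fewer than a hundred policy updates in total, which may be worth keeping in mind when reading a curve with evaluation points every 10000 steps.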

I didn't change anything else, but the training curve looks odd:

[Image: training curve]

Does anyone have thoughts on why this happens?

Labels: Maintainers on vacation, question
