Official codebase for Reinforcement Learning with Augmented Data (RAD) on the Procgen Benchmark. This codebase was originally forked from Procgen. The official codebases for DM control and OpenAI Gym are available at RAD: DM control and RAD: OpenAI Gym.
@article{laskin2020reinforcement,
title={Reinforcement learning with augmented data},
author={Laskin, Michael and Lee, Kimin and Stooke, Adam and Pinto, Lerrel and Abbeel, Pieter and Srinivas, Aravind},
journal={arXiv preprint arXiv:2004.14990},
year={2020}
}
You can get Miniconda from https://docs.conda.io/en/latest/miniconda.html if you don't have it, or install the dependencies from environment.yml manually.
git clone https://github.com/openai/train-procgen.git
conda env update --name train-procgen --file train-procgen/environment.yml
conda activate train-procgen
go to Procgen_Envs
pip install -e .
come back to Procgen
pip install https://github.com/openai/baselines/archive/9ee399f5b20cd70ac0a871927a6cf043b478193f.zip
pip install -e train-procgen
pip uninstall tensorflow
conda install -n train-procgen tensorflow-gpu=1.15 cudatoolkit=10.0
pip install torch matplotlib scikit-image
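After running the install commands, it can help to confirm that the packages are visible from the active environment. The following is a minimal sketch, not part of this codebase: `check_modules` is a hypothetical helper, and the import names in `REQUIRED_MODULES` are assumed from the install commands above (scikit-image is imported as `skimage`).

```python
import importlib.util

# Import names assumed from the install commands above; note that the
# scikit-image package is imported as "skimage".
REQUIRED_MODULES = ["tensorflow", "torch", "matplotlib", "skimage", "procgen", "baselines"]


def check_modules(names):
    """Return a dict mapping each top-level module name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_modules(REQUIRED_MODULES).items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

This only checks that each module can be located; it does not verify versions or GPU support.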
For random crop, change NeurIPS2020_Procgen_Envs/procgen/src/game.h.
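For readers unfamiliar with the augmentation, random crop can be sketched in NumPy as below. This is an illustrative sketch under stated assumptions, not the implementation used in this repository: the function name `random_crop`, the batch layout (batch, height, width, channels), and the 75 to 64 pixel sizes in the usage example are all hypothetical choices.

```python
import numpy as np


def random_crop(images, out_size, rng=None):
    """Crop each image in a batch at an independently sampled location.

    images: array of shape (batch, height, width, channels).
    out_size: side length of the square crop; must not exceed height or width.
    """
    rng = np.random.default_rng() if rng is None else rng
    batch, height, width, _ = images.shape
    if out_size > height or out_size > width:
        raise ValueError("out_size must not exceed the image height or width")
    # Sample one top-left corner per image.
    tops = rng.integers(0, height - out_size + 1, size=batch)
    lefts = rng.integers(0, width - out_size + 1, size=batch)
    crops = [
        img[t:t + out_size, l:l + out_size]
        for img, t, l in zip(images, tops, lefts)
    ]
    return np.stack(crops)


if __name__ == "__main__":
    # Hypothetical sizes chosen only for illustration.
    obs = np.random.default_rng(0).integers(0, 256, size=(8, 75, 75, 3), dtype=np.uint8)
    print(random_crop(obs, 64).shape)  # (8, 64, 64, 3)
```

Each image gets its own crop offset, so observations in the same batch are shifted differently.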
Pixel PPO on the environment StarPilot:
./scripts/train_normal.sh starpilot
PPO + RAD (crop) on the environment StarPilot:
./scripts/train.sh starpilot crop
PPO + RAD (flip) on the environment StarPilot:
./scripts/train.sh starpilot flip
PPO + RAD (color_jitter) on the environment StarPilot:
./scripts/train.sh starpilot color_jitter
PPO + RAD (rotate) on the environment StarPilot:
./scripts/train.sh starpilot rotate
PPO + RAD (cutout_color) on the environment StarPilot:
./scripts/train.sh starpilot cutout_color
PPO + RAD (cutout) on the environment StarPilot:
./scripts/train.sh starpilot cutout
PPO + RAD (gray) on the environment StarPilot:
./scripts/train.sh starpilot gray
PPO + RAD (random conv) on the environment StarPilot:
./scripts/train_random.sh starpilot
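To run every configuration listed above in sequence, a small launcher along the following lines could be used. This is a sketch, not a script shipped with this codebase: `build_commands` and `run_all` are hypothetical helpers, the script paths and augmentation names are copied from the commands above, and the launcher is assumed to be started from the directory that contains `scripts/`. By default it only prints the commands.

```python
import subprocess

# Augmentation names taken from the commands listed above.
AUGMENTATIONS = ["crop", "flip", "color_jitter", "rotate", "cutout_color", "cutout", "gray"]


def build_commands(env_name):
    """Return the argument lists for every training run listed above."""
    commands = [["./scripts/train_normal.sh", env_name]]  # Pixel PPO baseline
    commands += [["./scripts/train.sh", env_name, aug] for aug in AUGMENTATIONS]
    commands.append(["./scripts/train_random.sh", env_name])  # random conv
    return commands


def run_all(env_name, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_commands(env_name):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all("starpilot")
```

Calling `run_all("starpilot", dry_run=False)` would execute the runs one after another and stop at the first script that exits with an error.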