
Side effects penalties

This is the code for the paper Penalizing side effects using stepwise relative reachability by Krakovna et al. (2019). It implements a tabular Q-learning agent with different side effects penalties. Each penalty consists of a deviation measure (none, unreachability, relative reachability, or attainable utility) and a baseline (starting state, inaction, or stepwise inaction).
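
The penalized reward used by the agent can be sketched as follows (an illustration of the scheme in the paper, not a line from the code):

reward_t = env_reward_t - beta * deviation(current_state_t, baseline_state_t)

where deviation is the chosen deviation measure, baseline_state_t comes from the chosen baseline, and beta (the -beta flag below) trades off the penalty against the task reward.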

Instructions

Clone the repository:

git clone https://github.com/deepmind/deepmind-research.git
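
The code lives in the side_effects_penalties subdirectory of deepmind-research. Run the commands below from the repository root so that the package is importable:

cd deepmind-research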

Running an agent with a side effects penalty

Run the agent with a given penalty on an AI Safety Gridworlds environment (a concrete example follows the argument lists below):

python -m side_effects_penalties.run_experiment -baseline <X> -dev_measure <Y> -env_name <Z> -suffix <S>

The following parameters can be specified for the side effects penalty:

  • Baseline state (-baseline): starting state (start), inaction (inaction), stepwise inaction with rollouts (stepwise), stepwise inaction without rollouts (step_noroll)
  • Deviation measure (-dev_measure): none (none), unreachability (reach), relative reachability (rel_reach), attainable utility (att_util)
  • Discount factor for the deviation measure value function (-value_discount)
  • Summary function to apply to the relative reachability or attainable utility deviation measure (-dev_fun): max(0, x) (truncation) or |x| (absolute)
  • Weight for the side effects penalty relative to the reward (-beta)

Other arguments:

  • AI Safety Gridworlds environment name (-env_name)
  • Number of episodes (-num_episodes)
  • Filename suffix for saving result files (-suffix)
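
For example, the following trains an agent with the stepwise inaction baseline and the relative reachability deviation measure. The environment name box and the suffix test are illustrative; substitute any environment name accepted by run_experiment:

python -m side_effects_penalties.run_experiment -baseline stepwise -dev_measure rel_reach -env_name box -num_episodes 10000 -suffix test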

Plotting the results

Make a summary data frame from the result files generated by run_experiment:

python -m side_effects_penalties.results_summary -compare_penalties -input_suffix <S>

Arguments:

  • Plot type (-bar_plot): make a data frame for a bar plot (True) or a learning curve plot (False)
  • Comparison mode (-compare_penalties): compare different penalties using the best beta value for each penalty (True), or compare different beta values for a given penalty (False)
  • If -compare_penalties is False, also specify the penalty parameters (-dev_measure, -dev_fun, and -value_discount)
  • Environment name (-env_name)
  • Filename suffix for loading result files (-input_suffix)
  • Filename suffix for saving the summary data frame (-output_suffix)
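
For example, to build a bar plot data frame comparing penalties, using result files saved with the suffix test from the training example above:

python -m side_effects_penalties.results_summary -compare_penalties -bar_plot -input_suffix test -output_suffix test_summary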

Import the summary data frame into plot_results.ipynb and make a bar plot or learning curve plot.
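
If you prefer to plot outside the notebook, a minimal sketch in Python is below; the filename and column names are illustrative assumptions, not the actual output format of results_summary:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Load the summary data frame (hypothetical filename; use the file
# written by results_summary with your -output_suffix).
df = pd.read_csv('summary_test_summary.csv')

# Bar plot of performance per penalty (hypothetical column names).
sns.barplot(data=df, x='baseline', y='performance', hue='dev_measure')
plt.tight_layout()
plt.show()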

Dependencies

  • Python 2.7 or 3 (tested with Python 2.7.15 and 3.6.7)
  • AI Safety Gridworlds suite of safety environments
  • Abseil Python common libraries
  • NumPy
  • pandas
  • Six
  • Matplotlib
  • Seaborn
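
The Python package dependencies can be installed with pip. Installing the AI Safety Gridworlds suite directly from its GitHub repository is an assumption; if that fails, clone the repository and add it to your PYTHONPATH instead:

pip install absl-py numpy pandas six matplotlib seaborn
pip install git+https://github.com/deepmind/ai-safety-gridworlds.git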

Citing this work

If you use this code in your work, please cite the accompanying paper:

@article{srr2019,
  title = {Penalizing Side Effects using Stepwise Relative Reachability},
  author = {Victoria Krakovna and Laurent Orseau and Ramana Kumar and Miljan Martic and Shane Legg},
  journal = {CoRR},
  volume = {abs/1806.01186},
  year = {2019}
}