Python Machine Learning - Code Examples


## Chapter 18: Reinforcement Learning for Decision Making in Complex Environments


### Chapter Outline

- Introduction: learning from experience
  - Understanding reinforcement learning
  - Defining the agent-environment interface of a reinforcement learning system
  - The theoretical foundations of RL
    - Markov decision processes
    - The mathematical formulation of Markov decision processes
    - Visualization of a Markov process
    - Episodic versus continuing tasks
  - RL terminology: return, policy, and value function
    - The return
    - Policy
    - Value function
  - Dynamic programming using the Bellman equation
- Reinforcement learning algorithms
  - Dynamic programming
    - Policy evaluation – predicting the value function with dynamic programming
    - Improving the policy using the estimated value function
    - Policy iteration
    - Value iteration
  - Reinforcement learning with Monte Carlo
    - State-value function estimation using MC
    - Action-value function estimation using MC
    - Finding an optimal policy using MC control
    - Policy improvement – computing the greedy policy from the action-value function
  - Temporal difference learning
    - TD prediction
    - On-policy TD control (SARSA)
    - Off-policy TD control (Q-learning)
- Implementing our first RL algorithm
  - Introducing the OpenAI Gym toolkit
    - Working with the existing environments in OpenAI Gym
    - A grid world example
    - Implementing the grid world environment in OpenAI Gym
  - Solving the grid world problem with Q-learning
    - Implementing the Q-learning algorithm
- A glance at deep Q-learning
  - Training a DQN model according to the Q-learning algorithm
    - Replay memory
    - Determining the target values for computing the loss
  - Implementing a deep Q-learning algorithm
- Chapter and book summary
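Since several items in the outline above revolve around tabular Q-learning, here is a minimal, self-contained sketch of the core update rule as a preview. This is an illustrative toy only: the chain environment, hyperparameters, and variable names are assumptions made for this sketch, not the chapter's grid world implementation (which you will find in [`ch18.ipynb`](ch18.ipynb)).

    import numpy as np

    # Toy chain environment (an assumption for this sketch, *not* the
    # chapter's grid world): states 0..4, actions 0 (left) and 1 (right);
    # reaching the rightmost state yields a reward of 1 and ends the episode.
    n_states, n_actions = 5, 2
    q_table = np.zeros((n_states, n_actions))
    alpha, gamma = 0.1, 0.9   # learning rate and discount factor
    epsilon = 0.3             # exploration rate (kept high so the toy agent finds the goal quickly)
    rng = np.random.default_rng(seed=1)

    def step(state, action):
        # Deterministic transition: action 0 moves left, action 1 moves right
        next_state = max(0, state - 1) if action == 0 else min(n_states - 1, state + 1)
        reward = float(next_state == n_states - 1)
        done = next_state == n_states - 1
        return next_state, reward, done

    for episode in range(200):
        state, done = 0, False
        while not done:
            # Epsilon-greedy action selection
            if rng.random() < epsilon:
                action = int(rng.integers(n_actions))
            else:
                action = int(np.argmax(q_table[state]))
            next_state, reward, done = step(state, action)
            # Q-learning update: the TD target bootstraps from the best
            # action in the next state (terminal states keep a value of 0)
            td_target = reward + gamma * np.max(q_table[next_state])
            q_table[state, action] += alpha * (td_target - q_table[state, action])
            state = next_state

    # Greedy policy: states 0-3 should prefer action 1 (right);
    # state 4 is terminal and is never updated
    print(np.argmax(q_table, axis=1))

The `max` over the next state's action values is what makes this update off-policy (Q-learning); on-policy TD control (SARSA) would instead use the value of the action actually selected in the next state.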
### A note on using the code examples

The recommended way to interact with the code examples in this book is via Jupyter Notebook (the `.ipynb` files). Using Jupyter Notebook, you will be able to execute the code step by step and have all the resulting outputs (including plots and images) in one convenient document.

![](../ch02/images/jupyter-example-1.png)


Setting up Jupyter Notebook is really easy: if you are using the Anaconda Python distribution, all you need to do to install Jupyter Notebook is to execute the following command in your terminal:

    conda install jupyter notebook

Then you can launch Jupyter Notebook by executing

    jupyter notebook

A window will open up in your browser, which you can then use to navigate to the target directory that contains the `.ipynb` file you wish to open.

**More installation and setup instructions can be found in the [README.md file of Chapter 1](../ch01/README.md).**

**(Even if you decide not to install Jupyter Notebook, note that you can also view the notebook files on GitHub by simply clicking on them: [`ch18.ipynb`](ch18.ipynb))**

In addition to the code examples, I added a table of contents to each Jupyter notebook, as well as section headers that are consistent with the content of the book. I also included the original images and figures in the hope that they make it easier to navigate and work with the code interactively as you are reading the book.

![](../ch02/images/jupyter-example-2.png)


When I was creating these notebooks, I was hoping to make your reading (and coding) experience as convenient as possible! However, if you don't wish to use Jupyter Notebook, I have also converted these notebooks to regular Python script files (`.py` files) that can be viewed and edited in any plain text editor.
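If you prefer the script version, you can run it like any other Python script from this chapter's directory; assuming the converted script mirrors the notebook's name (e.g., `ch18.py`), that would be:

    python ch18.py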