Added ch18/readme (rasbt#97)

* Added ch18/readme
* Fixed the notebook links
dinesh1988 · Dec 1, 2019 · 5ec6b5c
1 parent 0b64eff
commit 5ec6b5c
Showing 1 changed file with 78 additions and 0 deletions.
diff --git a/ch18/README.md b/ch18/README.md
@@ -0,0 +1,78 @@
+Python Machine Learning - Code Examples
+
+
+## Chapter 18: Reinforcement Learning for Decision Making in Complex Environments
+
+
+### Chapter Outline
+
+- Introduction: learning from experience
+  - Understanding reinforcement learning
+  - Defining the agent-environment interface of a reinforcement learning system
+  - The theoretical foundations of RL
+    - Markov decision processes
+    - The mathematical formulation of Markov decision processes
+    - Visualization of a Markov process
+    - Episodic versus continuing tasks
+  - RL terminology: return, policy, and value function
+    - The return
+    - Policy
+    - Value function
+  - Dynamic programming using the Bellman equation
+- Reinforcement learning algorithms
+  - Dynamic programming
+    - Policy evaluation – predicting the value function with dynamic programming
+    - Improving the policy using the estimated value function
+    - Policy iteration
+    - Value iteration
+  - Reinforcement learning with Monte Carlo
+    - State-value function estimation using MC
+    - Action-value function estimation using MC
+    - Finding an optimal policy using MC control
+    - Policy improvement – computing the greedy policy from the action-value function
+  - Temporal difference learning
+    - TD prediction
+    - On-policy TD control (SARSA)
+    - Off-policy TD control (Q-learning)
+- Implementing our first RL algorithm
+  - Introducing the OpenAI Gym toolkit
+    - Working with the existing environments in OpenAI Gym
+  - A grid world example
+    - Implementing the grid world environment in OpenAI Gym
+  - Solving the grid world problem with Q-learning
+    - Implementing the Q-learning algorithm
+- A glance at deep Q-learning
+  - Training a DQN model according to the Q-learning algorithm
+    - Replay memory
+    - Determining the target values for computing the loss
+  - Implementing a deep Q-learning algorithm
+- Chapter and book summary
+
+### A note on using the code examples
+
+The recommended way to interact with the code examples in this book is via Jupyter Notebook (the `.ipynb` files). With Jupyter Notebook, you can execute the code step by step and keep all the resulting outputs (including plots and images) in one convenient document.
+
+![](../ch02/images/jupyter-example-1.png)
+
+
+
+Setting up Jupyter Notebook is straightforward: if you are using the Anaconda Python distribution, you can install it by executing the following command in your terminal:
+
+    conda install jupyter notebook
+
+Then you can launch Jupyter Notebook by executing:
+
+    jupyter notebook
+
+A window will open up in your browser, which you can then use to navigate to the target directory that contains the `.ipynb` file you wish to open.
+
+**More installation and setup instructions can be found in the [README.md file of Chapter 1](../ch01/README.md)**.
+
+**(Even if you decide not to install Jupyter Notebook, note that you can also view the notebook files on GitHub by simply clicking on them: [`ch18.ipynb`](ch18.ipynb))**
+
+In addition to the code examples, I added a table of contents to each Jupyter notebook as well as section headers that are consistent with the content of the book. I also included the original images and figures in the hope that they make it easier to navigate and work with the code interactively as you are reading the book.
+
+![](../ch02/images/jupyter-example-2.png)
+
+
+When I was creating these notebooks, I was hoping to make your reading (and coding) experience as convenient as possible! However, if you don't wish to use Jupyter Notebooks, I also converted these notebooks to regular Python script files (`.py` files) that can be viewed and edited in any plaintext editor.
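
The last paragraph of the README mentions `.py` versions of the notebooks. For readers curious what such a conversion involves, here is a minimal sketch. It is not the author's actual method; the usual tool for this is `jupyter nbconvert --to script ch18.ipynb`. The sketch assumes the standard notebook JSON layout (a top-level `cells` list whose entries carry `cell_type` and `source`) and uses only the Python standard library. The function name `notebook_to_script` is hypothetical.

```python
import json
from pathlib import Path


def notebook_to_script(ipynb_path, py_path):
    """Copy the code cells of a notebook into a plain .py file.

    Hypothetical helper for illustration only.
    Returns the number of code cells written.
    """
    notebook = json.loads(Path(ipynb_path).read_text(encoding="utf-8"))
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # The notebook format stores source either as a single string
        # or as a list of line strings; normalize to one string.
        if isinstance(source, list):
            source = "".join(source)
        chunks.append(source.rstrip("\n"))
    # Separate cells with one blank line and end the file with a newline.
    Path(py_path).write_text("\n\n".join(chunks) + "\n", encoding="utf-8")
    return len(chunks)
```

Calling `notebook_to_script("ch18.ipynb", "ch18_sketch.py")` (the output filename is made up here) would write the code cells in order. Unlike `nbconvert`, this sketch copies IPython magics such as `%matplotlib inline` verbatim, so those lines would need to be removed by hand before running the result as a regular script.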