Chapter 13
Arash Haratian
4/23/2021
Short corridor with switched actions
source("./short_corridor.R")
Value of the first state as a function of the probability of choosing the right action
example13.1()
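As a rough cross-check on that curve, the start-state value can also be written in closed form. The sketch below assumes the usual setup for this example: three non-terminal states, a reward of -1 per step, no discounting, and reversed action effects in the second state. Solving the three Bellman equations under those assumptions gives v(p) = (2p - 4) / (p(1 - p)). The names `start_state_value` and `p_grid` are hypothetical and are not taken from `short_corridor.R`.

```r
# Closed-form value of the start state when "right" is chosen with
# probability p in every state (assumes reward -1 per step, no discounting).
start_state_value <- function(p) {
  (2 * p - 4) / (p * (1 - p))
}

# Evaluate on a grid that avoids the singular endpoints p = 0 and p = 1.
p_grid <- seq(0.01, 0.99, by = 0.01)
values <- start_state_value(p_grid)

# Locate the best stochastic policy numerically.
best <- optimize(start_state_value, interval = c(0.01, 0.99), maximum = TRUE)
# best$maximum is close to 2 - sqrt(2) (about 0.59);
# best$objective is about -11.66.

plot(p_grid, values, type = "l", ylim = c(-100, 0),
     xlab = "probability of right action",
     ylab = "value of the first state")
abline(v = best$maximum, lty = 2)
```

The maximum lies strictly between 0 and 1, which is why a deterministic policy cannot be optimal in this task.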
Figure 13.1: REINFORCE
plot_fig13.1()
Figure 13.2: Adding a baseline to REINFORCE can make it learn much faster
plot_fig13.2()
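To make the comparison behind the two figures concrete, here is a minimal sketch of REINFORCE on the short corridor with an optional baseline. It is not the code in `short_corridor.R`. The function names, the initial `theta`, the step sizes, and the probability clipping `eps` (added only so that episodes stay reasonably short) are all assumptions made for illustration. The baseline is a single scalar `w`, since all states look alike to the agent.

```r
# One environment step. States 0, 1, 2 are non-terminal and 3 is terminal.
# In state 1 the effect of the two actions is switched.
corridor_step <- function(state, go_right) {
  if (state == 1) go_right <- !go_right
  if (go_right) state + 1 else max(0, state - 1)
}

# Softmax probability of "right" for preferences theta = c(right, left),
# clipped to [eps, 1 - eps] (an assumption, to bound episode length).
right_prob <- function(theta, eps) {
  p <- 1 / (1 + exp(theta[2] - theta[1]))
  min(max(p, eps), 1 - eps)
}

# REINFORCE with a scalar baseline w; alpha_w = 0 keeps w at 0,
# which reduces the update to plain REINFORCE.
run_reinforce <- function(n_episodes, alpha_theta, alpha_w = 0,
                          theta = c(-1.5, 1.5), eps = 0.05) {
  w <- 0
  total_rewards <- numeric(n_episodes)
  for (ep in seq_len(n_episodes)) {
    # Generate one episode under the current policy.
    state <- 0
    actions <- logical(0)
    while (state != 3) {
      go_right <- runif(1) < right_prob(theta, eps)
      actions <- c(actions, go_right)
      state <- corridor_step(state, go_right)
    }
    n_steps <- length(actions)
    total_rewards[ep] <- -n_steps
    # Monte Carlo update for every step of the episode (gamma = 1).
    for (i in seq_len(n_steps)) {
      G <- -(n_steps - i + 1)        # return from step i onward
      delta <- G - w
      w <- w + alpha_w * delta
      # Gradient of log softmax, using the clipped probability
      # (an approximation whenever the clipping is active).
      p <- right_prob(theta, eps)
      grad <- if (actions[i]) c(1 - p, p - 1) else c(-p, p)
      theta <- theta + alpha_theta * delta * grad
    }
  }
  total_rewards
}

set.seed(1)
plain <- run_reinforce(200, alpha_theta = 2^-13)
with_baseline <- run_reinforce(200, alpha_theta = 2^-9, alpha_w = 2^-6)
c(plain = mean(plain), baseline = mean(with_baseline))
```

A single short run like this is noisy. Averaging many independent runs is what makes any difference in learning speed visible.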