An implementation of the Q-learning algorithm from reinforcement learning.
- Python 3
- TensorFlow 1.0.1
- pygame
- gym
git clone https://github.com/lufficc/dqn.git
cd dqn
python run.py
Each game frame is resized and converted to a binary image.
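As a rough illustration of this preprocessing step, here is a minimal, dependency-free sketch. It is not the code from this repository: the function name `preprocess`, the 80x80 output size, the threshold of 1.0, and the nearest-neighbour resize are all assumptions made for the example. A real implementation would more likely use NumPy or an image library.

```python
def preprocess(frame, out_h=80, out_w=80, threshold=1.0):
    """Downscale an RGB frame and binarize it.

    frame: nested list indexed as frame[row][col] -> (r, g, b), values 0-255.
    Returns a nested list of shape out_h x out_w containing only 0 and 1.
    The output size and threshold are hypothetical defaults, not values
    taken from the repository.
    """
    in_h, in_w = len(frame), len(frame[0])
    binary = []
    for i in range(out_h):
        # Nearest-neighbour resize: pick the source row that maps to row i.
        src_row = frame[i * in_h // out_h]
        out_row = []
        for j in range(out_w):
            r, g, b = src_row[j * in_w // out_w]
            # Standard luminance weights for RGB -> grayscale.
            gray = 0.299 * r + 0.587 * g + 0.114 * b
            # Anything brighter than the threshold becomes 1, the rest 0.
            out_row.append(1 if gray > threshold else 0)
        binary.append(out_row)
    return binary


# Usage: a black 512x288 frame with one white rectangle.
frame = [[(0, 0, 0)] * 288 for _ in range(512)]
for y in range(100, 200):
    for x in range(50, 150):
        frame[y][x] = (255, 255, 255)
small = preprocess(frame)
```

Binarizing discards colour and texture, so the network only sees the shapes of the bird and the pipes, which keeps the input small.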
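Exploration uses a decayed ε-greedy policy. When exploring, the agent does nothing with probability 0.95 and takes the other action (flap) with probability 0.05, because in Flappy Bird the right move most of the time is to do nothing. This bias is very important: it lets the model converge in less than 2 hours.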
    def egreedy_action(self, state):
        # Exploration
        if random.random() <= self.epsilon:
            if random.random() < 0.95:
                action_index = 0
            else:
                action_index = 1
            # action_index = random.randint(0, self.num_actions - 1)
        else:
            # Exploitation
            action_index = self.action(state)
        if self.epsilon > self.final_epsilon:
            self.epsilon *= self.decay_factor
        return action_index
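To get a feel for how long the exploration phase lasts, the multiplicative decay above can be replayed on its own. This is only a sketch: the helper names `epsilon_after` and `steps_to_floor` are made up for this example, and the numbers in the usage lines are illustrative, not the repository's settings. It assumes `0 < decay_factor < 1` and `initial_epsilon > final_epsilon > 0`.

```python
import math


def epsilon_after(steps, initial_epsilon, decay_factor, final_epsilon):
    """Replay the decay rule from egreedy_action for `steps` calls."""
    epsilon = initial_epsilon
    for _ in range(steps):
        # Same rule as above: shrink epsilon only while it is above the floor.
        if epsilon > final_epsilon:
            epsilon *= decay_factor
    return epsilon


def steps_to_floor(initial_epsilon, decay_factor, final_epsilon):
    """Smallest n with initial_epsilon * decay_factor**n <= final_epsilon."""
    ratio = final_epsilon / initial_epsilon
    return math.ceil(math.log(ratio) / math.log(decay_factor))


# Usage with illustrative (hypothetical) values.
n = steps_to_floor(1.0, 0.5, 0.1)
eps_at_floor = epsilon_after(n, 1.0, 0.5, 0.1)
eps_later = epsilon_after(n + 6, 1.0, 0.5, 0.1)
```

Because the check runs before the multiplication, the last decay step can leave ε slightly below `final_epsilon`; after that it stays constant. During exploration the agent flaps with probability 0.05, so the chance of a random flap on any given step is roughly `0.05 * epsilon`.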