d214: Q-learning | AI:Mechanic

Q-learning is a model-free reinforcement learning technique: the agent learns from experience alone, without a model of how the environment behaves. It can be used to find an optimal action-selection policy for any given finite Markov decision process (MDP).

Algorithm:

Agent senses its environment, using this information to determine its current state

Agent takes an action and obtains a penalty or reward

Agent senses its environment again – to see what effect its chosen action had

Agent learns from its experience (and so makes ‘better’ decisions next time)

Source: How does Q-learning work?
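
The loop above can be sketched in a few lines of tabular Q-learning. This is only a minimal illustration, not the code from the linked tutorial: the corridor environment, the reward values, and the names `step`, `train` and `greedy_policy` are all assumptions made up for this example, as are the learning rate, discount factor and exploration rate. The learning step uses the standard update Q(s, a) += alpha * (reward + gamma * max Q(s', a') - Q(s, a)).

```python
import random

# Hypothetical toy environment (an assumption for this sketch):
# a corridor of 5 cells, the agent starts in cell 0, the goal is cell 4.
N_STATES = 5
ACTIONS = (-1, +1)  # move left, move right
GOAL = N_STATES - 1


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), GOAL)
    if next_state == GOAL:
        return next_state, 1.0, True   # reward for reaching the goal
    return next_state, -0.01, False    # small penalty for every other move


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Run tabular Q-learning and return the Q-table (one row per state)."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0  # sense: determine the current state
        done = False
        while not done:
            # act: epsilon-greedy choice, ties between best actions broken at random
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                best = max(q[state])
                a = rng.choice([i for i, v in enumerate(q[state]) if v == best])
            # sense again: observe the reward and the new state
            next_state, reward, done = step(state, ACTIONS[a])
            # learn: move Q(s, a) toward reward + gamma * max_a' Q(s', a')
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Return the highest-valued action for each state."""
    return [ACTIONS[row.index(max(row))] for row in q]


if __name__ == "__main__":
    q_table = train()
    for s, row in enumerate(q_table):
        print(s, [round(v, 3) for v in row])
    print("greedy policy:", greedy_policy(q_table))
```

In this toy setup the greedy policy for every non-goal cell ends up being "move right". The goal cell's row is never updated, because episodes end as soon as the agent arrives there.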

Implementation:

Python: http://mnemstudio.org/path-finding-q-learning-tutorial.htm [Raw]

Links:

Awesome Reinforcement Learning
Reinforcement Learning
Monte Carlo Methods
Q-learning with Neural Networks
