Teaching AI to Plan Ahead

Reinforcement learning agents take actions and receive rewards for them. If these rewards are set up well, they teach an agent what to do and what not to do (much like training a dog). In complex tasks, however, it can be difficult for the agent to figure out in the moment which actions led to which rewards.
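
To make that credit-assignment problem concrete, here is a minimal sketch (the toy environment and hyperparameters are my own illustration, not anything from the paper): a chain where reward arrives only at the far end, so early actions get no immediate feedback.

```python
import random

# Toy chain: reward arrives only at the far end, so early actions get
# no immediate feedback -- the credit-assignment problem in miniature.
CHAIN_LENGTH = 5
ACTIONS = [0, 1]          # 0 = step left, 1 = step right

def env_step(state, action):
    """Return (next_state, reward, done); reward only at the goal."""
    next_state = min(state + 1, CHAIN_LENGTH) if action == 1 else max(state - 1, 0)
    done = next_state == CHAIN_LENGTH
    return next_state, (1.0 if done else 0.0), done

# Tabular Q-learning: values for early states improve only after the
# final reward has propagated backward over many episodes.
Q = {(s, a): 0.0 for s in range(CHAIN_LENGTH + 1) for a in ACTIONS}
alpha, gamma, epsilon = 0.1, 0.99, 0.2

def pick_action(state):
    # Epsilon-greedy with random tie-breaking.
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    best = max(Q[(state, a)] for a in ACTIONS)
    return random.choice([a for a in ACTIONS if Q[(state, a)] == best])

for episode in range(2000):
    state, done = 0, False
    for _ in range(100):                      # cap episode length
        action = pick_action(state)
        next_state, reward, done = env_step(state, action)
        target = reward + gamma * max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += alpha * (target - Q[(state, action)])
        state = next_state
        if done:
            break
```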

This paper I wrote proposes a way for agents to plan several steps ahead and distinguish good sequences of actions from bad ones. Agents that plan multiple steps into the future demonstrably perform better across a variety of environments and tasks.
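
The paper's actual method isn't reproduced here, but the general idea of scoring whole action sequences rather than single actions can be sketched roughly as follows (`model` and `reward_fn` are hypothetical stand-ins for whatever dynamics and reward estimates the agent has):

```python
import itertools

def plan(state, model, reward_fn, actions, horizon=3, gamma=0.99):
    """Score every action sequence of length `horizon` by simulating it
    forward with a learned model, and return the best first action."""
    best_return, best_first_action = float("-inf"), None
    for seq in itertools.product(actions, repeat=horizon):
        s, total = state, 0.0
        for t, a in enumerate(seq):
            total += (gamma ** t) * reward_fn(s, a)  # discounted reward
            s = model(s, a)                          # predicted next state
        if total > best_return:
            best_return, best_first_action = total, seq[0]
    return best_first_action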
