Reinforcement learning, in particular, is based on the idea of learning through exploration, in other words: trial and error. However, trying out different options in an environment without any restrictions can be inherently risky. The agent might try behaviors that lead to catastrophic outcomes from which recovery or further learning is impossible. While this is not necessarily a problem in simulated environments, it becomes a challenging issue if we want these systems to someday work well in the real world. For example, a factory robot cannot simply try out actions at random; it has to ensure that the options it tries do not pose any danger to humans working alongside it.
The goal of this project is to develop a system that provides agents with an “instinct”, developed over a much longer evolutionary timescale, that guides them away from unsafe exploration, facilitating lifelong learning.
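To make the idea concrete, here is a minimal sketch of instinct-gated exploration. All names and the toy environment are hypothetical, not the project's actual design: the "instinct" is modeled as a simple veto function that blocks any action leading into a hazard state, while the agent otherwise explores by trial and error.

```python
import random

# Toy 1-D corridor: position 0 is a hazardous state the agent must
# never enter (an irrecoverable, catastrophic outcome).
HAZARD = 0

def instinct_allows(position, action):
    """Hypothetical instinct: veto any action that would land on the hazard.

    In the proposed system this check would be learned over an
    evolutionary timescale rather than hand-coded as it is here.
    """
    return position + action != HAZARD

def explore(start, steps, rng):
    """Random trial-and-error exploration, filtered by the instinct."""
    position = start
    for _ in range(steps):
        action = rng.choice([-1, 1])           # naive exploratory move
        if instinct_allows(position, action):  # instinct vetoes unsafe moves
            position += action                 # only safe moves are executed
    return position

final = explore(start=3, steps=100, rng=random.Random(0))
assert final != HAZARD  # the instinct kept the agent out of the hazard state
```

Because the veto is applied before an action is executed, the agent can keep exploring and learning indefinitely without ever entering the irrecoverable state, which is the essence of the safe life-long learning the project aims for.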