Open sourcing TRFL: a library of reinforcement learning building blocks

Today we are open sourcing a new library of useful building blocks for writing reinforcement learning (RL) agents in TensorFlow. Named TRFL (pronounced truffle), it represents a collection of key algorithmic components that we have used internally for a large number of our most successful agents such as DQN, DDPG and the Importance Weighted Actor Learner Architecture.

A typical deep reinforcement learning agent consists of a large number of interacting components: at the very least, these include the environment and some deep network representing values or policies, but they often also include components such as a learned model of the environment, pseudo-reward functions or a replay system. These parts tend to interact in subtle ways (often not well-documented in papers, as highlighted by Henderson and colleagues), thus making it difficult to identify bugs in such large computational graphs. A recent blog post by OpenAI highlighted this issue by analysing some of the most popular open-source implementations of reinforcement learning agents and finding that six out of 10 had subtle bugs found by a community member and confirmed by the author.

One approach to addressing this issue, and helping those in the research community attempting to reproduce results from papers, is through open-sourcing complete agent implementations.