A collection of assorted terms related to reinforcement learning.

Optimism in the Face of Uncertainty

Optimism in the face of uncertainty (OFU) chooses actions as if the environment were as favorable as possible, given what the agent has observed so far. This works because, if the optimism is justified, the agent is already acting optimally, and, if it is not justified, the payoffs the agent observes contradict its optimistic estimate, so it learns a better approximation of the true payoff as it gains more experience.
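
To make the idea concrete, here is a minimal sketch using UCB1 on a multi-armed bandit, which is one common instance of OFU (the note above does not name a specific algorithm, so this choice is an assumption). Each arm is scored by its empirical mean plus a confidence bonus that shrinks as the arm is pulled more often, and the agent always pulls the arm with the highest optimistic score. The function name `ucb1`, the Bernoulli rewards, and the arm means are hypothetical and chosen only for illustration.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Run UCB1 on Bernoulli arms; return how often each arm was pulled."""
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms      # number of pulls per arm
    totals = [0.0] * n_arms    # summed reward per arm

    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Pull each arm once so every empirical mean is defined.
            arm = t - 1
        else:
            # Optimistic score: empirical mean plus a confidence bonus
            # that shrinks as the arm accumulates pulls.
            arm = max(
                range(n_arms),
                key=lambda a: totals[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        # Hypothetical environment: reward 1 with probability arm_means[arm].
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        totals[arm] += reward
    return counts


counts = ucb1([0.2, 0.5, 0.8], horizon=5000)
print(counts)
```

An arm that looks good only because it has rarely been tried gets pulled, its bonus shrinks, and its score falls toward its true mean. Under these assumptions, most pulls should end up on the arm with the highest true mean.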