A class of reinforcement learning algorithms that make use of a learned or pre-defined model of the environment to optimize the policy or value function.
A class of reinforcement learning algorithms that make use of a learned or pre-defined model of the environment to optimize the policy or value function.