Reinforcement Learning, Chapter 3

- An Example: Recycling Robot
- Goals and Rewards
- Returns and Episodes
- Policies and Value Functions
- Optimal Policies and Optimal Value Functions