Tomáš Kocák
Abstract:
Reinforcement Learning (RL), like Multi-Armed Bandits, is a popular paradigm for sequential decision-making under uncertainty. A typical RL algorithm operates with only limited knowledge of a changing environment and with limited feedback on the quality of its decisions. This talk serves as an introduction to RL. We explore common approaches to formalizing the problem, discuss possible solutions, and point out important theoretical results.