Lecture Date: October 19, 2020 - Monday
Lecturer: Dr. Bilal Kartal
Safe reinforcement learning has many variants, and it is still an open research problem. In this talk, I describe different auxiliary tasks that improve learning and focus on using action guidance through a non-expert demonstrator to avoid catastrophic events in a domain with sparse, delayed, and deceptive rewards: the previously proposed multi-agent benchmark of Pommerman. I present a framework where a non-expert simulated demonstrator, e.g., planning algorithms such as Monte Carlo tree search with a small number of rollouts, can be integrated into asynchronous distributed deep reinforcement learning methods. Compared to vanilla deep RL algorithms, our proposed methods both learn faster and converge to better policies on a two-player mini version of the Pommerman game and Atari games.
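The action-guidance idea in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that guidance enters training as an auxiliary supervised term: a cross-entropy between the policy's action distribution and the action suggested by the demonstrator, added to the usual RL loss. The abstract itself does not specify this form, and the names `guidance_loss`, `combined_loss`, and `guidance_weight` are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw policy logits into action probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def guidance_loss(logits, demo_action):
    """Cross-entropy between the policy and the demonstrator's suggested action.

    Assumption: the demonstrator (e.g. a shallow planner) returns a single
    action index for the current state.
    """
    probs = softmax(logits)
    return -math.log(probs[demo_action])


def combined_loss(rl_loss, logits, demo_action, guidance_weight=1.0):
    """Add the weighted guidance term to an already computed RL loss.

    Assumption: rl_loss is a scalar produced elsewhere (for example by an
    actor-critic update); guidance_weight is a hypothetical hyperparameter.
    """
    return rl_loss + guidance_weight * guidance_loss(logits, demo_action)
```

Under these assumptions, the guidance term shrinks as the policy puts more probability on the demonstrator's action, so the demonstrator steers learning without having to be an expert.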
- Kartal, B., Hernandez-Leal, P., & Taylor, M. E. (2019, October). Action Guidance with MCTS for Deep Reinforcement Learning. In Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (Vol. 15, No. 1, pp. 153-159).
- Kartal, B., Hernandez-Leal, P., Gao, C., & Taylor, M. E. (2019). Safer Deep RL with Shallow MCTS: A Case Study in Pommerman. arXiv preprint arXiv:1904.05759.
- Gao, C., Kartal, B., Hernandez-Leal, P., & Taylor, M. E. (2019, October). On Hard Exploration for Reinforcement Learning: A Case Study in Pommerman. In Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (Vol. 15, No. 1, pp. 24-30).
- Sutton, R. S., & Barto, A. G. (2018). Reinforcement Learning: An Introduction. MIT Press. Chapters 1-2. A PDF version is available at http://incompleteideas.net/book/the-book.html
- García, J., & Fernández, F. (2015). A Comprehensive Survey on Safe Reinforcement Learning. Journal of Machine Learning Research, 16(1), 1437-1480.
- Browne, C. B., Powley, E., Whitehouse, D., Lucas, S. M., Cowling, P. I., Rohlfshagen, P., … & Colton, S. (2012). A Survey of Monte Carlo Tree Search Methods. IEEE Transactions on Computational Intelligence and AI in Games, 4(1), 1-43.