Wk 11. Safer Deep Reinforcement Learning

Lecture Date: October 30, 2023 - Monday
Lecturer: Dr. Bilal Kartal

Safe reinforcement learning has many variants, and it is still an open research problem. In this talk, I describe different auxiliary tasks that improve learning and focus on using action guidance through a non-expert demonstrator to avoid catastrophic events in a domain with sparse, delayed, and deceptive rewards: the previously proposed multi-agent benchmark of Pommerman. I present a framework where a non-expert simulated demonstrator, e.g., planning algorithms such as Monte Carlo tree search with a small number of rollouts, can be integrated into asynchronous distributed deep reinforcement learning methods. Compared to vanilla deep RL algorithms, our proposed methods both learn faster and converge to better policies on a two-player mini version of the Pommerman game and Atari games. I conclude the lecture with some recent examples of RL in large language models.
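One common way to realize action guidance is to add a supervised imitation term to the usual actor-critic loss, so the policy is nudged toward the action the demonstrator picked in the same state. The sketch below illustrates that idea only; it is not the lecture's implementation. The function names, the `weight` parameter, and the use of a plain cross-entropy term are assumptions made for illustration, and the demonstrator's action is passed in as an integer rather than computed by an actual MCTS planner.

```python
import math


def softmax(logits):
    """Convert raw policy logits into action probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def imitation_loss(logits, demo_action):
    """Cross-entropy between the policy and the demonstrator's chosen action.

    demo_action is the index of the action suggested by a (hypothetical)
    demonstrator, e.g. a shallow planner; here it is just an integer.
    """
    probs = softmax(logits)
    return -math.log(probs[demo_action])


def guided_loss(rl_loss, logits, demo_action, weight=1.0):
    """Combine an RL loss with the weighted imitation term.

    rl_loss stands in for whatever actor-critic loss the learner already
    computes; weight (an assumed hyperparameter) scales the guidance.
    """
    return rl_loss + weight * imitation_loss(logits, demo_action)


if __name__ == "__main__":
    logits = [0.0, 0.0, 0.0, 0.0]  # uniform policy over four actions
    print(guided_loss(rl_loss=0.5, logits=logits, demo_action=2, weight=0.5))
```

Setting `weight` to zero recovers the unguided loss, which makes it easy to compare a guided learner against a vanilla baseline under otherwise identical settings.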

Slides:

See: BlackBoard -> Weekly Papers -> Week 11

Assigned Reading (choose one for your SWA 7):

  • Zheng, C., Yang, S., Parra-Ullauri, J. M., Garcia-Dominguez, A., & Bencomo, N. (2021). Reward-reinforced generative adversarial networks for multi-agent systems. IEEE Transactions on Emerging Topics in Computational Intelligence.
  • Kartal, B., Hernandez-Leal, P., Gao, C., & Taylor, M. E. (2019). Safer deep RL with shallow MCTS: A case study in Pommerman. arXiv preprint arXiv:1904.05759.

Recommended Reading:

  • Browne, C. B., Powley, E., Whitehouse, D., Lucas, S. M., Cowling, P. I., Rohlfshagen, P., … & Colton, S. (2012). A survey of Monte Carlo tree search methods. IEEE Transactions on Computational Intelligence and AI in Games, 4(1), 1-43.
  • Gao, C., Kartal, B., Hernandez-Leal, P., & Taylor, M. E. (2019, October). On hard exploration for reinforcement learning: A case study in Pommerman. In Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (Vol. 15, No. 1, pp. 24-30).
  • García, J., & Fernández, F. (2015). A comprehensive survey on safe reinforcement learning. Journal of Machine Learning Research, 16(1), 1437-1480.
  • Kartal, B., Hernandez-Leal, P., & Taylor, M. E. (2019, October). Action guidance with MCTS for deep reinforcement learning. In Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (Vol. 15, No. 1, pp. 153-159).
  • Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction. MIT Press. Chapters 1-2. A PDF version is available at http://incompleteideas.net/book/the-book.html

Copyright © Hamdi Kavak. CSI 709/CSS 739 - Verification and Validation of Models.