Machine Learning and Data Science PhD Student Forum (Session 60): Acceleration of Reinforcement Learning Facilitated by Decomposable Structures of MDPs
Speaker: Hao Jin (PKU)
Time: 2023-10-26, 16:00-17:00
Venue: Tencent Meeting 551-1675-5419
Abstract:
In many real-life applications of reinforcement learning, the learning agent faces either a large state space or a large action space, which results in high sample complexity. Fortunately, such Markov Decision Processes (MDPs) are usually believed to have decomposable structures. Decomposability enables many distributed algorithms to solve the learning problem efficiently.
In this talk, we introduce various methods for efficiently solving large-scale MDPs. Specifically, we start with traditional techniques for decomposing large-scale MDPs and then move on to modern methods of hierarchical reinforcement learning (HRL). Although the original HRL algorithms require a central 'manager' to coordinate the 'workers' handling subtasks, there has been a recent trend toward fully distributed designs without a central manager. These distributed HRL algorithms are theoretically shown to achieve lower sample complexity under assumptions on the decomposable structure of the MDP.
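To make the 'manager'/'worker' picture concrete, below is a minimal, purely illustrative sketch of a two-level hierarchy on a toy chain MDP: a manager periodically picks a subgoal, and a goal-conditioned worker is trained with tabular Q-learning to reach it. Everything here (the ChainEnv environment, the fixed subgoal set, the horizon k, and the epsilon-greedy tabular policies) is a hypothetical choice for illustration and is not taken from the talk.

import random
from collections import defaultdict

class ChainEnv:
    """Toy chain MDP: states 0..n-1, actions {-1, +1}, reward 1 at the last state."""
    def __init__(self, n=16):
        self.n, self.state = n, 0
    def reset(self):
        self.state = 0
        return self.state
    def step(self, action):
        self.state = min(max(self.state + action, 0), self.n - 1)
        done = (self.state == self.n - 1)
        return self.state, (1.0 if done else 0.0), done

def epsilon_greedy(q, state, actions, eps=0.1):
    # q maps (state, action) pairs to values; defaultdict returns 0 for unseen pairs.
    if random.random() < eps:
        return random.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])

def run_hrl(episodes=200, k=4, alpha=0.5, gamma=0.99):
    env = ChainEnv()
    subgoals = [3, 7, 11, 15]        # the manager's "actions" are subgoals for the worker
    q_manager = defaultdict(float)   # values over (state, subgoal)
    q_worker = defaultdict(float)    # values over ((state, subgoal), primitive action)
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            g = epsilon_greedy(q_manager, s, subgoals)   # manager commits to a subgoal
            s0, ext_return = s, 0.0
            for _ in range(k):                           # worker acts for at most k steps
                a = epsilon_greedy(q_worker, (s, g), [-1, 1])
                s_next, r, done = env.step(a)
                intrinsic = 1.0 if s_next == g else 0.0  # worker is rewarded for reaching g
                best_next = max(q_worker[((s_next, g), b)] for b in [-1, 1])
                q_worker[((s, g), a)] += alpha * (intrinsic + gamma * best_next
                                                  - q_worker[((s, g), a)])
                ext_return += r
                s = s_next
                if done or s == g:
                    break
            # manager learns from the extrinsic reward collected while chasing subgoal g
            best_next = max(q_manager[(s, h)] for h in subgoals)
            q_manager[(s0, g)] += alpha * (ext_return + gamma * best_next
                                           - q_manager[(s0, g)])
    return q_manager, q_worker

if __name__ == "__main__":
    run_hrl()

A fully distributed variant, as described in the abstract, would replace the single q_manager with coordination among the workers themselves; the sketch above only shows the centralized baseline.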
About the forum: This online forum is organized by Professor Zhihua Zhang's machine learning lab and is held every two weeks (except during public holidays). Each session invites a PhD student to give a systematic and in-depth introduction to a frontier topic; topics include, but are not limited to, machine learning, high-dimensional statistics, operations research and optimization, and theoretical computer science.