Associate Professor Zhou Fan (周帆), Shanghai University of Finance and Economics: Recent Advances in Distributional Reinforcement Learning - Center of Statistical Research


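Topic: Recent Advances in Distributional Reinforcement Learning

Speaker: Associate Professor Zhou Fan (周帆), Shanghai University of Finance and Economics

Host: Professor Lin Huazhen (林华珍), School of Statistics

Time: Monday, January 22, 2024, 15:00-16:00

Venue: Conference Room 408, Hongyuan Building, Liulin Campus

Organizers: Center of Statistical Research and School of Statistics, Office of Scientific Research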

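About the speaker:

Zhou Fan is an associate professor at the School of Statistics and Management, Shanghai University of Finance and Economics. He received his PhD from the University of North Carolina at Chapel Hill. His main research interests include reinforcement learning, deep learning, and causal inference. He has published dozens of first-author or corresponding-author papers in statistics and machine learning journals such as the Journal of the American Statistical Association, the Journal of Machine Learning Research, and Biometrics, and at leading international artificial intelligence conferences such as NeurIPS, ICML, and KDD. He has received the New Researcher Award of the International Chinese Statistical Association and the Barry H. Margolin Award from UNC Chapel Hill, and was selected for the Shanghai talent program (youth category).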

Abstract:

Although distributional reinforcement learning (DRL) has been widely studied in the past few years, very few studies investigate the validity of the Q-function estimator obtained in the distributional setting. We discuss some of our work on ensuring the monotonicity of the estimated quantiles, and explain why monotonicity is theoretically necessary. Moreover, we undertake a comprehensive analysis of how approximation errors in the Q-function affect the overall training process in DRL. We analyze theoretically and demonstrate empirically techniques that reduce both the bias and the variance of these error terms, ultimately improving performance in practical applications.

