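Topic:Multiple Testing of Linear Forms for Noisy Matrix Completion
Speaker:Associate Professor Lilun DU, City University of Hong Kong
Host:Professor Huazhen LIN, School of Statistics
Time:April 7, 2025, 11:00-11:45 a.m.
Venue:Conference Room 408, Hongyuan Building, Liulin Campus
Organizers:Center of Statistical Research and School of Statistics; Office of Scientific Research
About the speaker: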
Dr. Lilun DU is an associate professor in the Department of Decision Analytics and Operations, College of Business, City University of Hong Kong (CityUHK). He received his PhD in Statistics from the University of Wisconsin-Madison in 2015. He is currently the program director of the Master of Business and Data Analytics (Quantitative Analysis for Business) at CityUHK. His main research interests include multiple hypothesis testing, empirical Bayes methods, and econometrics. Dr. DU also works in operations analytics and has developed data-driven tools for solving problems in high-volume recruitment. His work has been published in leading journals including Annals of Statistics, Journal of the American Statistical Association, Manufacturing & Service Operations Management, and Operations Research.
Abstract:
Many important tasks of large-scale recommender systems can be naturally cast as testing multiple linear forms for noisy matrix completion. These problems, however, present unique challenges because of the subtle bias-variance tradeoff of, and the intricate dependence among, the estimated entries induced by the low-rank structure. In this paper, we develop a general approach to overcome these difficulties by introducing new statistics for individual tests with sharp asymptotics both marginally and jointly, and utilizing them to control the false discovery rate (FDR) via a data splitting and symmetric aggregation scheme. We show that the proposed methodology achieves valid FDR control with guaranteed power under nearly optimal sample size requirements. Extensive numerical simulations and real data examples are also presented to further illustrate its practical merits. This is joint work with Wanteng MA (UPenn), Dong XIA (HKUST), and Ming YUAN (Columbia University).
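For readers unfamiliar with the idea of data splitting and symmetric aggregation, the following is a minimal, generic sketch and not the speakers' actual procedure. It assumes that two test statistics per hypothesis, `t1` and `t2`, have already been computed from two independent halves of the data, and that their product is roughly symmetric about zero under the null. The function names, the product aggregation, and the knockoff-style threshold rule are illustrative assumptions; the statistics for linear forms in matrix completion developed in the talk are not reproduced here.

```python
import numpy as np


def symmetric_fdr_threshold(w, alpha):
    """Smallest t > 0 with (1 + #{w_j <= -t}) / max(#{w_j >= t}, 1) <= alpha.

    Relies on the (assumed) symmetry of null statistics about zero: the count
    of large negative values estimates the number of false positives among
    the large positive values.
    """
    w = np.asarray(w, dtype=float)
    candidates = np.unique(np.abs(w[w != 0]))  # sorted, distinct magnitudes
    for t in candidates:
        est_false = 1 + np.sum(w <= -t)
        discoveries = max(np.sum(w >= t), 1)
        if est_false / discoveries <= alpha:
            return float(t)
    return float("inf")  # no threshold meets the target level


def split_and_aggregate(t1, t2, alpha=0.1):
    """Aggregate two split-sample statistics by their product, then threshold.

    t1, t2: hypothetical per-hypothesis statistics from two data halves.
    Returns the indices of rejected hypotheses and the threshold used.
    """
    w = np.asarray(t1, dtype=float) * np.asarray(t2, dtype=float)
    thr = symmetric_fdr_threshold(w, alpha)
    return np.flatnonzero(w >= thr), thr


# Toy input: 20 hypotheses with agreeing strong signals, 4 null-like ones.
t1 = [3.0] * 20 + [1.0, -1.0, 0.5, -0.5]
t2 = [3.0] * 20 + [0.5, 0.5, 1.0, 1.0]
rejected, thr = split_and_aggregate(t1, t2, alpha=0.1)
```

In this toy input the aggregated statistics of the last four hypotheses are split evenly between positive and negative values, so the rule passes over the small threshold and selects only the 20 hypotheses whose two halves agree strongly.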