Prof. Xu Guo (郭旭), Beijing Normal University: Test and Measure for Partial Mean Dependence Based on Machine Learning Methods - Center of Statistical Research

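Guanghua Forum: Distinguished Public Figures and Entrepreneurs Forum, Session No. (unspecified)

Topic: Test and Measure for Partial Mean Dependence Based on Machine Learning Methods

Speaker: Prof. Xu Guo (郭旭), Beijing Normal University

Host: Prof. Yaowu Liu (刘耀午), School of Statistics

Time: Friday, November 17, 2023, 15:00-16:00

Venue: Conference Room 408, Hongyuan Building, Liulin Campus

Organizers: Center of Statistical Research and the Research Office of the School of Statistics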

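About the speaker:

Dr. Xu Guo is a professor and doctoral supervisor at the School of Statistics, Beijing Normal University. Prof. Guo's research has long focused on the theory, methods, and applications of complex hypothesis testing in regression analysis, and in recent years has aimed at developing appropriate and effective tests for high-dimensional data. Some of this work has been published in JRSSB, JASA, Biometrika, and JOE. Prof. Guo serves on the tenth editorial board of the Chinese Journal of Applied Probability and Statistics (《应用概率统计》) and currently leads a project funded by the Excellent Young Scientists Fund of the National Natural Science Foundation of China. Prof. Guo received Beijing Normal University's 11th "Top Ten Teachers Most Popular with Undergraduates" award and first prize in the university's 18th Young Teachers' Teaching Competition, and also took part in the 13th Beijing Municipal Young Teachers' Teaching Competition.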

Abstract:

It is important to investigate the significance of a subset of covariates $W$ for the response $Y$ given covariates $Z$ in regression modeling. To this end, we propose a significance test for the partial mean independence problem based on machine learning methods and data splitting. Under the null hypothesis, the test statistic converges in distribution to the standard chi-squared distribution, while under a fixed alternative hypothesis it converges to a normal distribution. Power enhancement and algorithm stability are also discussed. If the null hypothesis is rejected, we propose a partial Generalized Measure of Correlation (pGMC) to measure the partial mean dependence of $Y$ given $W$ after controlling for the nonlinear effect of $Z$. We present the appealing theoretical properties of the pGMC and establish the asymptotic normality of its estimator with the optimal root-$N$ convergence rate. Furthermore, a valid confidence interval for the pGMC is derived. As an important special case, when there are no conditioning covariates $Z$, we introduce a new test of overall significance of covariates for the response in a model-free setting. Numerical studies and real data analysis are conducted to compare with existing approaches and to demonstrate the validity and flexibility of our proposed procedures.
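
For readers unfamiliar with the terminology, the partial mean independence problem described in the abstract can be written out as a pair of hypotheses. The first display below follows directly from the definition of partial mean independence. The second display is only an assumed, variance-ratio form of a partial measure of correlation that is consistent with the abstract's description; the notation $\mathrm{pGMC}(Y \mid W; Z)$ and the exact formula are illustrative assumptions, and the definition used in the talk may differ.

```latex
% Partial mean independence of Y on W given Z (null) versus dependence (alternative).
\[
H_0:\ \mathbb{E}(Y \mid W, Z) = \mathbb{E}(Y \mid Z) \ \text{almost surely}
\qquad\text{versus}\qquad
H_1:\ \Pr\{\mathbb{E}(Y \mid W, Z) = \mathbb{E}(Y \mid Z)\} < 1 .
\]

% Assumed illustrative form of a partial measure of correlation (not confirmed by the source):
% the share of the variation in Y left unexplained by Z that is explained once W is added.
\[
\mathrm{pGMC}(Y \mid W; Z)
= \frac{\mathbb{E}\bigl[\{\mathbb{E}(Y \mid W, Z) - \mathbb{E}(Y \mid Z)\}^{2}\bigr]}
       {\mathbb{E}\bigl[\{Y - \mathbb{E}(Y \mid Z)\}^{2}\bigr]} .
\]
```

Under this assumed form, the ratio lies in $[0, 1]$ whenever the denominator is positive, because $\mathbb{E}[\{Y - \mathbb{E}(Y \mid Z)\}^2] = \mathbb{E}[\{Y - \mathbb{E}(Y \mid W, Z)\}^2] + \mathbb{E}[\{\mathbb{E}(Y \mid W, Z) - \mathbb{E}(Y \mid Z)\}^2]$, and it equals zero exactly when $H_0$ holds.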
