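Guanghua Forum: Distinguished Public Figures and Entrepreneurs Forum, Session 6246
Topic: Gaussian Differential Privacy and Some Computational Challenges
Speaker: Associate Professor Weijie Su (苏炜杰), University of Pennsylvania
Host: Associate Professor Chen Xuerong (陈雪蓉), School of Statistics
Time: Wednesday, October 26, 2022, 9:30-10:30 a.m.
Livestream platform and meeting ID: Tencent Meeting, 174-303-444
Organizers: Center of Statistical Research and School of Statistics; Office of Scientific Research
About the speaker: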
Weijie Su is an Associate Professor in the Wharton Statistics and Data Science Department and, by courtesy, in the Department of Computer and Information Science, at the University of Pennsylvania. He is a co-director of Penn Research in Machine Learning. Prior to joining Penn, he received his Ph.D. from Stanford University in 2016 and his bachelor’s degree from Peking University in 2011. His research interests span privacy-preserving data analysis, optimization, high-dimensional statistics, and deep learning theory. He is a recipient of the Stanford Theodore Anderson Dissertation Award in 2016, an NSF CAREER Award in 2019, an Alfred Sloan Research Fellowship in 2020, the SIAM Early Career Prize in Data Science in 2022, and the IMS Peter Gavin Hall Prize in 2022.
Abstract:
Privacy-preserving data analysis has been put on a firm mathematical foundation since the introduction of differential privacy (DP) in 2006. This privacy definition, however, has some well-known weaknesses: notably, it does not tightly handle composition. In this talk, we propose a relaxation of DP that we term "f-DP", which has a number of appealing properties and avoids some of the difficulties associated with prior relaxations. This relaxation allows for lossless reasoning about composition and post-processing, and notably, a direct way to analyze privacy amplification by subsampling. We define a canonical single-parameter family of definitions within our class that is termed "Gaussian Differential Privacy", based on hypothesis testing of two shifted normal distributions. We prove that this family is focal to f-DP by introducing a central limit theorem, which shows that the privacy guarantees of any hypothesis-testing based definition of privacy converge to Gaussian differential privacy in the limit under composition. From a non-asymptotic standpoint, we introduce the Edgeworth Accountant, an analytical approach to compose f-DP guarantees of private algorithms. Finally, we demonstrate the use of the tools we develop by giving an improved analysis of the privacy guarantees of noisy stochastic gradient descent.
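As a rough illustration of the "two shifted normal distributions" idea in the abstract, the sketch below evaluates the Gaussian trade-off function G_mu(alpha) = Phi(Phi^{-1}(1 - alpha) - mu), which gives the smallest type II error achievable at type I error alpha when testing N(0, 1) against N(mu, 1). It also shows the composition rule for Gaussian differential privacy (the parameters combine as the square root of the sum of squares) and the conversion from mu-GDP to an (epsilon, delta) guarantee. The function names are hypothetical and chosen only for this example, and the formulas are the standard ones from the f-DP literature rather than anything specific to the talk.

```python
from math import exp, sqrt
from statistics import NormalDist

# Standard normal N(0, 1); provides Phi (cdf) and Phi^{-1} (inv_cdf).
_STD_NORMAL = NormalDist()


def gdp_tradeoff(alpha: float, mu: float) -> float:
    """Gaussian trade-off function G_mu(alpha) = Phi(Phi^{-1}(1 - alpha) - mu).

    Smallest type II error at type I error `alpha` when testing
    N(0, 1) against N(mu, 1). Hypothetical helper name.
    """
    if alpha <= 0.0:
        return 1.0  # never rejecting gives type II error 1
    if alpha >= 1.0:
        return 0.0  # always rejecting gives type II error 0
    return _STD_NORMAL.cdf(_STD_NORMAL.inv_cdf(1.0 - alpha) - mu)


def compose_gdp(mus: list[float]) -> float:
    """Composing mu_i-GDP mechanisms yields sqrt(sum of mu_i^2)-GDP."""
    return sqrt(sum(m * m for m in mus))


def gdp_to_delta(mu: float, eps: float) -> float:
    """Smallest delta such that a mu-GDP mechanism (mu > 0) is (eps, delta)-DP.

    delta(eps) = Phi(-eps/mu + mu/2) - e^eps * Phi(-eps/mu - mu/2)
    """
    phi = _STD_NORMAL.cdf
    return phi(-eps / mu + mu / 2.0) - exp(eps) * phi(-eps / mu - mu / 2.0)


if __name__ == "__main__":
    total_mu = compose_gdp([0.3, 0.4])  # two mechanisms composed
    print("composed mu:", total_mu)
    print("type II error at alpha=0.05:", gdp_tradeoff(0.05, total_mu))
    print("delta at eps=1:", gdp_to_delta(total_mu, 1.0))
```

With mu = 0 the two hypotheses coincide, so the trade-off function reduces to 1 - alpha (perfect privacy). Larger mu pushes the curve toward zero, meaning the two neighboring datasets become easier to tell apart.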