
Dr. Wei Chen, Principal Research Manager at Microsoft Research Asia: Optimization, Generalization, and Out-of-distribution Prediction in Deep Learning

Guanghua Forum (Celebrities and Entrepreneurs Forum), Session 5942

Optimization, Generalization, and Out-of-distribution Prediction in Deep Learning

Speaker: Dr. Wei Chen, Principal Research Manager, Microsoft Research Asia

Host: Professor Huazhen Lin, School of Statistics

Time: 9:30-10:30 a.m., Wednesday, November 24, 2021

Livestream platform and meeting ID: Tencent Meeting, ID: 848 837 931

Organizers: Center of Statistical Research, School of Statistics, and the Office of Research Affairs

About the Speaker:

Wei Chen is a Principal Research Manager at Microsoft Research Asia, where she leads the Computing and Learning Theory Group, which aims to push the frontiers of learning and computing in AI through theoretical understanding. Her current research interests include deep learning theory and algorithms, distributed machine learning, and trustworthy machine learning. She has published more than 50 papers at top ML/AI conferences, including ICML, NeurIPS, and ICLR. Before joining Microsoft, she obtained her Ph.D. in Mathematics from the Chinese Academy of Sciences. She is also an adjunct professor at the University of Science and Technology of China, Beijing Jiaotong University, and Nankai University.

Wei Chen (陈薇) is a Principal Research Manager at Microsoft Research Asia and head of the Computing and Learning Theory Group. She is an adjunct doctoral supervisor at the University of Science and Technology of China and an adjunct professor at Nankai University and Beijing Jiaotong University. She graduated from the Department of Statistics, School of Mathematical Sciences, Shandong University in 2006, and received her Ph.D. from the Academy of Mathematics and Systems Science, Chinese Academy of Sciences, in 2011. Her research focuses on the foundations of machine learning, with interests including deep learning theory and algorithms, distributed machine learning, and trustworthy machine learning. She has published more than 50 papers at top international ML/AI conferences (e.g., ICML, NeurIPS, and ICLR), co-authored one academic monograph, and has served as an area chair or senior program committee member for several international machine learning conferences.

Abstract:

With big data and big computation, deep learning has achieved breakthroughs in computer vision, natural language computing, speech, etc. At the same time, researchers are thinking about how to alleviate hyper-parameter tuning efforts, understand why DNNs generalize well, and investigate how to make deep learning perform better in out-of-distribution prediction. In this talk, I will introduce our recent research on optimization, generalization, and out-of-distribution (o.o.d.) prediction in deep learning. First, I will present a new group-invariant optimization framework for ReLU neural networks, in which the positive-scaling redundancy can be removed; then, I will present our work on the implicit bias of the stochastic optimization algorithms widely used in deep learning; finally, I will talk about how to improve out-of-distribution prediction by incorporating "causal" invariance.

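The positive-scaling redundancy mentioned in the abstract can be illustrated with a minimal NumPy sketch (the toy network and all names here are illustrative assumptions, not the speaker's actual framework): because relu(c*z) = c*relu(z) for any c > 0, scaling a hidden unit's incoming weights by c and its outgoing weights by 1/c leaves the network function unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy two-layer ReLU network: f(x) = W2 @ relu(W1 @ x + b1) + b2
W1, b1 = rng.normal(size=(8, 4)), rng.normal(size=8)
W2, b2 = rng.normal(size=(3, 8)), rng.normal(size=3)

def relu(z):
    return np.maximum(z, 0.0)

def forward(W1, b1, W2, b2, x):
    return W2 @ relu(W1 @ x + b1) + b2

# Positive-scaling symmetry: scale each hidden unit's incoming
# weights (and bias) by c > 0 and its outgoing weights by 1/c.
# Since relu(c*z) = c*relu(z), the function computed is identical.
c = rng.uniform(0.5, 2.0, size=8)        # one positive scale per hidden unit
W1_s, b1_s = W1 * c[:, None], b1 * c     # scaled incoming weights and bias
W2_s = W2 / c[None, :]                   # inverse-scaled outgoing weights

x = rng.normal(size=4)
assert np.allclose(forward(W1, b1, W2, b2, x),
                   forward(W1_s, b1_s, W2_s, b2, x))
```

This is exactly the redundancy that a group-invariant optimization framework seeks to quotient out: infinitely many weight configurations represent the same function, so optimizing in the raw parameter space wastes effort distinguishing them.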

