Guanghua Forum: Distinguished Figures and Entrepreneurs Forum, Session 5942
Topic: Optimization, Generalization, and Out-of-distribution Prediction in Deep Learning
Speaker: Dr. Wei Chen, Senior Researcher, Microsoft Research Asia
Streaming platform and meeting ID: Tencent Meeting, ID: 848 837 931
Organizers: Center of Statistical Research and School of Statistics; Office of Scientific Research
Wei Chen is a principal research manager at Microsoft Research Asia. She leads the Computing and Learning Theory Group, which works to push the frontiers of learning and computing in AI through theoretical understanding. Her current research interests include deep learning theory and algorithms, distributed machine learning, and trustworthy machine learning. She has published 50 papers at top ML/AI conferences, including ICML, NeurIPS, and ICLR. Before joining Microsoft, she obtained her Ph.D. in Mathematics from the Chinese Academy of Sciences. She is currently an adjunct professor at the University of Science and Technology of China, Beijing Jiaotong University, and Nankai University.
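Wei Chen is a senior researcher at Microsoft Research Asia and head of the Computing and Learning Theory Group, an adjunct doctoral supervisor at the University of Science and Technology of China, and an adjunct professor at Nankai University and Beijing Jiaotong University. She graduated from the Department of Statistics, School of Mathematics, Shandong University in 2006 and received her Ph.D. from the Academy of Mathematics and Systems Science, Chinese Academy of Sciences in 2011. She has long worked on fundamental research in machine learning, with research interests including deep learning theory and algorithms, distributed machine learning, and trustworthy machine learning. She has published more than 50 papers at top international machine learning and artificial intelligence conferences (such as ICML, NeurIPS, and ICLR), co-authored one academic monograph, and served as area chair or senior program committee member for several international machine learning conferences.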
With big data and large-scale computation, deep learning has achieved breakthroughs in computer vision, natural language computing, speech, and other areas. At the same time, researchers are asking how to reduce the effort spent on hyper-parameter tuning, why deep neural networks generalize well, and how to make deep learning perform better in out-of-distribution (OOD) prediction. In this talk, I will introduce our recent research on optimization, generalization, and OOD prediction in deep learning. First, I will present a new group-invariant optimization framework for ReLU neural networks, in which the positive-scaling redundancy can be removed. Then, I will present our work on the implicit bias of the stochastic optimization algorithms widely used in deep learning. Finally, I will talk about how to improve out-of-distribution prediction by incorporating “causal” invariance.