Professor Min Yang, University of Illinois at Chicago: On Data Reduction of Big Data

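Guanghua Forum: Celebrities and Entrepreneurs Forum, No. 5170

Topic: On Data Reduction of Big Data

Speaker: Professor Min Yang, University of Illinois at Chicago

Host: Professor Huazhen Lin, School of Statistics

Time: Friday, December 14, 2018, 10:30-11:30 a.m.

Venue: Conference Room 408, Hongyuan Building, Liulin Campus, Southwestern University of Finance and Economics

Organizers: Center of Statistical Research; School of Statistics; Office of Scientific Research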

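About the speaker:

Min Yang received a PhD from the University of Illinois at Chicago in 2002 and is now a professor there. Research interests include experimental design, statistical inference, survey sampling, and longitudinal data analysis. To date, Professor Yang has led five research projects funded by the U.S. National Science Foundation, currently serves as an associate editor of Statistica Sinica and JASA, and is a guest editor of the Journal of Statistical Theory and Practice. Professor Yang has published 12 articles in top international journals, including JASA and the Annals.

For details, see the personal homepage: http://homepages.math.uic.edu/~minyang/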

Abstract:

The big data paradigm has drawn a significant amount of attention in recent years as the costs of acquiring and storing data have plummeted. The bottleneck has instead shifted to fast, in-depth analysis. This shift has created its own set of problems, the most obvious being that large datasets are often computationally expensive to process. Proven statistical methods are no longer applicable to extraordinarily large data sets because of computational limitations. A critical step in Big Data analysis is therefore data reduction. In this presentation, I will review some existing approaches to data reduction and introduce a new strategy called information-based optimal subdata selection (IBOSS). Under a linear model setup, for both moderate and large numbers of covariates, theoretical results and extensive simulations demonstrate that the IBOSS approach is superior to other approaches in terms of parameter estimation and predictive performance. The results show that the IBOSS strategy adequately addresses the tradeoff between computational complexity and statistical efficiency. Some ongoing research as well as some open questions will also be discussed.
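As a rough illustration of the data-reduction idea named in the abstract, the following Python sketch assumes one commonly described form of IBOSS for linear models: for each covariate in turn, keep the rows with the smallest and the largest values among those not yet selected, then fit ordinary least squares on the selected subdata only. The selection rule as written here, the function names `iboss_select` and `fit_ols`, and the simulated data are all assumptions made for illustration; the abstract itself does not spell out the algorithm.

```python
import random


def iboss_select(X, k):
    """Pick about k row indices by taking extreme values of each covariate.

    Assumed rule (not stated in the abstract): for each covariate in turn,
    take the r = k // (2p) smallest and r largest rows among those not yet
    chosen. X is a list of rows, each a list of p covariate values.
    """
    n, p = len(X), len(X[0])
    r = k // (2 * p)
    if r < 1 or 2 * r * p > n:
        raise ValueError("need k >= 2p and 2 * (k // (2p)) * p <= n")
    available = set(range(n))
    chosen = []
    for j in range(p):
        order = sorted(available, key=lambda i: X[i][j])
        picked = order[:r] + order[-r:]  # both tails of covariate j
        chosen.extend(picked)
        available.difference_update(picked)
    return chosen


def fit_ols(X, y):
    """Least squares with an intercept, via the normal equations (Z'Z) b = Z'y."""
    Z = [[1.0] + list(row) for row in X]
    q = len(Z[0])
    # Augmented matrix [Z'Z | Z'y].
    A = [[sum(z[a] * z[b] for z in Z) for b in range(q)]
         + [sum(z[a] * yi for z, yi in zip(Z, y))] for a in range(q)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(q):
        piv = max(range(c, q), key=lambda i: abs(A[i][c]))
        A[c], A[piv] = A[piv], A[c]
        for i in range(q):
            if i != c:
                f = A[i][c] / A[c][c]
                A[i] = [u - f * v for u, v in zip(A[i], A[c])]
    return [A[i][q] / A[i][i] for i in range(q)]


# Simulated full data (hypothetical): n rows, p covariates, known coefficients.
random.seed(0)
n, p, k = 20000, 3, 300
beta_true = [1.0, 2.0, -1.0, 0.5]  # intercept followed by p slopes
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [beta_true[0] + sum(b * x for b, x in zip(beta_true[1:], row))
     + random.gauss(0, 1) for row in X]

# Reduce the data to k rows, then estimate from the subdata only.
idx = iboss_select(X, k)
beta_iboss = fit_ols([X[i] for i in idx], [y[i] for i in idx])
```

In this sketch the rows are chosen from the covariates alone, so the least squares fit uses only the k selected rows while the responses play no part in the selection.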