Topic: Is a Classification Procedure Good Enough? A Goodness-of-Fit Assessment Tool for Classification Learning
Speaker: Jiawei Zhang, Assistant Professor, University of Kentucky
Host: Professor Huazhen Lin, School of Statistics
Time: Thursday, July 4, 2024, 4:00-5:00 p.m.
Venue: Conference Room 408, Hongyuan Building, Liulin Campus
Organizers: Center of Statistical Research and School of Statistics; Office of Scientific Research
About the Speaker:
Jiawei Zhang is an assistant professor in the Dr. Bing Zhang Department of Statistics at the University of Kentucky. He graduated from the University of Minnesota in 2022. His research interests include machine learning, privacy-preserving distributed learning, deep learning, statistical inference, and model selection and diagnostics.
Abstract:
In recent years, many nontraditional classification methods, such as random forests, boosting, and neural networks, have been widely used in applications. Their performance is typically measured in terms of classification accuracy. While the classification error rate and the like are important, they do not address a fundamental question: Is the classification method underfitted? For a general classification procedure, the lack of a parametric assumption makes it challenging to construct proper tests. To overcome this difficulty, we propose a methodology called BAGofT that splits the data into a training set and a validation set. First, the classification procedure to assess is applied to the training set, which is also used to adaptively find a data grouping that reveals the regions of most severe underfitting. Then, based on this grouping, we calculate a test statistic by comparing the estimated success probabilities with the actual observed responses from the validation set. The data splitting guarantees that the size of the test is controlled under the null hypothesis, and the power of the test goes to one as the sample size increases under the alternative hypothesis. For testing parametric classification models, the BAGofT has a broader scope than the existing methods since it is not restricted to specific parametric models (e.g., logistic regression). Extensive simulation studies show the utility of the BAGofT when assessing general classification procedures and its strengths over some existing methods when testing parametric classification models.
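To give a concrete feel for the split-then-test idea described above, the following Python sketch works through a simplified version under stated assumptions: simulated binary data, a random forest as the classification procedure under assessment, and a fixed decile grouping of the fitted probabilities chosen on the training set as a stand-in for BAGofT's adaptive grouping. It is only an illustration, not the authors' implementation; the data-generating model, the classifier, the grouping rule, and the chi-square calibration are all illustrative choices.

```python
# Illustrative sketch (not the BAGofT implementation):
# 1) split the data, 2) fit the classifier on the training half,
# 3) choose a grouping using only the training half,
# 4) compare predicted success probabilities with observed responses
#    on the validation half via a grouped chi-square-type statistic.
import numpy as np
from scipy.stats import chi2
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n, p = 4000, 5
X = rng.normal(size=(n, p))
# simulated success probability with an interaction term (assumption for the demo)
logits = X[:, 0] + X[:, 1] - 2.0 * X[:, 0] * X[:, 1]
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logits)))

# 1) split into a training set and a validation set
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.5, random_state=0)

# 2) fit the classification procedure to assess on the training set
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

# 3) grouping chosen from the training set only: deciles of the fitted probability
#    (BAGofT instead searches adaptively for the regions of most severe underfitting)
edges = np.quantile(clf.predict_proba(X_tr)[:, 1], np.linspace(0, 1, 11))
edges[0], edges[-1] = 0.0, 1.0

# 4) on the validation set, compare predicted and observed frequencies per group
probs_val = clf.predict_proba(X_val)[:, 1]
groups = np.clip(np.searchsorted(edges, probs_val, side="right") - 1, 0, 9)

stat, df = 0.0, 0
for g in range(10):
    mask = groups == g
    var = np.sum(probs_val[mask] * (1.0 - probs_val[mask]))
    if mask.sum() == 0 or var < 1e-8:  # skip empty or degenerate groups
        continue
    resid = np.sum(y_val[mask] - probs_val[mask])
    stat += resid ** 2 / var
    df += 1

# large values of the statistic indicate lack of fit (underfitting)
print(f"grouped chi-square statistic = {stat:.2f}, df = {df}, "
      f"p-value = {chi2.sf(stat, df):.4f}")
```

The key design point mirrored here is that the grouping is fixed before the validation data are touched, so the validation-set comparison can be calibrated under the null hypothesis; the adaptive, training-set-driven choice of grouping is what lets the actual BAGofT target the regions where underfitting is most severe.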