Automatically Assessing Students Performance with Smartphone Data

Paper Title

Automatically Assessing Students Performance with Smartphone Data

Authors

Fernandes, J., Silva, J. Sá, Rodrigues, A., Sinche, S., Boavida, F.

Abstract

As the number of smart devices that surround us increases, so do the opportunities to create smart socially-aware systems. In this context, mobile devices can be used to collect data about students and to better understand how their day-to-day routines influence their academic performance. Moreover, the Covid-19 pandemic brought new challenges and difficulties for students, with considerable impact on their lifestyle. In this paper we present a dataset collected using a smartphone application (ISABELA), which includes passive data (e.g., activity and location) as well as self-reported data from questionnaires. We present several tests with different machine learning models in order to classify students' performance. These tests were carried out using different time windows, showing that weekly time windows lead to better prediction and classification results than monthly time windows. Furthermore, it is shown that the created models can predict student performance even with data collected in different contexts, namely before and during the Covid-19 pandemic. SVMs, XGBoost and AdaBoost-SAMME with Random Forest were found to be the best algorithms, showing an accuracy greater than 78%. Additionally, we propose a pipeline that uses a decision-level median voting algorithm, drawing on historical data from the students, to further improve the models' predictions. Using this pipeline, some of the models obtain an accuracy greater than 90%.
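
The abstract does not spell out how the decision-level median voting step works, so the following is only a minimal sketch of one plausible reading: a student's final class is taken as the median of the ordinal class labels predicted for that student over past time windows. The function name `median_vote`, the integer label encoding, and the sample history are assumptions made for illustration and do not come from the paper.

```python
from statistics import median_low


def median_vote(window_predictions):
    """Combine per-window class predictions into one final label.

    Assumption: labels are ordinal integers (e.g. 0 = low, 1 = medium,
    2 = high performance). median_low is used so that the result is
    always one of the predicted labels, even for an even-length history.
    """
    if not window_predictions:
        raise ValueError("at least one prediction is required")
    return median_low(window_predictions)


# Hypothetical history of weekly predictions for a single student.
history = [2, 2, 1, 2, 0]
final_label = median_vote(history)
```

Taking the median rather than the mean keeps the output a valid class label and makes the result less sensitive to a single outlying weekly prediction.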
