Paper Title

pyKT: A Python Library to Benchmark Deep Learning based Knowledge Tracing Models

Paper Authors

Zitao Liu, Qiongqiong Liu, Jiahao Chen, Shuyan Huang, Jiliang Tang, Weiqi Luo

Paper Abstract

Knowledge tracing (KT) is the task of using students' historical learning interaction data to model their knowledge mastery over time so as to make predictions on their future interaction performance. Recently, remarkable progress has been made in using various deep learning techniques to solve the KT problem. However, the success behind deep learning based knowledge tracing (DLKT) approaches remains somewhat unknown, and proper measurement and analysis of these DLKT approaches remain a challenge. First, data preprocessing procedures in existing works are often private and custom, which limits experimental standardization. Furthermore, existing DLKT studies often differ in terms of the evaluation protocol and are far away from real-world educational contexts. To address these problems, we introduce a comprehensive Python-based benchmark platform, pyKT, to guarantee valid comparisons across DLKT methods via thorough evaluations. The pyKT library consists of a standardized set of integrated data preprocessing procedures on 7 popular datasets across different domains, and 10 frequently compared DLKT model implementations for transparent experiments. Results from our fine-grained and rigorous empirical KT studies yield a set of observations and suggestions for effective DLKT, e.g., a wrong evaluation setting may cause label leakage that generally leads to performance inflation; and the improvement of many DLKT approaches is minimal compared to the very first DLKT model proposed by Piech et al. We have open-sourced pyKT and our experimental results at https://pykt.org/. We welcome contributions from other research groups and practitioners.
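For readers unfamiliar with the KT task described above, the sketch below illustrates the core idea behind the very first DLKT model mentioned in the abstract (DKT by Piech et al.): an LSTM reads a student's past (question, correctness) interactions and outputs per-question probabilities of a correct response at the next step. This is a minimal illustration only, not pyKT's actual implementation; the class name `DKT`, the hidden size, and the one-hot input encoding are assumptions chosen for brevity.

```python
# Minimal DKT-style sketch (illustrative only; NOT pyKT's implementation).
import torch
import torch.nn as nn


class DKT(nn.Module):
    def __init__(self, num_questions: int, hidden_size: int = 64):
        super().__init__()
        self.num_questions = num_questions
        # Input at each step: one-hot over 2 * num_questions
        # (question id crossed with correct/incorrect).
        self.rnn = nn.LSTM(input_size=2 * num_questions,
                           hidden_size=hidden_size, batch_first=True)
        # Output: per-question probability of a correct response at the next step.
        self.out = nn.Linear(hidden_size, num_questions)

    def forward(self, questions: torch.Tensor, responses: torch.Tensor) -> torch.Tensor:
        # questions, responses: (batch, seq_len) integer tensors;
        # responses hold 0/1 correctness labels.
        idx = questions + self.num_questions * responses
        x = torch.nn.functional.one_hot(idx, 2 * self.num_questions).float()
        h, _ = self.rnn(x)
        return torch.sigmoid(self.out(h))  # (batch, seq_len, num_questions)


# Toy usage: predict the response at step t+1 from interactions up to step t.
model = DKT(num_questions=100)
q = torch.randint(0, 100, (8, 20))  # 8 students, 20 interactions each
r = torch.randint(0, 2, (8, 20))    # correctness labels
preds = model(q, r)
# Probability assigned to the question actually attempted at the next step:
next_prob = preds[:, :-1, :].gather(-1, q[:, 1:].unsqueeze(-1)).squeeze(-1)
```

Benchmarks such as pyKT compare models of this family under a shared preprocessing and evaluation protocol; see https://pykt.org/ for the library's actual interfaces and experimental settings.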
