Paper Title

MLPro: A System for Hosting Crowdsourced Machine Learning Challenges for Open-Ended Research Problems

Authors

Peter Washington, Aayush Nandkeolyar, Sam Yang

Abstract

The task of developing a machine learning (ML) model for a particular problem is inherently open-ended, and there is an unbounded set of possible solutions. Steps of the ML development pipeline, such as feature engineering, loss function specification, data imputation, and dimensionality reduction, require the engineer to consider an extensive and often infinite array of possibilities. Successfully identifying high-performing solutions for an unfamiliar dataset or problem requires a mix of mathematical prowess and creativity applied towards inventing and repurposing novel ML methods. Here, we explore the feasibility of hosting crowdsourced ML challenges to facilitate a breadth-first exploration of open-ended research problems, thereby expanding the search space of problem solutions beyond what a typical ML team could viably investigate. We develop MLPro, a system which combines the notion of open-ended ML coding problems with the concept of an automatic online code judging platform. To conduct a pilot evaluation of this paradigm, we crowdsource several open-ended ML challenges to ML and data science practitioners. We describe results from two separate challenges. We find that for sufficiently unconstrained and complex problems, many experts submit similar solutions, but some experts provide unique solutions which outperform the "typical" solution class. We suggest that automated expert crowdsourcing systems such as MLPro have the potential to accelerate ML engineering creativity.
