Paper Title

Theoretical Model and Practical Considerations for Data Lineage Reconstruction

Author

Pushkin, Egor

Abstract

We live in a world driven by data. The volume of data outgrows anyone's ability to oversee it or even observe its scope. Despite all the advances in data management, there is still a significant lack of formalism and standardization around defining data ecosystems and the processes occurring within them. To address this issue, we propose a notation for data flow modeling and evaluate some of its most common applications based on real-world use cases. To facilitate future work, we provide a detailed reference for the data model we defined and consider potential programming paradigms.
