Paper Title

Dataset of Segmented Nuclei in Hematoxylin and Eosin Stained Histopathology Images of 10 Cancer Types

Authors

Hou, Le; Gupta, Rajarsi; Van Arnam, John S.; Zhang, Yuwei; Sivalenka, Kaustubh; Samaras, Dimitris; Kurc, Tahsin M.; Saltz, Joel H.

Abstract

The distribution and appearance of nuclei are essential markers for the diagnosis and study of cancer. Despite the importance of nuclear morphology, there is a lack of large-scale, accurate, publicly accessible nucleus segmentation data. To address this, we developed an analysis pipeline that segments nuclei in whole slide tissue images (WSIs) from multiple cancer types with a quality control process. We have generated nucleus segmentation results in 5,060 WSIs from 10 cancer types in The Cancer Genome Atlas (TCGA). One key component of our work is a multi-level quality control process (WSI-level and image patch-level) that evaluates the quality of our segmentation results. The image patch-level quality control used manual segmentation ground truth data from 1,356 sampled image patches. The datasets we publish in this work consist of roughly 5 billion quality-controlled nuclei from more than 5,060 TCGA WSIs from 10 different TCGA cancer types, and 1,356 manually segmented TCGA image patches from the same 10 cancer types plus 4 additional cancer types. Data is available at https://doi.org/10.7937/tcia.2019.4a4dkp9u
