Paper title
Limitations of weak labels for embedding and tagging
Paper authors
Abstract
Many datasets and approaches in ambient sound analysis use weakly labeled data. Weak labels are employed because annotating every data sample with a strong label is too expensive. Yet, their impact on the performance in comparison to strong labels remains unclear. Indeed, weak labels must often be dealt with at the same time as other challenges, namely multiple labels per sample, unbalanced classes and/or overlapping events. In this paper, we formulate a supervised learning problem which involves weak labels. We create a dataset that focuses on the difference between strong and weak labels as opposed to other challenges. We investigate the impact of weak labels when training an embedding or an end-to-end classifier. Different experimental scenarios are discussed to provide insights into which applications are most sensitive to weakly labeled data.