Paper Title
Representation Learning for Sequence Data with Deep Autoencoding Predictive Components
Paper Authors
Abstract
We propose Deep Autoencoding Predictive Components (DAPC) -- a self-supervised representation learning method for sequence data, based on the intuition that useful representations of sequence data should exhibit a simple structure in the latent space. We encourage this latent structure by maximizing an estimate of predictive information of latent feature sequences, which is the mutual information between past and future windows at each time step. In contrast to the mutual information lower bound commonly used by contrastive learning, the estimate of predictive information we adopt is exact under a Gaussian assumption. Additionally, it can be computed without negative sampling. To reduce the degeneracy of the latent space extracted by powerful encoders and keep useful information from the inputs, we regularize predictive information learning with a challenging masked reconstruction loss. We demonstrate that our method recovers the latent space of noisy dynamical systems, extracts predictive features for forecasting tasks, and improves automatic speech recognition when used to pretrain the encoder on large amounts of unlabeled data.
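The abstract's core quantity, predictive information, is the mutual information between past and future windows of the latent sequence, which has a closed form under a Gaussian assumption: I(past; future) = ½(log det Σ_p + log det Σ_f − log det Σ_pf). The sketch below (not the authors' code; the function name, window handling, and plug-in covariance estimator are my assumptions) shows how that estimate can be computed from a latent feature sequence with no negative sampling:

```python
import numpy as np

def gaussian_predictive_info(latents, window=2):
    """Plug-in Gaussian estimate of predictive information for a latent
    sequence (illustrative sketch, not the DAPC reference implementation).

    latents: (T, d) array of latent features; window: half-window size k.
    Returns 0.5 * (logdet S_past + logdet S_future - logdet S_joint) in nats.
    """
    T, d = latents.shape
    # Stack the concatenated (past, future) window at each valid time step:
    # each row is [y_{t-k}, ..., y_{t-1}, y_t, ..., y_{t+k-1}] flattened.
    pairs = np.array([
        latents[t - window:t + window].reshape(-1)      # length 2*k*d
        for t in range(window, T - window + 1)
    ])
    cov = np.cov(pairs, rowvar=False)                   # (2kd, 2kd) sample covariance
    kd = window * d
    past_cov = cov[:kd, :kd]                            # Σ_p (past block)
    fut_cov = cov[kd:, kd:]                             # Σ_f (future block)
    # Gaussian mutual information via stable log-determinants.
    _, logdet_joint = np.linalg.slogdet(cov)
    _, logdet_p = np.linalg.slogdet(past_cov)
    _, logdet_f = np.linalg.slogdet(fut_cov)
    return 0.5 * (logdet_p + logdet_f - logdet_joint)
```

By Fischer's inequality the estimate is nonnegative whenever the sample covariance is positive definite, and a strongly autocorrelated sequence (e.g. an AR(1) process) scores much higher than white noise, which is the structure the DAPC objective rewards.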