Paper Title
Learning Optimal Representations with the Decodable Information Bottleneck
Paper Authors
Paper Abstract
We address the question of characterizing and finding optimal representations for supervised learning. Traditionally, this question has been tackled using the Information Bottleneck, which compresses the inputs while retaining information about the targets, in a decoder-agnostic fashion. In machine learning, however, our goal is not compression but rather generalization, which is intimately linked to the predictive family or decoder of interest (e.g. linear classifier). We propose the Decodable Information Bottleneck (DIB) that considers information retention and compression from the perspective of the desired predictive family. As a result, DIB gives rise to representations that are optimal in terms of expected test performance and can be estimated with guarantees. Empirically, we show that the framework can be used to enforce a small generalization gap on downstream classifiers and to predict the generalization ability of neural networks.
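
To make the contrast in the abstract concrete, here is a brief mathematical sketch (our illustration, not text from the paper; in particular, the exact form of the DIB compression term below is schematic, so consult the paper for the precise objective). The classical Information Bottleneck trades off Shannon mutual-information terms and is agnostic to the decoder:

    \min_{p(z \mid x)} \; I(X; Z) \;-\; \beta \, I(Z; Y)

DIB instead measures information relative to the predictive family $\mathcal{V}$ (e.g. linear classifiers), using the V-information framework of Xu et al. (2020), where decodable information is defined through the best predictor available in $\mathcal{V}$:

    H_{\mathcal{V}}(Y \mid Z) = \inf_{f \in \mathcal{V}} \mathbb{E}_{(z, y)}\big[ -\log f(z)[y] \big], \qquad I_{\mathcal{V}}(Z \to Y) = H_{\mathcal{V}}(Y) - H_{\mathcal{V}}(Y \mid Z)

Schematically, DIB then seeks representations that retain all the information about $Y$ that $\mathcal{V}$ can decode, while compressing whatever $\mathcal{V}$ can decode from $Z$ beyond the labels:

    \max_{p(z \mid x)} \; I_{\mathcal{V}}(Z \to Y) \;-\; \beta \cdot \big( \text{information } \mathcal{V} \text{ can decode from } Z \text{ beyond } Y \big)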