Paper Title

Unsupervised Heterogeneous Coupling Learning for Categorical Representation

Authors

Chengzhang Zhu, Longbing Cao, Jianping Yin

Abstract

Complex categorical data is often hierarchically coupled, with heterogeneous relationships between attributes and attribute values as well as couplings between objects. Such value-to-object couplings are heterogeneous, with complementary and inconsistent interactions and distributions. The limited existing research on unlabeled categorical data representation ignores these heterogeneous and hierarchical couplings, underestimates data characteristics and complexities, and overuses redundant information. Deep representation learning of unlabeled categorical data is challenging: it overlooks such value-to-object couplings and their complementarity and inconsistency, and it requires large data, disentanglement, and high computational power. This work introduces a shallow but powerful UNsupervised heTerogeneous couplIng lEarning (UNTIE) approach for representing coupled categorical data by untying the interactions between couplings and revealing the heterogeneous distributions embedded in each type of coupling. UNTIE is efficiently optimized w.r.t. a kernel k-means objective function for unsupervised representation learning of heterogeneous and hierarchical value-to-object couplings. Theoretical analysis shows that UNTIE can represent categorical data with maximal separability while effectively representing heterogeneous couplings and disclosing their roles in categorical data. The UNTIE-learned representations significantly outperform state-of-the-art categorical representations and deep representation models on 25 categorical data sets with diversified characteristics.
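The abstract states that UNTIE is optimized with respect to a kernel k-means objective. As a rough illustration of what such an objective measures, the sketch below evaluates the standard kernel k-means within-cluster scatter, tr(K) minus the sum over clusters of the within-cluster kernel sum divided by the cluster size, on a toy categorical data set. This is not the UNTIE method: the overlap kernel (fraction of matching attribute values) stands in for the learned heterogeneous coupling kernels, and the function names `overlap_kernel` and `kernel_kmeans_objective` are hypothetical.

```python
def overlap_kernel(X):
    """Assumed stand-in kernel: fraction of attributes on which two objects agree."""
    n, d = len(X), len(X[0])
    return [
        [sum(a == b for a, b in zip(X[i], X[j])) / d for j in range(n)]
        for i in range(n)
    ]


def kernel_kmeans_objective(K, labels):
    """Within-cluster scatter in feature space.

    Equals tr(K) - sum_k (1 / |C_k|) * sum_{i, j in C_k} K[i][j];
    lower values mean tighter clusters under the kernel K.
    """
    total = sum(K[i][i] for i in range(len(K)))
    for c in set(labels):
        idx = [i for i, label in enumerate(labels) if label == c]
        block = sum(K[i][j] for i in idx for j in idx)
        total -= block / len(idx)
    return total


# Toy data: four objects described by two categorical attributes.
X = [["red", "small"], ["red", "small"], ["blue", "large"], ["blue", "large"]]
K = overlap_kernel(X)

good = kernel_kmeans_objective(K, [0, 0, 1, 1])  # identical objects grouped together
bad = kernel_kmeans_objective(K, [0, 1, 0, 1])   # dissimilar objects grouped together
print(good, bad)
```

A partition that groups identical objects yields a lower objective than one that mixes them, which is the sense in which a representation (here, a kernel) can be scored without labels.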
