Paper Title

Convolutional Tensor-Train LSTM for Spatio-temporal Learning

Authors

Jiahao Su, Wonmin Byeon, Jean Kossaifi, Furong Huang, Jan Kautz, Animashree Anandkumar

Abstract

Learning from spatio-temporal data has numerous applications, such as human-behavior analysis, object tracking, video compression, and physics simulation. However, existing methods still perform poorly on challenging video tasks such as long-term forecasting. This is because these tasks require learning long-term spatio-temporal correlations in the video sequence. In this paper, we propose a higher-order convolutional LSTM model that can efficiently learn these correlations, along with a succinct representation of the history. This is accomplished through a novel tensor-train module that performs prediction by combining convolutional features across time. To make this feasible in terms of computation and memory requirements, we propose a novel convolutional tensor-train decomposition of the higher-order model. This decomposition reduces the model complexity by jointly approximating a sequence of convolutional kernels as a low-rank tensor-train factorization. As a result, our model outperforms existing approaches while using only a fraction of the parameters of the baseline models. Our results achieve state-of-the-art performance in a wide range of applications and datasets, including multi-step video prediction on the Moving-MNIST-2 and KTH action datasets as well as early activity recognition on the Something-Something V2 dataset.
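
To make the idea of the decomposition concrete, below is a minimal PyTorch sketch. It is an illustrative, simplified variant rather than the paper's exact formulation: the class name ConvTensorTrainSketch and the parameters order, rank, in_channels, and out_channels are our own hypothetical names. The sketch only shows the core intuition: several past hidden states are projected into a shared low-rank channel space and folded in sequentially by a chain of small convolutional cores, so the cost scales with the small rank instead of with one large higher-order kernel.

```python
import torch
import torch.nn as nn

class ConvTensorTrainSketch(nn.Module):
    """Simplified convolutional tensor-train module (illustrative only, not the
    paper's exact method): a chain of small convolutional cores with a shared
    low-rank channel size `rank` jointly replaces one large kernel per past
    hidden state."""

    def __init__(self, order, in_channels, out_channels, rank, kernel_size=3):
        super().__init__()
        pad = kernel_size // 2
        # 1x1 projections lift each past hidden state into the low-rank space.
        self.lift = nn.ModuleList(
            nn.Conv2d(in_channels, rank, kernel_size=1) for _ in range(order)
        )
        # Small cores chained across time steps form the tensor-train structure.
        self.cores = nn.ModuleList(
            nn.Conv2d(rank, rank, kernel_size, padding=pad) for _ in range(order)
        )
        self.readout = nn.Conv2d(rank, out_channels, kernel_size=1)

    def forward(self, hidden_states):
        # hidden_states: list of `order` tensors of shape (batch, in_channels, H, W),
        # ordered from oldest to newest.
        summary = None
        for lift, core, h in zip(self.lift, self.cores, hidden_states):
            # Fold the next hidden state into the running low-rank summary.
            x = lift(h) if summary is None else summary + lift(h)
            summary = core(x)
        return self.readout(summary)

# Usage: combine three past ConvLSTM hidden states into one output feature map.
module = ConvTensorTrainSketch(order=3, in_channels=32, out_channels=32, rank=8)
states = [torch.randn(2, 32, 16, 16) for _ in range(3)]
print(module(states).shape)  # torch.Size([2, 32, 16, 16])
```

Because the per-step cores operate only in the `rank`-dimensional channel space, the parameter count stays small even as the temporal order grows, which mirrors the abstract's claim of matching or exceeding baselines with a fraction of their parameters.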
