An Analysis of Object Representations in Deep Visual Trackers

Paper title

An Analysis of Object Representations in Deep Visual Trackers

Authors

Ross Goroshin, Jonathan Tompson, Debidatta Dwibedi

Abstract

Fully convolutional deep correlation networks are integral components of state-of-the-art approaches to single-object visual tracking. It is commonly assumed that these networks perform tracking by detection, matching features of the object instance against features of the entire frame. Strong architectural priors and conditioning on the object representation are thought to encourage this tracking strategy. Despite these strong priors, we show that deep trackers often default to tracking by saliency detection, without relying on the object instance representation. Our analysis shows that, despite being a useful prior, saliency detection can prevent the emergence of more robust tracking strategies in deep networks. This leads us to introduce an auxiliary detection task that encourages more discriminative object representations, which improve tracking performance.
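The matching step the abstract describes, comparing features of the object instance with features of the whole frame, can be illustrated with a minimal sketch. This is not the authors' architecture: real trackers correlate learned CNN features, whereas the arrays here are hand-made stand-ins, and the function names `correlation_response` and `locate` are hypothetical.

```python
import numpy as np


def correlation_response(template, frame):
    """Slide a (C, h, w) template over a (C, H, W) frame feature map.

    Returns an (H - h + 1, W - w + 1) map whose entries are the inner
    product between the template and each same-sized frame window.
    """
    _, h, w = template.shape
    _, H, W = frame.shape
    response = np.empty((H - h + 1, W - w + 1))
    for i in range(response.shape[0]):
        for j in range(response.shape[1]):
            window = frame[:, i:i + h, j:j + w]
            response[i, j] = np.sum(window * template)
    return response


def locate(template, frame):
    """Return the (row, col) of the window that best matches the template."""
    response = correlation_response(template, frame)
    row, col = np.unravel_index(np.argmax(response), response.shape)
    return int(row), int(col)
```

In this toy setting the response peaks where the frame contains a copy of the template, which is the tracking-by-detection behavior that is commonly assumed. A tracker that relies on saliency instead would produce its peak regardless of which template it is given.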
