Discriminative Spatial Attention for Robust Tracking

Jialue Fan    Ying Wu    Shengyang Dai

Northwestern University

European Conference on Computer Vision (ECCV) 2010 [PDF | PPT | Executable Code]


Among the many causes of tracking failure, one of the most difficult is distraction: regions in the environment whose visual appearance is similar to the target's and which therefore match the target well. Because distractions produce false positives in target detection, they cause wrong associations and ultimately make the tracker fail. Because they do match the target well, such a failure is difficult to detect promptly from matching scores alone. In this paper, we address the following challenging problem: in a region-based (or salient-point-matching-based) tracking paradigm, what is the optimal placement of the regions for reliable tracking? (Note that we do not distinguish between regions and salient points in this context, because the feature vector extracted at a salient point determines the corresponding neighborhood, or region, of that point.)


This paper presents a novel and efficient solution to the spatial selection of discriminative attentional regions (ARs). A discriminative AR is one whose feature has a large margin to its nearest neighbors in the feature space, so we use this margin to represent the discriminative power of an AR. The larger the margin, the more distinctive the AR is in its spatial domain. An AR needs to be distinctive both in its small spatial neighborhood (i.e., local) and in a larger domain (i.e., semi-local) that is determined by the possible motion of the AR. In the local domain, the local neighbors of an AR approximately span a local linear manifold, so we recast the discriminative power as a condition number measure of this local linear manifold and design an efficient gradient-based search for all local ARs. In the semi-local domain, where this approximation does not hold, we design an effective branch-and-bound search that greatly reduces the complexity while still achieving optimality.
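To make the notion of a margin concrete, the following is a minimal brute-force sketch, not the method of the paper. It assumes a grayscale image stored as a 2-D NumPy array, uses normalized raw patch intensities as a stand-in feature, and takes the margin to be the smallest feature distance between a candidate AR and the same-sized patches shifted within a spatial neighborhood. The function names and the parameters `size`, `radius`, and `exclude` are hypothetical, and the exhaustive loop stands in for the gradient-based and branch-and-bound searches described above.

```python
import numpy as np


def patch_feature(image, top, left, size):
    """Hypothetical feature: flattened, L2-normalized intensities of a square patch."""
    patch = image[top:top + size, left:left + size].astype(float).ravel()
    norm = np.linalg.norm(patch)
    return patch / norm if norm > 0 else patch


def discriminative_margin(image, top, left, size, radius, exclude=1):
    """Smallest feature distance between the patch at (top, left) and the
    same-sized patches shifted by up to `radius` pixels in each direction.

    Shifts of at most `exclude` pixels are skipped (an assumption of this
    sketch), since near-identical overlaps would always give a tiny distance.
    Returns np.inf if no shifted patch fits inside the image.
    """
    height, width = image.shape
    ref = patch_feature(image, top, left, size)
    margin = np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if max(abs(dy), abs(dx)) <= exclude:
                continue  # too close to the candidate itself
            t, l = top + dy, left + dx
            if t < 0 or l < 0 or t + size > height or l + size > width:
                continue  # shifted patch falls outside the image
            dist = np.linalg.norm(ref - patch_feature(image, t, l, size))
            margin = min(margin, dist)
    return margin


# A single bright blob is distinctive within its neighborhood.
blob = np.zeros((40, 40))
blob[18:22, 18:22] = 1.0
blob_margin = discriminative_margin(blob, top=16, left=16, size=8, radius=6)

# Periodic stripes contain identical-looking patches nearby (distractions).
stripes = np.zeros((40, 40))
stripes[:, ::4] = 1.0
stripes_margin = discriminative_margin(stripes, top=16, left=16, size=8, radius=6)

print(blob_margin, stripes_margin)
```

In this toy setting the blob patch keeps a clearly positive margin, while the striped patch has a margin of zero because an identical patch exists a few pixels away. That is the kind of distraction a large-margin AR is meant to avoid.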

There are mainly three contributions in our work

Figure 1 illustrates the concept of the discriminative margin.

Figure 1. The discriminative margins for a certain AR.


We list a few notes to highlight some issues in the paper.

Experimental Results

We compared our method with an attentional visual tracker [1] that reported excellent tracking performance.


Dancing [1]. Dancing by our method.


[1] M. Yang, J. Yuan, and Y. Wu. Spatial selection for attentional visual tracking. In CVPR, 2007.


Updated 7/2010. Copyright © 2010 Jialue Fan, Ying Wu, and Shengyang Dai