38th Symposium on the Interface of Statistics, Computing Science, and Applications (Interface 2006), Pasadena, CA, May 2006

This research was conducted at the Fordham University Robotics and Computer Vision Lab. More information is available about the graduate programs in Computer Science and about the Fordham University Graduate School of Arts and Sciences.


Computer Engineering | Robotics


Tracking of video targets is the process of estimating the current state and predicting the future state of a target from a sequence of video sensor measurements. Multitarget video tracking is complicated by the fact that targets can occlude one another, affecting video feature measurements in a highly nonlinear and difficult-to-model fashion. Tracking multiple targets that undergo repeated mutual occlusions therefore remains a challenging problem. In this paper we propose a multisensory fusion approach to the problem of multitarget video tracking with occlusion. Each sensory cue is treated as a scoring system on the set of possible target tracks. Scoring behavior is characterized by a rank-score function, defined by Hsu and Taksa [11]. We use a diversity measure, defined by Hsu, Chung and Kristal [7], that is based on the variation in rank-score functions. We describe the importance of the rank-score function in combining multiple scoring systems for tracking multiple targets with repeated target occlusion, in particular in hypothesis pruning and feature selection. We present experimental results for 12 video sequences from a variety of situations. The results demonstrate that our approach can be used to design a feature and fusion selection criterion that improves video tracking performance for situations with multiple, mutually occluding targets.
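The idea of treating each sensory cue as a scoring system can be illustrated with a minimal sketch. The abstract does not give formulas, so everything below is an assumption for illustration: the rank-score function is taken to be the normalized score at each rank position, the diversity between two cues is taken to be the mean absolute difference between their rank-score functions, and the cue names, track names, and score values are hypothetical.

```python
from typing import Dict, List


def rank_score_function(scores: Dict[str, float]) -> List[float]:
    """Return the normalized score at each rank (index 0 is rank 1).

    Assumption: scores are min-max normalized to [0, 1] so that
    different cues can be compared on the same scale.
    """
    values = sorted(scores.values(), reverse=True)
    lo, hi = values[-1], values[0]
    if hi == lo:
        # All candidate tracks scored the same; the function is flat.
        return [1.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def diversity(f_a: List[float], f_b: List[float]) -> float:
    """Mean absolute difference between two rank-score functions.

    Assumption: this is one plausible way to measure variation between
    rank-score functions, not necessarily the measure used in [7].
    """
    if len(f_a) != len(f_b):
        raise ValueError("rank-score functions must cover the same ranks")
    return sum(abs(a - b) for a, b in zip(f_a, f_b)) / len(f_a)


# Hypothetical scores that two cues assign to three candidate tracks.
color_cue = {"track_1": 4.0, "track_2": 3.0, "track_3": 0.0}
motion_cue = {"track_1": 4.0, "track_2": 1.0, "track_3": 0.0}

f_color = rank_score_function(color_cue)
f_motion = rank_score_function(motion_cue)
d = diversity(f_color, f_motion)
```

Under these assumptions, a larger diversity value indicates that the two cues distribute their scores across ranks differently, which is the kind of information a feature and fusion selection criterion could draw on.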
