Information Fusion 10(2) 2009

This research was conducted at the Fordham University Robotics and Computer Vision Lab.


Computer Engineering | Robotics


Video target tracking is the process of estimating the current state and predicting the future state of a target from a sequence of video sensor measurements. Multitarget video tracking is complicated by the fact that targets can occlude one another, affecting video feature measurements in a highly non-linear and difficult-to-model fashion. In this paper, we apply a multisensory fusion approach to the problem of multitarget video tracking with occlusion. The approach is based on a data-driven method (CFA) for selecting the features and fusion operations that improve a performance criterion. Each sensory cue is treated as a scoring system, and its scoring behavior is characterized by a rank-score function. A diversity measure, based on the variation in rank-score functions, is used to dynamically select the scoring systems and fusion operations that produce the best tracking performance. The relationship between the diversity measure and the tracking accuracy of two fusion operations, a linear score combination and an average rank combination, is evaluated on a set of twelve video sequences. The results demonstrate that using the rank-score characteristic as a diversity measure is an effective method for dynamically selecting the scoring systems and fusion operations that improve the performance of multitarget video tracking with occlusions.
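The quantities named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's exact definitions, so everything here is an assumption made for illustration: scores are min-max normalized to [0, 1], the rank-score function is taken to be the normalized scores sorted in descending order, diversity is the mean absolute difference between two rank-score functions, and both fusion operations use equal weights. The function names and the example cues are hypothetical.

```python
from typing import Dict, List

# A scoring system maps each candidate (e.g. a hypothesized target position)
# to a score, where higher means a better match. (Illustrative assumption.)
Scores = Dict[str, float]


def normalize(scores: Scores) -> Scores:
    """Min-max normalize scores to [0, 1] (assumed normalization)."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {c: (s - lo) / span if span > 0 else 0.0 for c, s in scores.items()}


def ranks(scores: Scores) -> Dict[str, int]:
    """Rank 1 is the highest score; ties are broken by candidate name."""
    ordered = sorted(scores, key=lambda c: (-scores[c], c))
    return {c: i + 1 for i, c in enumerate(ordered)}


def rank_score_function(scores: Scores) -> List[float]:
    """Normalized score found at rank 1, 2, ..., n."""
    return sorted(normalize(scores).values(), reverse=True)


def diversity(a: Scores, b: Scores) -> float:
    """Mean absolute difference between two rank-score functions
    (one plausible diversity measure, assumed here)."""
    fa, fb = rank_score_function(a), rank_score_function(b)
    return sum(abs(x - y) for x, y in zip(fa, fb)) / len(fa)


def score_combination(a: Scores, b: Scores) -> Scores:
    """Linear score combination with equal weights; higher is better."""
    na, nb = normalize(a), normalize(b)
    return {c: (na[c] + nb[c]) / 2 for c in a}


def rank_combination(a: Scores, b: Scores) -> Dict[str, float]:
    """Average rank combination; lower is better."""
    ra, rb = ranks(a), ranks(b)
    return {c: (ra[c] + rb[c]) / 2 for c in a}


if __name__ == "__main__":
    # Two hypothetical cues scoring the same three candidate positions.
    color = {"p1": 0.9, "p2": 0.8, "p3": 0.1}
    motion = {"p1": 0.2, "p2": 0.6, "p3": 0.5}

    fused_scores = score_combination(color, motion)
    fused_ranks = rank_combination(color, motion)

    print("diversity:", diversity(color, motion))
    print("best by score combination:", max(fused_scores, key=fused_scores.get))
    print("best by rank combination:", min(fused_ranks, key=fused_ranks.get))
```

In this sketch, a tracker would compute the diversity between cues at each step and use it to decide which cues to fuse and whether to fuse by score or by rank. That selection rule is what the paper evaluates, and it is not reproduced here.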
