Mean-shift and sparse sampling based SMC-PHD filtering for audio informed visual speaker tracking

Kilic, Volkan, Barnard, Mark, Wang, Wenwu, Hilton, Adrian and Kittler, Josef (2016) Mean-shift and sparse sampling based SMC-PHD filtering for audio informed visual speaker tracking. IEEE Transactions on Multimedia, 18(12), pp. 2417-2431. ISSN (print) 1520-9210

Full text available as:
Barnard-M-43467-AAM.pdf - Accepted Version

Abstract

The probability hypothesis density (PHD) filter based on sequential Monte Carlo (SMC) approximation (also known as the SMC-PHD filter) has proven to be a promising algorithm for multi-speaker tracking. However, it has a heavy computational cost, as surviving, spawned and born particles need to be distributed in each frame to model the state of the speakers and to jointly estimate the variable number of speakers and their states. In particular, the computational cost is mostly caused by the born particles, as they need to be propagated over the entire image in every frame to detect the presence of a new speaker in the view of the visual tracker. In this paper, we propose to use audio data to improve the visual SMC-PHD (V-SMC-PHD) filter by using the direction of arrival (DOA) angles of the audio sources to determine when to propagate the born particles and re-allocate the surviving and spawned particles. The tracking accuracy of this audio-visual algorithm (AV-SMC-PHD) is further improved by using a modified mean-shift algorithm to search and climb density gradients iteratively to find the peak of the probability distribution, and the extra computational complexity introduced by mean-shift is controlled with a sparse sampling technique. These improved algorithms, named AVMS-SMC-PHD and sparse-AVMS-SMC-PHD respectively, are compared systematically with AV-SMC-PHD and V-SMC-PHD on the AV16.3, AMI and CLEAR datasets.
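
As a rough illustration of the mean-shift step mentioned in the abstract, the sketch below climbs the kernel density estimate of a weighted particle set until it settles on a local peak. It is not the paper's modified mean-shift: it assumes a plain Gaussian kernel over 2-D image positions, and the names `mean_shift_mode`, `sparse_subset`, `bandwidth` and `keep` are hypothetical. The uniform random subsampling is only a stand-in for the paper's sparse sampling technique, whose details are not given here.

```python
import math
import random


def mean_shift_mode(particles, weights, start, bandwidth=1.0, tol=1e-6, max_iter=100):
    """Climb the weighted Gaussian kernel density from `start` to a nearby peak.

    Assumption: standard Gaussian-kernel mean-shift, not the paper's modified variant.
    """
    x, y = start
    for _ in range(max_iter):
        num_x = num_y = den = 0.0
        for (px, py), w in zip(particles, weights):
            d2 = (px - x) ** 2 + (py - y) ** 2
            k = w * math.exp(-0.5 * d2 / bandwidth ** 2)  # kernel-weighted particle mass
            num_x += k * px
            num_y += k * py
            den += k
        if den == 0.0:  # no particle close enough to contribute
            break
        new_x, new_y = num_x / den, num_y / den  # kernel-weighted mean
        shift = math.hypot(new_x - x, new_y - y)
        x, y = new_x, new_y
        if shift < tol:  # converged: the shift vector has vanished
            break
    return x, y


def sparse_subset(particles, weights, keep, rng):
    """Keep `keep` particles chosen uniformly at random to cut the per-iteration cost.

    Assumption: a placeholder for the paper's sparse sampling scheme.
    """
    idx = rng.sample(range(len(particles)), keep)
    return [particles[i] for i in idx], [weights[i] for i in idx]


# Toy particle cloud, symmetric around the image position (5, 5).
particles = [(4.0, 5.0), (6.0, 5.0), (5.0, 4.0), (5.0, 6.0), (5.0, 5.0)]
weights = [0.2] * len(particles)

peak = mean_shift_mode(particles, weights, start=(4.5, 4.6))
sub_particles, sub_weights = sparse_subset(particles, weights, keep=3, rng=random.Random(0))
```

Each mean-shift iteration costs one pass over the particles it is given, so running it on the output of `sparse_subset` lowers the cost in proportion to the number of particles dropped, at the price of a noisier density estimate.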

Item Type: Article
Additional Information: This work was supported by the Engineering and Physical Sciences Research Council [grant numbers: EP/K014307/1 and EP/L000539/1].
Research Area: Computer science and informatics
Electrical and electronic engineering
Faculty, School or Research Centre: Faculty of Science, Engineering and Computing
Faculty of Science, Engineering and Computing > School of Computer Science and Mathematics
Faculty of Science, Engineering and Computing (until 2017)
Faculty of Science, Engineering and Computing (until 2017) > School of Computing and Information Systems
Depositing User: Mark Barnard
Date Deposited: 17 Jun 2019 10:38
Last Modified: 17 Jun 2019 10:38
DOI: https://doi.org/10.1109/TMM.2016.2599150
URI: http://eprints.kingston.ac.uk/id/eprint/43467
