Mean-shift and sparse sampling based SMC-PHD filtering for audio informed visual speaker tracking

Kilic, Volkan, Barnard, Mark, Wang, Wenwu, Hilton, Adrian and Kittler, Josef (2016) Mean-shift and sparse sampling based SMC-PHD filtering for audio informed visual speaker tracking. IEEE Transactions on Multimedia, 18(12), pp. 2417-2431. ISSN (print) 1520-9210

Official URL: https://ieeexplore.ieee.org/document/7539615

Abstract

The probability hypothesis density (PHD) filter based on sequential Monte Carlo (SMC) approximation (also known as the SMC-PHD filter) has proven to be a promising algorithm for multi-speaker tracking. However, it has a heavy computational cost, as surviving, spawned and born particles need to be distributed in each frame to model the state of the speakers and to jointly estimate the variable number of speakers and their states. In particular, the computational cost is mostly caused by the born particles, as they need to be propagated over the entire image in every frame to detect the presence of new speakers in the view of the visual tracker. In this paper, we propose to use audio data to improve the visual SMC-PHD (V-SMC-PHD) filter by using the direction of arrival (DOA) angles of the audio sources to determine when to propagate the born particles and re-allocate the surviving and spawned particles. The tracking accuracy of the resulting audio-visual filter (AV-SMC-PHD) is further improved by using a modified mean-shift algorithm to search and climb density gradients iteratively to find the peak of the probability distribution, and the extra computational complexity introduced by mean-shift is controlled with a sparse sampling technique. These improved algorithms, named AVMS-SMC-PHD and sparse-AVMS-SMC-PHD respectively, are compared systematically with AV-SMC-PHD and V-SMC-PHD on the AV16.3, AMI and CLEAR datasets.
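
The mean-shift step mentioned in the abstract, climbing density gradients iteratively towards a peak of the distribution, can be illustrated with a minimal sketch. This is a generic, textbook weighted mean-shift with a Gaussian kernel, not the modified variant proposed in the paper, whose details the abstract does not give. The function name `mean_shift_mode`, the `bandwidth` parameter and the synthetic particle cloud are all assumptions made for illustration.

```python
import numpy as np

def mean_shift_mode(particles, weights, start, bandwidth=1.0, max_iter=100, tol=1e-6):
    """Climb the kernel density estimate of weighted particles, starting from `start`.

    Hypothetical helper for illustration only; not the paper's modified algorithm.
    """
    x = np.asarray(start, dtype=float)
    for _ in range(max_iter):
        # Gaussian kernel response of every particle relative to the current estimate
        sq_dist = np.sum((particles - x) ** 2, axis=1)
        k = weights * np.exp(-0.5 * sq_dist / bandwidth ** 2)
        total = k.sum()
        if total == 0.0:
            break  # no particle within numerical reach of the kernel
        # The mean-shift update moves the estimate to the kernel-weighted mean
        new_x = (k[:, None] * particles).sum(axis=0) / total
        converged = np.linalg.norm(new_x - x) < tol
        x = new_x
        if converged:
            break
    return x

# Synthetic 2-D particle cloud (assumed data) centred on (5, -2)
rng = np.random.default_rng(0)
particles = rng.normal(loc=[5.0, -2.0], scale=0.5, size=(500, 2))
weights = np.full(500, 1.0 / 500)

mode = mean_shift_mode(particles, weights, start=[4.0, -1.0], bandwidth=0.5)
```

Each iteration evaluates the kernel at every particle, which is the kind of added cost that the abstract says is kept under control with sparse sampling.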

Item Type:	Article
Additional Information:	This work was supported by the Engineering and Physical Sciences Research Council [grant numbers: EP/K014307/1 and EP/L000539/1].
Research Area:	Research Areas > Computer science and informatics; Research Areas > Electrical and electronic engineering
Faculty, School or Research Centre:	Faculty of Science, Engineering and Computing; Faculty of Science, Engineering and Computing > School of Computer Science and Mathematics; Faculty of Science, Engineering and Computing (until 2017); Faculty of Science, Engineering and Computing (until 2017) > School of Computing and Information Systems
Date Deposited:	17 Jun 2019 10:38
Last Modified:	06 Oct 2020 15:38
DOI:	https://doi.org/10.1109/TMM.2016.2599150
URI:	https://eprints.kingston.ac.uk/id/eprint/43467
