Marginalised Stacked Denoising Autoencoders for Robust Representation of Real-Time Multi-View Action Recognition

Gu, Feng, Florez Revuelta, Francisco, Monekosso, Dorothy and Remagnino, Paolo (2015) Marginalised Stacked Denoising Autoencoders for Robust Representation of Real-Time Multi-View Action Recognition. Sensors, 15(7), pp. 17209-17231. ISSN (online) 1424-8220

Full text available as:
sensors-15-17209.pdf - Published Version
Available under License Creative Commons Attribution.

Multi-view action recognition has attracted great interest in video surveillance, human-computer interaction, and multimedia retrieval, where multiple cameras of different types are deployed to provide complementary fields of view. Fusing multiple camera views leads to more robust decisions when tracking multiple targets and analysing complex human activities, especially where there are occlusions. In this paper, we incorporate the marginalised stacked denoising autoencoders (mSDA) algorithm to improve the robustness and usefulness of the bag of words (BoWs) representation for multi-view action recognition. The resulting representations are fed into three simple fusion strategies as well as a multiple kernel learning algorithm at the classification stage. Based on the internal evaluation, the codebook size of BoWs and the number of layers of mSDA may not significantly affect recognition performance. According to results on three multi-view benchmark datasets, the proposed framework improves recognition performance on all three datasets and achieves record recognition performance, outperforming the state-of-the-art algorithms in the literature. It is also capable of performing real-time action recognition at a frame rate ranging from 33 to 45, which could be further improved by using more powerful machines in future applications.
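
As a rough illustration of how mSDA can refine BoWs histograms, the sketch below follows the commonly cited closed-form formulation of a marginalised denoising autoencoder layer, in which the reconstruction weights are solved analytically rather than by gradient descent. It is not the authors' implementation. The function names (mda_layer, msda), the corruption probability p, the small ridge term reg, and the choice to concatenate the input with every layer's output are assumptions made for this example.

```python
import numpy as np

def mda_layer(X, p, reg=1e-5):
    """One marginalised denoising autoencoder layer (closed form).

    X   : (d, n) array, one BoWs histogram per column (assumed layout).
    p   : feature corruption (dropout) probability, 0 <= p < 1.
    reg : small ridge term for numerical stability (an assumption).
    Returns the weights W of shape (d, d + 1) and hidden output H of shape (d, n).
    """
    d, n = X.shape
    Xb = np.vstack([X, np.ones((1, n))])       # append a constant bias feature
    S = Xb @ Xb.T                              # scatter matrix, (d + 1, d + 1)
    q = np.full(d + 1, 1.0 - p)                # survival probability per feature
    q[-1] = 1.0                                # the bias is never corrupted
    Q = S * np.outer(q, q)                     # expected corrupted scatter
    np.fill_diagonal(Q, q * np.diag(S))        # diagonal uses q_i, not q_i ** 2
    P = S[:d, :] * q                           # expected clean/corrupted cross term
    Q = Q + reg * np.eye(d + 1)
    W = np.linalg.solve(Q, P.T).T              # W = P Q^{-1}; Q is symmetric
    H = np.tanh(W @ Xb)                        # non-linear squashing
    return W, H

def msda(X, p, n_layers):
    """Stack n_layers layers and concatenate the input with every layer output."""
    feats = [X]
    H = X
    for _ in range(n_layers):
        _, H = mda_layer(H, p)
        feats.append(H)
    return np.vstack(feats)                    # ((n_layers + 1) * d, n)
```

For a codebook of size d and n video samples, msda(X, p=0.5, n_layers=2) would return a (3d, n) matrix that can then be passed to a classifier. The value of p here is arbitrary and would need tuning on held-out data.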

Item Type: Article
Additional Information: This work has been supported by the Ambient Assisted Living Joint Programme and Innovate UK under project “BREATHE - Platform for self-assessment and efficient management for informal caregivers” (AAL-JP-2012-5-045). Special Issue "Select Papers from UCAmI & IWAAL 2014 – The 8th International Conference on Ubiquitous Computing and Ambient Intelligence & the 6th International Workshop on Ambient Assisted Living (UCAmI & IWAAL 2014: Pervasive Sensing Solutions)"
Uncontrolled Keywords: deep learning, marginalised stacked denoising autoencoders, bag of words, multiple kernel learning, multi-view action recognition
Research Area: Computer science and informatics
Faculty, School or Research Centre: Faculty of Science, Engineering and Computing (until 2017)
Faculty of Science, Engineering and Computing (until 2017) > School of Computing and Information Systems
Depositing User: Francisco Florez Revuelta
Date Deposited: 27 Jul 2015 08:17
Last Modified: 06 Sep 2019 09:42
