Keep It Simple And Sparse: Real-Time Action Recognition
Year: 2013 Keywords: action recognition; robot vision; icub; dictionary learning
Authors: Sean Ryan Fanello, Ilaria Gori, Giorgio Metta, Francesca Odone  
Journal: Journal of Machine Learning Research Volume: 14
Pages: 2617-2640
Sparsity has been shown to be one of the most important properties for visual recognition. In this paper we show that sparse representation plays a fundamental role in achieving one-shot learning and real-time recognition of actions. We start from RGBD images, combine motion and appearance cues, and extract state-of-the-art features in a computationally efficient way. The proposed method relies on descriptors based on 3D Histograms of Scene Flow (3DHOFs) and Global Histograms of Oriented Gradient (GHOGs); adaptive sparse coding is applied to capture high-level patterns from data. We then propose simultaneous on-line video segmentation and recognition of actions using linear SVMs. The main contribution of the paper is an effective real-time system for one-shot action modeling and recognition; the paper highlights the effectiveness of sparse coding techniques for representing 3D actions. We obtain very good results on three different datasets: a benchmark dataset for one-shot action learning (the ChaLearn Gesture Dataset), an in-house dataset acquired with a Kinect sensor that includes complex actions and gestures differing by small details, and a dataset created for human-robot interaction purposes. Finally, we demonstrate that our system is also effective in a human-robot interaction setting and propose a memory game, “All Gestures You Can”, to be played against a humanoid robot.
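As a rough illustration of the sparse coding step mentioned in the abstract, the sketch below encodes one descriptor over a fixed dictionary by approximately minimising 0.5*||x - D a||^2 + lam*||a||_1. It is not the authors' implementation: the solver (ISTA, iterative soft-thresholding), the function names `soft_threshold` and `sparse_code`, the value of `lam`, and the toy identity dictionary are all assumptions made for illustration. The paper's adaptive sparse coding also learns the dictionary from data, which this sketch does not do.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (proximal operator of the L1 norm)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sparse_code(x, atoms, lam=0.1, n_iter=200):
    """Approximately minimise 0.5*||x - D a||^2 + lam*||a||_1 with ISTA.

    x     : descriptor, a list of n floats (hypothetical stand-in for a 3DHOF/GHOG vector)
    atoms : dictionary D given as a list of k atoms, each a list of n floats
    """
    # The squared Frobenius norm of D upper-bounds the Lipschitz constant
    # of the gradient, so 1 / lipschitz is a safe (conservative) step size.
    lipschitz = sum(v * v for atom in atoms for v in atom)
    if lipschitz == 0.0:
        return [0.0] * len(atoms)
    step = 1.0 / lipschitz
    code = [0.0] * len(atoms)
    for _ in range(n_iter):
        # Reconstruction D a and residual D a - x.
        recon = [sum(c * atom[i] for c, atom in zip(code, atoms))
                 for i in range(len(x))]
        residual = [r - xi for r, xi in zip(recon, x)]
        # Gradient D^T (D a - x), then a gradient step and soft-thresholding.
        grad = [sum(a_i * r_i for a_i, r_i in zip(atom, residual))
                for atom in atoms]
        code = [soft_threshold(c - step * g, step * lam)
                for c, g in zip(code, grad)]
    return code


# Toy usage: a 3-atom identity dictionary and a single 3-dimensional descriptor.
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
code = sparse_code([1.0, 0.05, -0.5], atoms, lam=0.1)
```

With an identity dictionary the minimiser is simply the soft-thresholded input, so the small middle coefficient is driven to zero while the two larger ones shrink by `lam`. In a pipeline like the one the abstract describes, codes of this kind would then be passed to a linear classifier.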