Conceiving Human Interaction by Visualising Depth Data of Head Pose Changes and Emotion Recognition via Facial Expressions

Processed frames regarding two DOFs (pitch and yaw), as shown by the main application window for estimating head pose changes.


Affective computing in general and human activity and intention analysis in particular comprise a rapidly-growing field of research. Head pose and emotion changes present serious challenges when applied to player’s training and ludology experience in serious games, or analysis of customer satisfaction regarding broadcast and web services, or monitoring a driver’s attention. Given the increasing prominence and utility of depth sensors, it is now feasible to perform large-scale collection of three-dimensional (3D) data for subsequent analysis. Discriminative random regression forests were selected in order to rapidly and accurately estimate head pose changes in an unconstrained environment. In order to complete the secondary process of recognising four universal dominant facial expressions (happiness, anger, sadness and surprise), emotion recognition via facial expressions (ERFE) was adopted. After that, a lightweight data exchange format (JavaScript Object Notation (JSON)) is employed, in order to manipulate the data extracted from the two aforementioned settings. Motivated by the need to generate comprehensible visual representations from different sets of data, in this paper, we introduce a system capable of monitoring human activity through head pose and emotion changes, utilising an affordable 3D sensing technology (Microsoft Kinect sensor).