Preparing a concert system which tracks performer felt emotional state through physiological sensors
Guest blog by Nick Collins
We're now in the late, intensive stages of building a new concert system that uses physiological sensors, working title 'BioCombat'. The idea is that Adinda and I will each wear three physiological sensors (skin conductance, heart rate and single-channel EEG), and that personalized classifiers can be built to track our emotional state live. There will be a 'battle of emotions' in which the two of us compete to best feel a target emotion (which changes over time and is dictated by the computer), with the winner at any given moment controlling the audiovisual output. Much of the graphics and audio side of the system is built, and the last few weeks of preparation will see integration with the live physiological tracking.
This blog post is about the creation of the classifier itself, a machine learning task. Adinda and I have been recording ourselves in eight different emotional states: "calm", "sad", "annoyed", "scared", "angry", "excited", "happy" and "tender". We've invoked these states in ourselves by listening to personally selected music examples that promote those emotions, and thinking about events in our lives relating to them. Now, I'm testing the classification machinery to see how well different emotional states can be distinguished from the physiological data.
The physiological sensor output isn't used directly; instead, ten derived features (statistics and signal characteristics) are extracted from the sensor data over windows of a few seconds (the window size can be varied and is one parameter to explore). I use SCMIR (http://composerprogrammer.com/code.html), an open source toolbox I've developed for the SuperCollider audio programming language. It works well for preparing machine listening and learning tasks like signal classification, and once trained, the classifiers can easily be deployed live in the concert.
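As a rough illustration of what windowed feature extraction looks like, here is a minimal Python sketch. It is not SCMIR code and does not reproduce the ten features used in the project: the function name `window_features`, the five statistics chosen and the non-overlapping window layout are all assumptions made for the example.

```python
import statistics

def window_features(samples, sample_rate_hz, window_seconds=2.0):
    """Summarise one sensor stream over consecutive, non-overlapping windows.

    Hypothetical helper: the statistics below are illustrative choices,
    not the feature set used in the project.
    """
    window_len = int(sample_rate_hz * window_seconds)
    if window_len < 2:
        raise ValueError("window must contain at least two samples")
    features = []
    # Step through the stream one full window at a time; a trailing
    # partial window is dropped.
    for start in range(0, len(samples) - window_len + 1, window_len):
        window = samples[start:start + window_len]
        features.append({
            "mean": statistics.fmean(window),
            "std": statistics.pstdev(window),
            "min": min(window),
            "max": max(window),
            # Average sample-to-sample change across the window.
            "mean_diff": (window[-1] - window[0]) / (window_len - 1),
        })
    return features

# Example: 8 samples at 2 Hz with 2-second windows gives two windows.
example = window_features([0, 1, 2, 3, 4, 5, 6, 7], sample_rate_hz=2)
```

Changing `window_seconds` trades responsiveness against stability: shorter windows react faster but give noisier statistics, which is why the window size is worth exploring as a parameter.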
The process has already been prototyped on some early data, using three different machine learning algorithms: a neural net, naive Bayes, and k-nearest neighbours classification. Data from each emotional state was labelled with that state; the task was to recover the correct state label, given the data. In each case, half the data formed a training set to prepare the algorithm, and its generalisation performance was tested on the remaining half (i.e. data it hadn't seen in training). A typical decent performance for the naive Bayes algorithm on two-second window data was around 90% success on the training data (where we'd expect it to do well) and at most 60% on unseen data. Chance is 1 in 8 (12.5%), so the classifier is definitely an improvement over guessing.
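The evaluation procedure described above (split the labelled data in half, train on one half, then compare accuracy on the seen and unseen halves against chance) can be sketched in plain Python. This is an illustrative stand-in, not the SCMIR implementation: the Gaussian naive Bayes variant, the helper names and the synthetic data are all assumptions, and the numbers it prints say nothing about the real physiological recordings.

```python
import math
import random
from collections import defaultdict

EMOTIONS = ["calm", "sad", "annoyed", "scared", "angry", "excited", "happy", "tender"]

def split_half(examples, seed=0):
    """Shuffle, then use the first half for training and the rest for testing."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]

def train_gaussian_nb(examples):
    """Fit a per-class log prior plus per-feature mean and variance."""
    by_label = defaultdict(list)
    for features, label in examples:
        by_label[label].append(features)
    model = {}
    for label, rows in by_label.items():
        columns = list(zip(*rows))
        means = [sum(col) / len(col) for col in columns]
        # A small floor on the variance avoids division by zero.
        variances = [max(sum((x - m) ** 2 for x in col) / len(col), 1e-6)
                     for col, m in zip(columns, means)]
        model[label] = (math.log(len(rows) / len(examples)), means, variances)
    return model

def predict(model, features):
    """Return the label with the highest log posterior."""
    def log_posterior(label):
        log_prior, means, variances = model[label]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
            for x, m, v in zip(features, means, variances))
    return max(model, key=log_posterior)

def accuracy(model, examples):
    return sum(predict(model, f) == label for f, label in examples) / len(examples)

def make_synthetic_data(per_class=40, n_features=10, seed=1):
    """Placeholder data: each class is a Gaussian blob with its own mean."""
    rng = random.Random(seed)
    return [([rng.gauss(index, 1.0) for _ in range(n_features)], label)
            for index, label in enumerate(EMOTIONS)
            for _ in range(per_class)]

data = make_synthetic_data()
train, test = split_half(data)
model = train_gaussian_nb(train)
chance = 1 / len(EMOTIONS)  # eight equally likely classes: 0.125
print(f"train {accuracy(model, train):.2f}, "
      f"test {accuracy(model, test):.2f}, chance {chance:.3f}")
```

The gap between the training and test figures is the quantity of interest: a large gap, like the 90% versus 60% reported above, indicates that the classifier fits its training windows more closely than it generalises.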
However, there is research (Picard et al. 2001, Kim and André 2008, van den Broek et al. 2009) which suggests that better performance is achievable on individual data (it would be a harder task still if the system had to generalise across people, since humans can differ substantially in their physiological baseline). The projects described in these papers use some different features from those we've chosen, and they aren't explicitly trying to build a concert system. In my experience, bringing such sensing and learning technologies to realtime concert use presents particular challenges over and above the purer laboratory research setting. Nonetheless, armed with additional data (collected in multiple sessions over multiple days), the task right now is to see if the classification results can be improved further. It remains a challenging situation: there is more data to train on but also more to test on (so the two may cancel each other out), and since the data spans multiple days, there will be within-person physiological variation from day to day to contend with. The hope, however, is that a more robust classifier can be built for the concert. It remains to be seen whether performance nerves themselves will skew the final performance situation!
Picard, R. W., Vyzas, E., & Healey, J. (2001). Toward machine emotional intelligence: Analysis of affective physiological state. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(10), 1175-1191.
Kim, J., & André, E. (2008). Emotion recognition based on physiological changes in music listening. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(12), 2067-2083.
van den Broek, E. L., Lisy, V., Westerink, J. H., Schut, M. H., & Tuinenbreijer, K. (2009). Biosignals as an advanced man-machine interface. Research report, IS15-IS24.