Toward Culturally Relevant Emotion
Detection Using Physiological Features
Khadija Zanna
Department of Computer Science and Engineering
University of South Florida
March 13th, 2020
Problem Statement
●
“Emotional distress is shown to have a statistically significant effect on grade
point averages and the intent to drop out.” (Mary E. Pritchard, 2003)
●
Large disparities in graduation rates between students of different cultural
backgrounds (as high as 25%) (Dina C. Maramba, 2012 and Doug Shapiro,
2017).
●
Some studies suggest similarities in emotional intelligence, stability, and
motivation amongst demographically similar groups.
Table of Contents
1. Introduction and Research Aims
2. Database Description
3. Feature Extraction
Introduction
●
Two intersecting areas of interest
○
Culturally-relevant emotion recognition
○
Physiological measurements of emotion
●
The use of physiological data for emotion detection has made significant gains in recent
years (A. Sano 2015, 2016, Rui Wang 2018).
●
Extensive exposure and interaction between cultural groups might distort findings,
especially those on in-group advantage. Such exposure is particularly likely among
students because of the culturally diverse nature of college campuses.
Research Aims
●
We explore the use of physiological signals for culturally-relevant emotion
recognition within the college student population.
●
We explore several types of physiological signals captured while
Database
● Several measurements of physiological responses collected by Zhang et al. at Binghamton University.
Database
Each subject experienced 10 emotion-inducing tasks:
● Happiness (amusement)
● Surprise
● Sadness
● Startle (surprise)
● Skepticism
● Embarrassment
● Fear (nervousness)
● Physical pain
● Anger
● Disgust
During each task, physiological data was collected using a system that captures vital sign signals (blood pressure, respiration rate, heart rate, and electrodermal activity).
● Sample rate of 1000 Hz.
Feature Extraction
● Each of the 8 signals was stored in a text file for each emotional stimulus (8 files for each of the 10 stimuli).
● Standardized all physiological data by removing the mean and scaling to unit variance: z = (x - μ) / σ, where μ and σ are the mean and standard deviation of the signal.
● Extracted 13 statistical features per text file: minimum (min), maximum (max), mean, variance, skewness, kurtosis, min of the derivative, max of the derivative, mean of the derivative, variance of the derivative, skewness of the derivative, kurtosis of the derivative, and root mean square (rms) of the derivative. The 13 features from one text file form a single sample.
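The standardization and feature-extraction steps above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the standardization is the usual z-score (division by the standard deviation), approximates the derivative with a first difference, and uses a synthetic signal in place of a real text file. The function names `standardize`, `basic_stats`, and `extract_features` are hypothetical.

```python
import numpy as np
from scipy.stats import kurtosis, skew


def standardize(signal):
    """Z-score a 1-D signal: subtract the mean, divide by the standard deviation."""
    signal = np.asarray(signal, dtype=float)
    return (signal - signal.mean()) / signal.std()


def basic_stats(x):
    """Min, max, mean, variance, skewness, and (excess) kurtosis of a 1-D array."""
    return [x.min(), x.max(), x.mean(), x.var(), skew(x), kurtosis(x)]


def extract_features(signal):
    """Return the 13 statistical features for one recording (one sample)."""
    z = standardize(signal)
    dz = np.diff(z)  # first difference as a discrete derivative (assumption)
    rms_dz = np.sqrt(np.mean(dz ** 2))
    return np.array(basic_stats(z) + basic_stats(dz) + [rms_dz])


# Synthetic stand-in for one text file: about 5 s of a noisy sine at 1000 Hz.
rng = np.random.default_rng(0)
t = np.arange(0, 5, 1 / 1000)
recording = np.sin(2 * np.pi * 1.2 * t) + 0.1 * rng.standard_normal(t.size)

features = extract_features(recording)
print(features.shape)
```

Because the signal is standardized first, the mean and variance features of the signal itself are always close to 0 and 1, so most of the discriminative information would come from the remaining features.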
Feature Extraction
● We annotated each sample with the respective
physiological signal, race, emotion, gender, and user_id.
● In addition, we used class combinations - race-emotion,
race-gender, and race-gender-emotion
Unsupervised Learning: Clustering
● DBSCAN - Samples near each other are clustered according to a distance metric and a minimum number of surrounding samples.
● Robust to outliers (noise points) and supports non-globular structures, unlike partitioning and hierarchical clustering methods. It separates high-density clusters from low-density clusters well and does not require the number of clusters to be specified in advance.
● eps values - 0.25, 0.50, 0.75, and 1.00
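A sweep over the four eps values might look like the sketch below, using scikit-learn's `DBSCAN`. The data is synthetic (two dense blobs plus one outlier), and `min_samples=5` and the Euclidean metric are assumptions, since the slides do not state them.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Synthetic stand-in for a feature matrix: two dense blobs plus one outlier.
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=0.0, scale=0.05, size=(30, 2))
blob_b = rng.normal(loc=3.0, scale=0.05, size=(30, 2))
outlier = np.array([[10.0, 10.0]])
X = np.vstack([blob_a, blob_b, outlier])

results = {}
for eps in (0.25, 0.50, 0.75, 1.00):
    # min_samples=5 is scikit-learn's default, not a value from the slides.
    labels = DBSCAN(eps=eps, min_samples=5, metric="euclidean").fit_predict(X)
    # DBSCAN labels noise points as -1; they do not count as a cluster.
    n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
    n_noise = int(np.sum(labels == -1))
    results[eps] = (n_clusters, n_noise)
    print(f"eps={eps:.2f}: {n_clusters} clusters, {n_noise} noise points")
```

The number of clusters is an output of the algorithm rather than an input, which is why the cluster count and the noise-point count can both serve as evaluation criteria.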
Unsupervised Learning: Clustering
● Total of 96 runs for every class (8 physiological signals * 4 eps values * 3 distance metrics); 672 experiments were evaluated in total across the 7 classes.
● Measured the performance of these experiments according to each class (e.g., whether a single partition consisted of data from one class).
● We evaluated each physiological signal separately during these experiments (e.g.,
Clustering Results
Evaluated resulting partitions by the number of clusters and noise points and some clustering metrics scores:
● Homogeneity (HOM) - Quantifies the extent to which each cluster contains only members of a single class
(1 - all clusters contain only members of a single class; 0 - clusters carry no information about class membership).
● Completeness (COM) - Quantifies the extent to which data points belonging to a given class are elements of the same
cluster.
● V-measure (VM) - Harmonic mean of homogeneity and completeness; measures how successfully both criteria have been
satisfied.
● Adjusted Rand Index (ARI) - Similarity measure between two clusterings that considers all pairs of samples
and counts pairs assigned to the same or different clusters in the predicted and true clusterings, adjusted for chance.
● Adjusted Mutual Information (AMI) - Measure of the agreement between two labelings of the same data, adjusted for chance.
● Silhouette Coefficient (SC) - Measures how similar an object is to its own cluster (cohesion) compared to
other clusters (separation).
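The six scores listed above are all available in `sklearn.metrics`. The sketch below computes them on a hand-made toy example, not on the study data: six points, two true classes, and a predicted clustering that splits the second class in two.

```python
import numpy as np
from sklearn import metrics

# Toy data: 6 samples, two true classes; the clustering splits class 1.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [9.0, 9.0]])
true_labels = [0, 0, 0, 1, 1, 1]
pred_labels = [0, 0, 0, 1, 1, 2]

scores = {
    "HOM": metrics.homogeneity_score(true_labels, pred_labels),
    "COM": metrics.completeness_score(true_labels, pred_labels),
    "VM": metrics.v_measure_score(true_labels, pred_labels),
    "ARI": metrics.adjusted_rand_score(true_labels, pred_labels),
    "AMI": metrics.adjusted_mutual_info_score(true_labels, pred_labels),
    # The silhouette coefficient needs the data itself, not the true labels.
    "SC": metrics.silhouette_score(X, pred_labels),
}
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

In this toy case every cluster is pure, so homogeneity is 1, while completeness is lower because one class is spread over two clusters. The first five scores compare a clustering against class labels; the silhouette coefficient is the only one computed from the feature space alone.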
Clustering Results
●
Used multidimensional scaling (MDS), a distance-preserving dimensionality reduction
technique, to reduce the data to three dimensions for visualizing clusters.
●
Extracted the 90th percentile (top 10%) of every clustering metric score for each class
evaluation.
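Both steps above can be sketched with scikit-learn and NumPy. The feature matrix and the metric scores here are random placeholders, and the variable names are hypothetical; only the three-dimensional target and the 90th-percentile cutoff come from the slides.

```python
import numpy as np
from sklearn.manifold import MDS

rng = np.random.default_rng(0)

# Placeholder feature matrix: 40 samples with 13 features each.
X = rng.standard_normal((40, 13))

# Reduce to three dimensions for plotting; MDS tries to preserve pairwise distances.
mds = MDS(n_components=3, random_state=0)
X_3d = mds.fit_transform(X)
print(X_3d.shape)

# Placeholder scores for one metric over the 96 runs of one class.
run_scores = rng.uniform(0.0, 1.0, size=96)

# Keep the runs at or above the 90th percentile (the top 10%).
threshold = np.percentile(run_scores, 90)
top_runs = np.flatnonzero(run_scores >= threshold)
print(len(top_runs), "runs kept")
```

The three columns of `X_3d` can then be passed to a 3-D scatter plot, colored by class label or by cluster label.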
Race+Gender+Emotion and User ID
Clustering Results
●
BP_mmHg and EDA_microseimens generally performed better than the other signals at clustering
race, producing the expected number of clusters and generally higher clustering
metric scores.
●
Respiration Rate_BPM clusters by race-emotion better than the other signals and shows up for
most of the classes with a high completeness score.
●
BP_mmHg performs better than the other signals for gender according to clustering metric scores, but
BP Dia_mmHg produced more visually distinct clusters.
Supervised Learning: Classification
●
Random Forest Classifier and Support Vector Machine (SVM)
●
Random Forest (n = 100) - High accuracy, handles large data sets with high
dimensionality well, has an effective method for estimating missing data and
maintains its accuracy.
●
SVM - High performance with little tuning needed, memory efficient.
●
Split data into training and testing data with test sizes of 25%, 33%, and 40%.
●
Trained classifier using all 13 features to classify the 7 classes.
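The classification setup above could be sketched as follows with scikit-learn. The data is synthetic, with a binary label standing in for one of the 7 classes; reading "n = 100" as `n_estimators=100` and using the default RBF kernel for the SVM are assumptions, since the slides do not specify them.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in: 200 samples x 13 features with a binary label.
rng = np.random.default_rng(0)
y = np.array([0] * 100 + [1] * 100)
# Shift class 1 so that the toy problem is learnable.
X = rng.standard_normal((200, 13)) + 1.5 * y[:, None]

accuracies = {}
for test_size in (0.25, 0.33, 0.40):
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=0)
    models = {
        # n_estimators=100 is an assumed reading of "n = 100".
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
        # Default RBF kernel; the slides do not state the kernel.
        "svm": SVC(),
    }
    for name, model in models.items():
        model.fit(X_train, y_train)
        accuracies[(name, test_size)] = model.score(X_test, y_test)

for (name, test_size), acc in accuracies.items():
    print(f"{name:13s} test_size={test_size:.2f} accuracy={acc:.3f}")
```

Repeating the split at three test sizes gives a rough sense of how sensitive each classifier is to the amount of training data.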
Confusion Matrix for Gender
●
82 females, 58 males
●
0 = Female
Conclusions
● Blood pressure and electrodermal activity generally yielded better clusters regarding race.
● Respiration rate and blood pressure provided better clusters for emotion and a combination of race and emotion.
● The results show that emotion and gender were classified better than the other classes.
● The confusion matrix for emotion gives us more insight into which emotions are better recognized using physiological data (sadness, startle), and which emotions are most often mistaken for each other (surprise and skepticism).
● Random forest was the better classifier for our set of physiological features.