Automated Emotional Valence Prediction in Mental Health Text via Deep Transfer Learning

Benjamin Shickel, Martin Heesacker, Sherry Benton, Parisa Rashidi

Abstract: Sentiment analysis is a well-researched field of machine learning and natural language processing generally concerned with determining the degree of positive or negative polarity in free text. Traditionally, such methods have focused on analyzing user opinions directed towards external entities such as products, news, or movies. However, less attention has been paid to understanding the sentiment of human emotion in the form of internalized thoughts and expressions of self-reflection. Given the rise of public social media platforms and private online therapy services, the opportunity for designing accurate tools to quantify emotional states is at an all-time high. Drawing on findings in psychological research, we propose a new type of sentiment analysis task more appropriate for assessing the valence of human emotion. Rather than assessing text on a single polarity axis ranging from positive to negative, we analyze self-expressive thoughts using a two-dimensional assignment scheme with four sentiment categories: positive, negative, both positive and negative, and neither positive nor negative. This work details the collection of a novel annotated dataset of real-world mental health therapy logs and compares several machine learning methodologies for the accurate classification of emotional valence. We found superior performance using deep transfer learning approaches; in particular, the best results were obtained using BERT (Bidirectional Encoder Representations from Transformers). These results indicate that transfer learning has the potential to greatly improve the accuracy of classifiers in the mental health domain, where labeled data is often scarce. Additionally, we argue that representing emotional sentiment on decoupled valence axes via four classification labels is an appropriate modification of traditional sentiment analysis for mental health tasks.
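
The four-category scheme described in the abstract can be read as two independent binary judgments, one for positive valence and one for negative valence, combined into a single label. The following is a minimal illustrative sketch of that mapping only; the names Valence and valence_label are hypothetical and are not taken from the paper, and the sketch says nothing about how the two judgments are obtained from text.

```python
from enum import Enum


class Valence(Enum):
    """Four sentiment categories from two decoupled valence axes (hypothetical names)."""
    POSITIVE = "positive"
    NEGATIVE = "negative"
    BOTH = "both positive and negative"
    NEITHER = "neither positive nor negative"


def valence_label(has_positive: bool, has_negative: bool) -> Valence:
    """Combine two independent binary valence judgments into one of four labels."""
    if has_positive and has_negative:
        return Valence.BOTH
    if has_positive:
        return Valence.POSITIVE
    if has_negative:
        return Valence.NEGATIVE
    return Valence.NEITHER


if __name__ == "__main__":
    # Enumerate every combination of the two axes and print the resulting label.
    for pos in (True, False):
        for neg in (True, False):
            print(pos, neg, valence_label(pos, neg).value)
```

Under this reading, a single polarity axis cannot express the "both" and "neither" cases separately, which is the motivation for treating the two axes as decoupled.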