In summary, the study by Eschrich and colleagues is consistent with the rest of the literature on emotion as a memory enhancer. The novel aspect of this study, however, is the finding that musical memory is strongly related to the rated attractiveness, and not to the experienced arousal, of the musical piece. Thus, emotion enhances not only memories for verbal or pictorial material, as summarized by Buchanan, but also memories for musical pieces. This study also provides additional support for the prominent role of music in building our autobiographical memories. Emotional music we have heard at specific periods of our life is strongly linked to our autobiographical memory and is thus closely involved in shaping our view of our own self. In this respect it is interesting to note that listening to music is accompanied not only by blood flow increases in brain areas known to be involved in generating and controlling emotions [12,13], but also by a general increase and change of brain activation within a distributed network comprising many brain areas and the peripheral nervous system [11,24-27]. Thus, listening to music (even when we listen passively) engages many psychological functions (emotion, memory, attention, imagery and so on) located in a distributed, overlapping brain network.
Abstract - This paper suggests the use of modeling techniques to tap into the emotion/cognition paradigm. We present two possible frameworks focusing on the embodied basis of emotions. The first explores the emergence of emotion mechanisms by establishing the primary conditions of survival and exploring the basic roots of emotional systems. These simulations show the emergence of a stable motivational system with emotional contexts resulting from the dynamical categorization of objects in the environment, in answer to survival pressures and homeostatic processes. The second framework uses music as a source of information about the mechanisms of emotion, and we propose a model based on recurrent connectionist architectures for the prediction of emotional states in response to music experience. Results demonstrate strong relationships between arousal reports and psychoacoustic features of the music, such as tempo and dynamics. Finally, we discuss future directions of research on emotions based on cognitive agents and mathematical models.
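As a rough illustration of the second framework, the sketch below runs an untrained Elman-style recurrent network over a sequence of psychoacoustic features to produce a continuous arousal estimate per time step. The two input features (per-second tempo and loudness), the layer size, and the random weights are illustrative assumptions, not the paper's actual model.

```python
import numpy as np

# Minimal Elman-style recurrent network mapping a sequence of psychoacoustic
# features (assumed here: standardized tempo and loudness per second) to a
# continuous arousal estimate. Weights are random; a real model is trained.
rng = np.random.default_rng(0)

n_in, n_hidden = 2, 12
W_ih = rng.normal(scale=0.1, size=(n_hidden, n_in))    # input -> hidden
W_hh = rng.normal(scale=0.1, size=(n_hidden, n_hidden))  # recurrent weights
W_ho = rng.normal(scale=0.1, size=(1, n_hidden))       # hidden -> arousal

def predict_arousal(features):
    """features: (T, 2) array of per-second [tempo, loudness] values.
    Returns a (T,) array of arousal estimates, one per time step."""
    h = np.zeros(n_hidden)
    out = []
    for x in features:
        h = np.tanh(W_ih @ x + W_hh @ h)  # update recurrent hidden state
        out.append((W_ho @ h).item())
    return np.array(out)

# 10 seconds of fabricated, standardized features
seq = rng.normal(size=(10, 2))
arousal = predict_arousal(seq)
print(arousal.shape)  # (10,)
```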
In 2008, a new domain-specific categorical emotional model called GEMS (Geneva Emotional Music Scale) was proposed (Zentner et al., 2008). GEMS is unique in that it addresses induced emotion, was created specifically for describing musical emotion, and has a level of granularity that other models do not provide. Zentner et al. conducted four consecutive studies to derive the model. First, a list of music-related terms was compiled for both induced and perceived emotion. This showed that the two types of emotion differ from each other, the major difference being a bias toward positive emotions in the case of induced emotions. In the following studies, the structure of music-induced emotions was examined through factor analysis of questionnaires. As a result, the GEMS scale was created. Through further factor analysis, shorter versions of the scale were added. The full GEMS scale consists of 45 terms, with shorter versions of 25 and 9 terms. The nine terms can in turn be grouped into three superfactors: vitality, sublimity and unease. Originally, the terms were collected in French and later translated into English. In 2012, a follow-up study was conducted to improve the short GEMS scale (Coutinho & Scherer, 2012). This research addressed the overrepresentation of classical music in the original work behind GEMS. The experiment confirmed the nine-factor structure of GEMS, and it was suggested to add new terms related to feelings of harmony, interest and boredom. The final results from that study are still unpublished, so we used the original short nine-term version of GEMS for our online game (see Table 1).
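The nine-term GEMS structure described above can be written down as a small lookup table. The factor-to-superfactor grouping below follows Zentner et al. (2008) as commonly reported; the exact term wordings may differ slightly from the published scale.

```python
# Nine GEMS factors grouped into three superfactors, following Zentner et
# al. (2008); term spellings are paraphrased, not quoted from the scale.
GEMS_9 = {
    "sublimity": ["wonder", "transcendence", "tenderness",
                  "nostalgia", "peacefulness"],
    "vitality": ["power", "joyful activation"],
    "unease": ["tension", "sadness"],
}

print(sorted(GEMS_9), sum(len(v) for v in GEMS_9.values()))
```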
a song, whether it is an excerpt or an entire song, e.g. “this song is happy”. On the other hand, music induces emotion when a person feels a certain emotion while or after listening to a song, e.g. the listener felt happiness. Research in this field is usually conducted by presenting different musical pieces to listeners and recording their emotional reactions. Rather than recording reactions in the listeners' own words, it is common to use standard ratings. Based on these concepts, one study demonstrated that listening to music that is pleasurable to the listener activates the same brain regions that are stimulated by euphoric stimuli such as food, sex or even drugs. Other authors, like Thompson, maintain that music has attributes such as “intensity (loudness), tempo, dissonance, and pitch height” that can influence emotional expression. These act as a code that, when used properly, serves as a tool for communicating emotion. To support this claim the author notes that “melodies that are played at a slow tempo tend to evoke emotions with low energy such as sadness, whereas melodies that are played at a fast tempo tend to evoke emotions with high energy, such as anger or joy”.
The main objective of this paper is to design an efficient and accurate algorithm that generates a playlist based on the current emotional state and behavior of the user. Face detection and facial feature extraction from an image is the first step in an emotion-based music player. For face detection to work effectively, the input image should be neither blurred nor tilted. We used an existing algorithm for face detection and facial feature extraction, and generated landmark points for the facial features. The next step is the classification of emotion, for which we used multiclass SVM classification. The generated landmark points are provided to the SVM for training. The emotion classified by the SVM is then passed to the music player, and music is played accordingly.
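The landmarks-to-multiclass-SVM step can be sketched as follows. Since the landmark extractor is not specified here, fabricated 136-dimensional vectors (68 (x, y) points flattened) stand in for real landmark features, and the four emotion labels are illustrative.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Fabricated landmark vectors: one well-separated Gaussian cluster per
# emotion so the example is self-contained and deterministic.
rng = np.random.default_rng(42)
emotions = ["happy", "sad", "angry", "neutral"]

X = np.vstack([rng.normal(loc=i, scale=0.5, size=(50, 136))
               for i in range(len(emotions))])
y = np.repeat(emotions, 50)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          random_state=0)
clf = SVC(kernel="rbf", decision_function_shape="ovr")  # multiclass SVM
clf.fit(X_tr, y_tr)
print(clf.score(X_te, y_te))
```

The predicted label (e.g. `clf.predict(landmarks)[0]`) is what would be handed to the music player.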
perception across auditory domains, namely gender and emotional intelligence, emotional stability, and musical training. We show that these factors play a role in continuous evaluations of emotion perception, with the potential to be integrated into future models. It is worth noting that it is not only the global model performance that is important, but also the errors at each moment in the stimuli. It is plausible to assume that higher errors in specific sections of the music or speech stimuli are at least partially due to a lack of information in the current model inputs and/or inherent limitations of the limited set of stimuli used. In the former case, this could indicate that other low-level features or, very likely, “higher-level” features (either stimulus-related or individual) would be necessary to describe the emotions perceived by subjects.
The model performance differs for each emotion: Joy, Anticipation, Trust, Disgust and Surprise achieve high correlation coefficients, which is an exciting result for Tamil music emotion research. In the listening experiment, volunteers reported that it is hard to identify Anger, Sadness and Fear in Tamil music. As a result, these three emotion words were not clearly perceived by the volunteers, and the vague emotion ratings introduce errors into the model building from the source data layer. We also observed that Joy and Trust were chosen for most of the songs provided. Self-organising learning could be a suitable method for the problem of imbalanced data classification. These results are valuable in that they characterize the emotional content of Tamil music and offer guidance for future research on the Tamil Music Dataset.
Here we propose an emotion-based music player (Emo Player). Emo Player is a music player which plays songs according to the emotion of the user. It aims to provide user-preferred music with emotion awareness, and is based on the idea of automating much of the interaction between the music player and its user. The emotions are recognized using a machine learning method, the Support Vector Machine (SVM) algorithm. In machine learning, support vector machines are supervised learning models with associated learning algorithms that analyse data for classification and regression analysis; an SVM finds an optimal boundary between the possible outputs. The training dataset we used is Olivetti faces, which contains 400 faces with their desired values or parameters. The webcam captures an image of the user, and the facial features are then extracted from the captured image. The training process involves initializing some random values in our model (say, for smiling and not smiling), predicting the output with those values, comparing the predictions with the labels, and then adjusting the values to reduce the mismatch. Evaluation allows testing of the model against data that has never been seen during training and is meant to be representative of how the model might perform in the real world. According to the recognized emotion, music is played from predefined directories.
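The final dispatch step, playing music from predefined directories according to the recognized emotion, might look like the following sketch. The directory names, track files and emotion labels are made up for illustration.

```python
import random

# Hypothetical mapping from a recognized emotion to a predefined music
# directory; these paths and labels are illustrative, not from the paper.
PLAYLISTS = {
    "smiling": ["happy/upbeat_1.mp3", "happy/upbeat_2.mp3"],
    "not_smiling": ["calm/slow_1.mp3", "calm/slow_2.mp3"],
}

def pick_track(emotion, rng=random):
    """Return one track path from the directory assigned to the emotion."""
    return rng.choice(PLAYLISTS[emotion])

track = pick_track("smiling")
print(track)
```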
In this paper we describe the ideas behind the Music and Emotion Driven Game Engine (M-EDGE), currently under development at the School of Interactive and Digital Media in Nanyang Polytechnic and fully supported by the Singapore National Research Foundation. The paper explains a possible method for analyzing emotional content in music in real time and how it can be applied to different game ideas to help define a new interactive experience and music-based gameplay in videogames.
Social tags. Among the various music listening platforms, Last.fm is probably the most popular with academics, because of the open API it provides for collecting and analyzing tags, metadata and the musical preferences of its users. Consequently, Last.fm user tags appear as an important research resource in many academic works. In general, users provide tags about songs and other online objects for several reasons; aiding future searches, expressing opinions and social exposure are some of the most important. One of the first studies to examine the type distribution of song tags is . According to this study, 68% of tags are related to song genre, 12% to locale, 5% are mood tags, 4% are about instrumentation and 4% express opinion. In , on the other hand, we find one of the first studies that specifically examined mood social tags. The authors report an unbalanced distribution of the emotion tag vocabulary, and also infer that many labels are interrelated or reveal different views of a common emotion category. In , the authors utilized AllMusic tags to create a ground-truth dataset of song emotions. They first used tags and their norms in ANEW to categorize each song into one of the four valence-arousal quadrants of Russell's model. Afterwards, three persons manually validated and improved the annotation quality. The resulting dataset has 771 songs and their corresponding emotion categories.
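The quadrant-assignment step described above can be sketched as a small function. ANEW norms run on a 1-9 scale, so 5 is taken as the midpoint; the quadrant labels below are illustrative glosses, not taken from the cited work.

```python
# Place a song's mean (valence, arousal) tag norms into one of Russell's
# four quadrants; ANEW norms use a 1-9 scale with midpoint 5.
def russell_quadrant(valence, arousal, midpoint=5.0):
    if valence >= midpoint and arousal >= midpoint:
        return "Q1: happy/excited"   # +valence, +arousal
    if valence < midpoint and arousal >= midpoint:
        return "Q2: angry/anxious"   # -valence, +arousal
    if valence < midpoint:
        return "Q3: sad/depressed"   # -valence, -arousal
    return "Q4: calm/content"        # +valence, -arousal

print(russell_quadrant(7.2, 6.8))  # Q1: happy/excited
```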
‘semantic domination’. Lavignac describes F major as a ‘bucolic’ and ‘pastoral’ key, no doubt because Beethoven chose it for his Sixth Symphony, the ‘Pastoral’. The tuning of instruments, making them easy or difficult to play in certain keys, also contributes to these kinds of associations. For instance, the most ‘pastoral’ of instruments, the horn, was best played in a key around F major prior to the invention of its valved version, hence Beethoven’s decision to name his Sixth Symphony in the way he did (Hewett 2010). Undoubtedly, the most important moment in the codification of emotion in Western music occurred during the Renaissance when composers of madrigals were formulating the tonal ‘language’ that would be used in the first operas. The Cuban music critic, Alejo Carpentier (1972) writes: ‘the madrigal composer managed to create a whole rhetoric of expression which consisted, for example, in causing the voices to rise when singing about heaven and to fall when singing about hell, in weaving “chains of love” with intervals of a third, or in writing in a silence for a “sigh” ’ (1097). This new musical ‘language’ was based on relations between tonic and dominant, that is to say between a chord constructed on the first note of the home key and a chord built on the fifth note of that key. A return to the tonic represented emotional and dramatic resolution, whereas moments of tension were signified by movement away from the tonic, especially when dissonance was employed, as in the work of Gesualdo. In the dramatic settings of early operas such as L’Euridice (1600) by Peri or Monteverdi’s Orfeo (1607), the singers’ physical gestures allowed for the emotional content of musical phrases to be rendered even more explicitly. Christopher Small (1998) argues that the type of iconic musical phrases just described did not represent emotional states per se but rather the kinds of physical gestures, both bodily and vocal, with which they were associated (148).
Once the connection with music was firmly established, audiences could understand the significance of such ‘musical
Emotion-based music retrieval is a music player driven by the user's emotion. Many music devices and mobile music players are used to listen to music, and a practical problem is the selection of desired music. Nowadays many devices are integrated with cameras, and this paper shows how to take advantage of these single-camera systems. In the proposed system, emotion is derived automatically from hand gestures captured by the camera. The data recorded from hand gestures are coupled with musical emotion, so users can search through a music collection based on the emotional character of music expressed through hand gestures. The main aim of this project is to design an emotion-based user interface for music retrieval. The algorithms were implemented in C++, using the Intel OpenCV library for image processing, on an ARM9 microcontroller. The camera is connected to the controller through a USB device, and the microcontroller is connected to a PC through serial communication. The controller processes the information and monitors the results as controlled songs on the PC.
In all four experiments, we computed 20 trials for each model, all with randomized initial weights in the range [-0.1, 0.1]. After preliminary tests, we settled on an architecture consisting of 2 hidden layers, each with 12 hidden units (with hyperbolic tangent activation functions in the case of the SRN, and LSTM memory blocks in the case of the LSTM network). The learning rate for all models and trials was 0.001 and the momentum 0.9. An early stopping strategy was used to avoid overfitting the training data: training was stopped after 20 iterations without improvement in the performance on the test set, and a maximum of 2000 total iterations of the learning algorithm was allowed. Each sequence (music piece or speech sample) in the various training sets was presented to the model in random order during training. The input (acoustic features) and output (emotion features) data were standardized to the mean and variance of the corresponding training sets used in each experiment and cross-validation fold of the intra-domain models.
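The early-stopping rule described above can be sketched schematically. `train_step` and `evaluate` below are hypothetical stand-ins for the actual SRN/LSTM training and test-set evaluation routines.

```python
# Stop after `patience` iterations without improvement on the held-out
# set, capped at `max_iters` iterations, as described in the text.
def train_with_early_stopping(train_step, evaluate,
                              patience=20, max_iters=2000):
    best_err, best_iter = float("inf"), 0
    for it in range(1, max_iters + 1):
        train_step()
        err = evaluate()
        if err < best_err:
            best_err, best_iter = err, it
        elif it - best_iter >= patience:
            break  # no improvement for `patience` consecutive iterations
    return best_err, it

# Toy run: error decreases for 50 iterations, then plateaus.
errs = iter([1.0 / (i + 1) for i in range(50)] + [0.02] * 1000)
best, stopped_at = train_with_early_stopping(lambda: None,
                                             lambda: next(errs))
print(best, stopped_at)
```

With the toy error sequence, the last improvement occurs at iteration 50, so training halts 20 iterations later.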
Though music has been found to be capable of altering a film's meaning, no attempt has been made to study the mechanism involved in the integration of emotion conveyed by music and visually complex material. We assume that integration of complex affective scenes and affective auditory input takes place later than integration of emotional faces and voices, because the affective content of the former is less salient and thereby requires more semantic analysis before its affective meaning can begin to be evaluated (31) (Spreckelmeyer 161). The answer to this puzzle may lie in the field of Gestalt psychology, with its emphasis on a holistic approach in which the brain has a self-organizing tendency and the whole is conceptualized as greater than the sum of its parts. As Linda Phyllis Austern points out:
Human emotion representation is still an open area of research, and music video emotion is specific to the various human emotions specified in . Individual audio or video emotion-based studies [6, 7] were conducted in past decades, and recently much research [4, 8] has focused on both audio and video emotion with early or late fusion. The primary limitations in this area of research are the lack of labeled data and the fact that data-driven methods have not been properly evaluated. In this research, our primary task is the introduction of a small music-video dataset that can attract new researchers to this area and help solve various problems related to music video in television data centers and online video banks.
The main objective of this work is to develop a music emotion recognition technique using Mel-frequency cepstral coefficients (MFCC), an autoassociative neural network (AANN) and a support vector machine (SVM). The emotions considered are anger, happiness, sadness, fear, and neutral. The music database is collected at 44.1 kHz with 16 bits per sample from various movies and music-related websites. For each emotion, 15 music signals are recorded, each of 15 s duration. The proposed music emotion recognition (MER) technique proceeds in two phases: i) feature extraction, and ii) classification. First, the music signal is given to the feature extraction phase to extract MFCC features. Second, the extracted features are given to the AANN and SVM classifiers to categorize the emotions, and finally their performance is compared. The experimental results show that MFCC with the AANN classifier achieves a recognition rate of about 94.4%, compared with about 85.0% for the SVM classifier, and thus outperforms the SVM classifier.
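The MFCC feature-extraction phase can be sketched end to end as follows: frame the signal, take the power spectrum, apply a triangular mel filterbank, take logs, then a DCT. The frame size, 26-filter mel bank and 13 retained coefficients are common default choices, not necessarily the parameters used in this work.

```python
import numpy as np
from scipy.fftpack import dct

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(signal, sr=44100, frame_len=1024, hop=512,
         n_filters=26, n_coeffs=13):
    # 1) Frame the signal and apply a Hamming window to each frame
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop: i * hop + frame_len]
                       for i in range(n_frames)])
    frames = frames * np.hamming(frame_len)

    # 2) Power spectrum of each frame
    spec = np.abs(np.fft.rfft(frames, axis=1)) ** 2

    # 3) Triangular mel filterbank between 0 Hz and the Nyquist frequency
    mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_filters + 2)
    bin_pts = np.floor((frame_len + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_filters, frame_len // 2 + 1))
    for j in range(1, n_filters + 1):
        l, c, r = bin_pts[j - 1], bin_pts[j], bin_pts[j + 1]
        fbank[j - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[j - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)

    # 4) Log mel energies, then DCT to decorrelate the coefficients
    energies = np.log(spec @ fbank.T + 1e-10)
    return dct(energies, type=2, axis=1, norm="ortho")[:, :n_coeffs]

# 1 second of noise standing in for a 44.1 kHz music excerpt
sig = np.random.default_rng(0).standard_normal(44100)
feats = mfcc(sig)
print(feats.shape)  # (85, 13): 85 frames, 13 coefficients each
```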
The dimensional representation offers a completely different solution: time-varying MER. Even though it is clearly not restricted to the dimensional approach (as has been shown by Liu [2006], who suggested that a music piece can be automatically segmented into segments of stable emotional content with static emotion recognition applied to each, and Schubert et al. [2012], who introduced an approach for continuous categorical emotion tagging), the time-varying categorical representation is inherently more difficult to use, especially in user studies. Even within dimensional emotion tracking, there are different ways of approaching the problem. Korhonen et al. [2006], Panda and Paiva [2011], Schmidt and Kim [2010a], Schmidt et al. [2010], and others have tried to infer the emotion label over an individual short time window. Another solution is to incorporate temporal information in the feature vector, either by using features extracted over varying window lengths for each second/sample [Schubert, 2004], or by using machine learning techniques adapted for sequential learning (e.g. the sequential stacking algorithm used by Carvalho and Chao [2005], or the Kalman filtering and Conditional Random Fields used by Schmidt and Kim [2010b, 2011a]). Interestingly, it has also been reported by Schmidt et al. [2010] and Panda and Paiva [2011] that taking the average of the time-varying emotion produces results that are statistically significantly better than simply performing emotion recognition on the whole piece of music.
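One simple way to realize the varying-window-length idea attributed to Schubert [2004] is sketched below: each per-second feature vector is replaced by its means over trailing windows of several lengths. The window lengths (1, 3 and 5 s) are illustrative choices, not taken from the cited work.

```python
import numpy as np

def multi_window_features(feats, windows=(1, 3, 5)):
    """feats: (T, D) per-second features.
    Returns (T, D * len(windows)): for each second, the feature means
    over trailing windows of each given length (clipped at the start)."""
    T, D = feats.shape
    out = np.zeros((T, D * len(windows)))
    for t in range(T):
        for k, w in enumerate(windows):
            seg = feats[max(0, t - w + 1): t + 1]  # trailing w-second window
            out[t, k * D:(k + 1) * D] = seg.mean(axis=0)
    return out

per_second = np.random.default_rng(1).normal(size=(30, 4))  # 30 s, 4 features
ctx = multi_window_features(per_second)
print(ctx.shape)  # (30, 12)
```

The 1-second window reproduces the raw features, so the original information is preserved alongside the smoothed context.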
In this study we explored the effect of combining EEG and audio features on the classification accuracy of trained machine learning models estimating emotional state from EEG data recorded with the Emotiv Epoc headset. The categorical model of emotion was adopted instead of the more widely used dimensional model. We used a film music dataset created to induce strong emotions, covering five basic emotions: fear, anger, happiness, tenderness and sadness. EEG data were obtained from listening tests of labeled music excerpts. We extracted EEG and audio features and then applied machine learning techniques to study the improvement in the emotion recognition task when using multimodal features. We evaluated learning curves of the monomodal and multimodal systems to study the behavior of the machine learning systems for different training and test sizes. Additionally, we evaluated feature selection methods and their effect on the classification accuracies.
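The multimodal setup can be sketched as early fusion: EEG and audio feature vectors for the same excerpts are concatenated before classification. The data below is synthetic, and the feature dimensions and SVC classifier are illustrative assumptions, not those of the study.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic stand-ins for per-excerpt feature vectors; each modality is
# made informative about the five emotion classes via a mean shift.
rng = np.random.default_rng(7)
n = 100
y = rng.integers(0, 5, size=n)                       # five emotion classes

eeg = rng.normal(size=(n, 20)) + 0.5 * y[:, None]    # "EEG" features
audio = rng.normal(size=(n, 30)) + 0.5 * y[:, None]  # "audio" features

fused = np.hstack([eeg, audio])                      # early fusion

accs = {}
for name, X in [("eeg", eeg), ("audio", audio), ("fused", fused)]:
    accs[name] = cross_val_score(SVC(), X, y, cv=5).mean()
    print(name, round(accs[name], 3))
```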
The effort of distinguishing between different classes of affective states, and further identifying different types of emotion, does not imply that we postulate sharp and reliable boundaries between these classes. On the contrary, we expect that there are very fuzzy boundaries, and we have to consider this domain as consisting of a multi-dimensional space rather than one of sharply defined categories. In addition, the meanings of the respective emotion words for the respective classes (see Table 1) are difficult to define and are multiply interrelated (see Fontaine, Scherer, & Soriano, in press, on the semantics of emotion terms). We also acknowledge that music can produce various types of emotions. Nevertheless, as past work in this area has almost exclusively focused on a small set of basic emotions (applied indiscriminately in music studies), we highlight here the preponderance of those emotions that are aesthetic (and also epistemic) in character (i.e., in cases where appreciation of intrinsic qualities is a determining factor). Aesthetic and epistemic emotional responses to music are in urgent need of further examination, because understanding the nature of the wide variety of emotional responses to music is fundamental to establishing music's relevance for the individual and asserting its power to elicit emotions. We do not exclude, however, that music may in certain circumstances produce utilitarian emotions. Music might produce sadness, fear, and anger in certain cases, but one cannot be sure that these have the same quality as the sadness, fear, and anger produced by the loss of a loved one, a bear coming towards us, or being insulted. Nor do we consider aesthetic emotions to be limited to music (they are equally found in the visual and dramatic arts).
In fact, there is empirical evidence that emotions relating to nostalgia, love, wonder, and transcendence are experienced equally often in nonmusical everyday life contexts compared with music contexts (Zentner et al. 2008, p. 515).
Regarding the musical emotion recognition test, we first asked 84 young people to identify the emotion evoked by 12 extracts of musical pieces (4 pieces per emotion: joy, fear or sadness), in order to choose the extracts whose emotion was clearly recognized by more than 90% of them. For this, we used the following open question, allowing them to freely communicate the feelings evoked by each musical piece: “How do you feel when listening to these music excerpts?”. For each musical extract we noted the words it inspired, and when the emotion was clearly recognized, the notation was joy, fear or sadness. Thus, we could select only 6 musical extracts according to the three primary emotions (2 extracts per emotion): (a) joy (“The Four Seasons - Spring” by Vivaldi and “Folies Bergère” by Paul Lincke, recognized as joy by 100% of the young group), (b) sadness (“The Funeral March” by Chopin and “The Dispute” by Yann Tiersen: 90% and 97%, respectively) and (c) fear (the “Psycho” theme by Bernard Herrmann and the “Jaws” theme by John Williams: 97%). After that, we used those six musical extracts in the musical emotion recognition test. All participant groups received the following instruction for each musical extract: “Choose, on the sheet, the emotion it makes you think of: joy, sadness or fear”. The maximum score was 6 points (one point per correct answer).