Some studies have shown that light can make us feel happy and calm, and can put people undergoing treatment for emotional and behavioral disorders in a better mood.
Abstract
In the last decade, societal pressure on individuals has intensified, leading to a range of stressors and instability that can result in significant physical and mental health issues, ultimately diminishing quality of life. High school students face immense pressure from academic scores, college entrance exams, and scholarship competitions, while graduates strive to secure good jobs and establish their societal status. In the workplace, the rush to meet deadlines, achieve salary goals, and attain promotions adds to the stress. Even within the family, which is typically a source of comfort, individuals grapple with the expectations of loved ones. Additionally, personal pressures to improve oneself can lead to feelings of inadequacy. Social media platforms like Facebook, Twitter, and Zalo contribute to this emotional turmoil, often exacerbating feelings of sadness, loneliness, and disconnection in a virtual world.
Research from the University of Michigan in 2015 indicates a direct link between boredom and frustration in youth and their time spent on social networks. Similarly, a Korean study dubbed "Internet Paradise" highlights that many online gaming addicts experience various pressures, using games as an escape from reality. These factors contribute to significant psychological and emotional issues. If left unaddressed, such emotions can escalate into serious problems, including sleep disturbances, suicidal thoughts, irritability, anger, and negative mental health conditions like bipolar disorder, personality disorders, schizophrenia, memory loss, and dementia.
The World Health Organization (WHO) reports that over 300 million people globally, representing approximately 4.4% of the population, are affected by mental disorders, with depression being the most prevalent. This condition contributes to more than 800,000 suicides annually, making it the second most critical group of diseases after cardiovascular issues in 2020.
Depression severely impacts life in Vietnam as well: according to the World Health Organization, over 3.5 million Vietnamese individuals face mental disorders such as depression and bipolar disorder. Tragically, approximately 40,000 of these individuals take their own lives each year.
Fig 1.1: Rate of depression in Vietnam (about 40,000 suicides due to depression each year; source: National Institute of Mental Health, Bach Mai Hospital) [20]
Numerous psychological studies indicate that music and light play a crucial role in addressing human emotional challenges and negative psychological states, serving as vital connections between individuals.
Light plays a crucial role in reflecting various facets of reality by delving into our inner world and expressing human emotions. Research has shown that light therapy can significantly benefit individuals suffering from depression, anxiety, and work-related stress. Additionally, music therapy has been found to alleviate muscle tension, boost self-esteem, reduce anxiety, enhance interpersonal relationships, and promote safe emotional expression.
Emotional regulation is crucial for overall well-being, and the combination of rhythm and melody in music therapy engages the patient's senses, promoting relaxation and stabilizing breathing and heart rate. This approach helps alleviate stress from various sources. To care for individuals with psychological and emotional disorders, however, innovative solutions still need to be designed and implemented. A system that assists users in regulating their emotions is therefore necessary.
Objectives of the topic
Significance of the topic
Engaging in positive musical experiences can significantly enhance both physical and mental health. Repetitive, improvised, and easily accessible music allows patients to develop various skills, including perceptual, cognitive, motor, social, and emotional abilities. In psychology, music therapy has been shown to calm the mind, alleviate pain, reduce stress, and support cancer treatment. Additionally, it helps stabilize heart rate, regulate blood pressure, and balance breathing rates, ultimately leading to improved emotional stability and overall well-being for patients.
THE THEORY
2.1 Human emotions through facial expressions
Happiness
Happiness is a desired emotional state that many strive to achieve, as it fosters comfort and relaxation while alleviating stress.
A genuine smile is characterized by raised cheekbones and distinctive dimples, with the corners of the lips lifted and an open mouth revealing teeth. The cheeks may appear slightly elevated, and wrinkles can form at the eyelids and corners of the eyes. When expressing happiness, the body exhibits increased flexibility, and the voice may sound cheerful, gentle, or even elevated in volume.
Sadness
Sadness encompasses feelings of grief, disappointment, and depression, often leading to a diminished interest in life. Everyone experiences moments of sadness, which can stem from various events. While some instances of sadness are temporary, others may persist longer, impacting overall well-being.
Persistent sadness can escalate into severe depression, presenting in various forms such as apathy, indifference, and unusual silence. Individuals may exhibit signs like boredom, melancholy expressions, and self-isolation, often retreating into their rooms and experiencing crying spells. The intensity of sadness varies based on its root causes and individual coping mechanisms, potentially leading to isolation from society and fostering negative thoughts and self-harming behaviors.
This emotion is expressed through facial muscles losing tension: the inner eyebrows are raised, the mouth and eyes droop, and the forehead wrinkles, so the face clearly shows sadness and depression, often accompanied by a sigh.
Disgust
Disgust is a fairly common emotion expressed in many different forms:
- Body language: turning away from the object of disgust.
- Physical reactions: vomiting, indigestion.
- Facial expression: wrinkled nose and curled upper lip.
Disgust can arise from various stimuli, including specific smells, images, and scenes. Researchers suggest that this emotion may have evolved as a protective response to harmful or potentially dangerous foods. For instance, encountering the smell or taste of spoiled food often triggers strong feelings of disgust, accompanied by characteristic reactions.
This response aims to guide individuals in avoiding behaviors that could lead to infectious diseases. Additionally, witnessing immoral actions can evoke feelings of disgust, leaving observers feeling uncomfortable or disturbed.
Fear
Fear is a vital emotion that significantly influences human survival by triggering the acute stress response when confronted with danger. This response leads to increased heart and breathing rates, muscle tension, and heightened alertness, preparing the body to either flee from or confront threats. While some individuals may experience panic and heightened sensitivity, others demonstrate resilience, and some actively seek out fear through thrilling activities and extreme sports.
The emotion of fear is manifested by features such as wide eyes, a retracted chin, attempts to hide from or deny the threatening event, rapid breathing, a rapid heartbeat, sweating, and trembling.
Anger
Anger is a powerful emotion characterized by agitation, hostility, and frustration towards specific situations or objects. Like fear, it serves as a fight-or-flight response in the body, highlighting its significance in our emotional landscape.
When faced with a looming threat, we often experience anger, prompting a readiness to confront challenges in order to protect ourselves and others. This anger can manifest in various ways, including scowling, glaring, harsh speech, and physical reactions like shouting, blushing, sweating, or even aggressive actions such as throwing objects or striking out.
Anger is often perceived as a negative emotion, but it can also serve a constructive purpose by helping us identify our relationship needs and motivating us to address significant issues. However, when anger is expressed intensely, it becomes unhealthy and can lead to dangerous and harmful consequences.
Uncontrollable anger can escalate into violent and aggressive behaviors, resulting in significant mental and physical harm. This intense emotion not only impairs rational decision-making but can also disrupt and delay important plans.
Surprise
Surprise is a fundamental emotion experienced by nearly everyone, though it is typically fleeting. When surprised, individuals may exhibit various physical reactions, including widened eyes, raised eyebrows, open mouths, jumping, screaming, or remaining still. Experts suggest that surprise can elicit a hit-or-miss response, as the body releases adrenaline to prepare for a fight-or-flight reaction.
Surprising emotions can be categorized as positive, negative, or neutral, depending on the circumstances. Positive surprises include unexpected visits from friends or surprise birthday celebrations, which can bring joy and excitement. Conversely, negative surprises, such as being startled by someone unexpectedly at night, can evoke fear or discomfort.
Surprise significantly influences human behavior, as research indicates that individuals are naturally drawn to unexpected events and situations. This inherent interest in surprises explains why they capture widespread attention and stand out in social contexts.
2.2 Some factors that affect emotions
Emotions can often feel uncontrollable, leading us to question whether feelings like sadness, anger, or surprise are truly natural or influenced by external factors. Our moods are consistently shaped by a variety of influences, highlighting the complexity of human emotions.
Research indicates a connection between weather and negative emotions, highlighting that sunlight directly influences fatigue levels. Exposure to natural light boosts Vitamin D3 production, which in turn affects serotonin, the hormone responsible for mood regulation. Consequently, seasonal depression is more prevalent during the darker months when serotonin levels tend to drop.
Social media stimulates the brain's reward system by triggering the release of dopamine, the chemical linked to pleasurable experiences such as eating or sex. This mechanism is a primary factor in social media addiction. A 2018 study from the UK revealed that excessive use of social networks can negatively impact sleep, resulting in increased anxiety, depression, and diminished memory and learning capabilities.
- Sleep quality: lack of sleep has a significant effect on people's moods. Subjects who slept only 4.5 hours a night reported feeling stressed, angry, sad, and mentally exhausted after a week. Trouble sleeping is sometimes the first symptom of depression. Studies have found that 15% to 20% of people diagnosed with insomnia will experience major depression.
Individuals with personality disorders experience elevated cortisol levels, which can lead to disruptions in the Hypothalamic-Pituitary-Adrenal (HPA) axis, resulting in significant mood swings.
Research indicates that the attractiveness of a space is significantly influenced by light: familiar lighting provides comfort but can lead to boredom, while unfamiliar lighting can spark curiosity despite initial discomfort. Our eyes naturally gravitate towards the brightest areas, and the interplay of light, mood, and human emotion encompasses visual perception, physical response, and psychological impact. For instance, an appropriate red light can enhance happiness, while green light can soothe feelings of depression. Light and color not only express moods but also fulfill psychological needs, as supported by modern art and medical theories highlighting their physiological effects. Different colors have distinct impacts on human physiology: red stimulates the nervous system and boosts circulation, green aids digestion and promotes calmness, blue lowers pulse rates and balances the body, while purple may depress motor nerve and cardiac function. The advanced integration of LED technology reflects a profound understanding of optical theory and color theory, thus creating a mode of "emotional illumination".
Listening to a diverse range of music significantly enhances focus, regulates mood during stressful times, and elevates emotions when feeling down. Music serves as a powerful stress reliever, fostering relaxation and mental clarity. It provides an escape from stress and aids in approaching challenges with a fresh perspective. Studies indicate that even classic rock from the 1960s and 1970s can effectively reduce blood pressure, particularly for those experiencing stress-related hypertension. Additionally, music has been shown to decrease cortisol levels, the hormone associated with stress, regardless of individual musical preferences.
2.3 Light and human emotion
Light is essential for life on Earth, serving as the primary source of energy for all living organisms. Plants rely on light to synthesize nutrients through chlorophyll, playing a crucial role in purifying the air we breathe. In turn, animals and humans depend on plants for food, highlighting the interconnectedness of life. Without light, life as we know it would not exist, underscoring the vital importance of light for plants, animals, and humans alike.
2.3.1 How does light affect humans?
The idea that light affects mood and behavioral state is not new. In the paper "How Does Light Regulate Mood and Behavioral State?", light is shown to play a crucial role in regulating mood and behavioral states by influencing various physiological processes. It affects the pupillary light reflex, regulates sleep propensity, and enhances human alertness, all of which are essential for maintaining a healthy circadian rhythm. The impact of light on both physiology and behavior is significant, highlighting its importance in our daily lives.
Circadian rhythm is a natural internal process that regulates the sleep-wake cycle, repeating every 24 hours. It plays a crucial role in hormone release, eating habits, digestion, and body temperature. This biological clock aligns mammalian physiology and behavior with the 24-hour light-dark cycle. Disruptions, such as exposure to light at night, can shift this rhythm, negatively impacting physiological and behavioral states and potentially leading to mood disorders.
[Figure: the 24-hour circadian clock, marking events such as highest alertness (10:00), noon (12:00), best coordination (14:30), fastest reaction time, greatest cardiovascular efficiency and muscle strength, sharpest blood pressure rise, highest body temperature, and the point at which secretion stops.]
Intrinsically photosensitive retinal ganglion cells (ipRGCs) are a specialized group of retinal ganglion cells that serve as the final output neurons of the vertebrate retina. These cells play a crucial role in regulating mood and behavioral states in response to light, complementing their established functions in non-visual responses such as the pupillary light reflex and circadian photoentrainment.
ipRGCs are directly linked to the suprachiasmatic nucleus (SCN), a small but crucial area in the human brain that regulates circadian rhythms and acts as a master clock, synchronizing all biological clocks within the body. Additionally, the SCN connects to the mood regulation center, influencing both mood and behavioral states.
Circadian rhythms may thus influence mood and behavior through connections between the SCN and mood-regulating centers. Additionally, light exposure can impact human mood and behavior by regulating these circadian rhythms.
2.3.2 Regulating human emotion based on light
Based on the paper "Promises and problems with the circumplex model of emotion", we know that there are six basic human emotions: surprise, happiness, sadness, disgust, anger, and fear.
The study titled “Led Strip for Color and Illumination-Based Emotion Regulation at Home” investigates how different lighting affects human emotions Conducted with participants aged 18 to 55, the research explores the relationship between light color and intensity and emotional responses, highlighting the potential of LED lighting to regulate feelings in a home environment.
A 25-square-meter room was prepared for an experiment in which lights in red, orange, yellow, pink, green, blue, violet, and white were illuminated sequentially for 10 seconds each. Participants then completed a survey to provide feedback on their experience.
[Figure: the post-experiment survey form, asking "Does the light make you feel one of these emotions? Please mark the intensity of your feeling for each of the emotions below", with a scale from "Not at all" to "Extremely" for each emotion, e.g. Joy.]
Based on the results of this experiment, a diagram for shifting human emotion can be constructed.
[Figure: circumplex diagram of affect mapping light colors (e.g. yellow, light blue) and music parameters (tempo = 90, 120, 150 bpm; rhythmic units of whole, half, and eighth notes) onto the unpleasant-pleasant and activation axes.]
Fig 2.12: Diagram for changing emotion [1]
The six basic human emotions can be categorized into four groups: Activated Unpleasant Affect, Unactivated Unpleasant Affect, Activated Pleasant Affect, and Unactivated Pleasant Affect. Among these, Unactivated Pleasant Affect is considered the healthy emotional state, highlighting the importance of maintaining a pleasant disposition whenever possible.
- If the initial state is Activated Unpleasant Affect, the LED strip is changed to Blue, Light Blue, or Purple, and then to Pink, Green, or Yellow, to reach the healthy affective state.
- If the initial state is Unactivated Unpleasant Affect, the LED strip is changed to Pink, Green, or Yellow to reach the healthy affective state.
- If the initial state is Activated Pleasant Affect, the LED strip is changed to Blue and then to Yellow.
These rules can be expressed as a small lookup table, as sketched below.
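As an illustration only, here is a minimal Python sketch of the rules above as a color-script lookup table. The state names and color sequences paraphrase this section; the real system's timings and exact shades may differ.

```python
# Illustrative mapping from the four affect quadrants to LED color scripts.
# Target state: Unactivated Pleasant Affect (an empty script means nothing to change).
LIGHT_SCRIPTS = {
    # Calm the agitated negative state first, then lift it.
    "activated_unpleasant": ["blue", "light_blue", "purple", "pink", "green", "yellow"],
    # Lift the low-energy negative state directly.
    "unactivated_unpleasant": ["pink", "green", "yellow"],
    # Gently settle the high-energy positive state.
    "activated_pleasant": ["blue", "yellow"],
    # Already the healthy target state.
    "unactivated_pleasant": [],
}

def script_for(state: str) -> list:
    """Return the sequence of LED colors to play for an affective state."""
    return LIGHT_SCRIPTS.get(state, [])

print(script_for("activated_pleasant"))  # ['blue', 'yellow']
```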
Emotions and changes in heart rate
Heart rate serves as a valuable indicator of an individual's capacity to recognize emotions and navigate social interactions However, it's essential to note that heart rate should not be viewed as a consistent emotional marker, as its fluctuations may not always correlate with emotional states.
While data gathered over time can offer insight into the overall response to a stimulus, it fails to accurately capture what an individual feels at each moment during that exposure.
To test how emotions impact heart rate, a survey was conducted with fifty participants aged 22 to 26 covering seven different emotional states: happiness, sadness, surprise, normal, fear, anger, and disgust, as shown in Table 2.1.
Table 2.1: Emotional heart rate statistics
SYSTEM ANALYSIS AND DESIGN
3.1 Facial expression recognition
Every year, numerous intelligent systems are developed to address face analysis challenges, including age detection, gender identification, ethnicity recognition, and emotion prediction. The field sees the publication of thousands of research papers, with notable contributions like "Age and Gender Classification using Convolutional Neural Networks", presented at the IEEE Workshop on Analysis and Modeling of Faces and Gestures (AMFG) during the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) in Boston, 2015. Utilizing the FG-NET dataset and advanced deep convolutional neural networks (CNN), researchers have achieved significant improvements in age and gender classification performance.
Recognizing human emotions through facial expressions is a challenging task, as it can be difficult to differentiate between feelings like fear, surprise, sadness, or neutrality just by looking at someone's face. Current methods for emotion detection, such as those outlined in recent research papers, often require significant hardware resources, making them unsuitable for mobile devices and embedded systems like the Jetson Nano or Raspberry Pi. However, a promising solution has emerged in the form of a study focused on facial expression and attribute recognition using multi-task learning with lightweight neural networks. This approach, which utilizes MobileNet and the AffectNet dataset, has achieved state-of-the-art results in emotion classification while being efficient enough for practical applications.
3.1.1 Convolutional Neural Networks (CNN)
A Convolutional Neural Network (CNN) is a deep learning model and a class of deep learning neural networks. It is used in computer vision applications such as image classification. With three layers (an input layer, hidden layers, and an output layer), a CNN can take an image as input, assign learnable weights and biases to several objects in the image, and differentiate one from the other.
Fig 3.1: A regular 3-layer neural network [6]
Neural Networks process input data through multiple hidden layers, where each layer consists of independent neurons fully connected to the previous layer The final layer, known as the output layer, generates class scores in classification tasks.
3.1.2 Basic layers of a CNN
The previous section described the simple architecture of a CNN; this section discusses it in more detail. As noted, a CNN has three layers: an input layer, hidden layers, and an output layer. To build this architecture, three main types of layers are used: the convolutional layer, the pooling layer, and the fully connected layer.
(Source: Slide 12, Introduction to Convolutional Neural Networks, Stanford University, 2018)
3.1.3 Convolutional layer
The convolutional layer is the foundational component of CNNs, responsible for feature extraction from input images. By applying a filter to the image matrix through a mathematical operation, it generates a feature map that highlights essential characteristics of the image.
In CNNs, filters hold the weights applied to the pixel values of an input image and are optimized through back-propagation during training. Various convolutional filters are used to extract distinct features, including curves, edges, and colors, and it is the combination of these feature maps that drives the predictions a CNN makes.
Fig 3.3: Example of feature extraction: edge detection [6]
The kernel is slid over the input image, and the resulting feature map depends on the kernel's weights, as the sketch below illustrates.
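To make the sliding-kernel operation concrete, here is a minimal NumPy sketch (an illustration, not the system's actual implementation) that convolves a grayscale image with a classic vertical-edge kernel:

```python
import numpy as np

def convolve2d(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide the kernel over the image (valid padding, stride 1)."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Element-wise multiply the kernel with the patch and sum.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A classic vertical-edge (Sobel) kernel; different weights extract
# different features, which is what a convolutional layer learns.
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])
image = np.random.rand(48, 48)       # a FER-2013-sized grayscale face
feature_map = convolve2d(image, sobel_x)
print(feature_map.shape)             # (46, 46)
```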
3.1.4 Pooling
The pooling layer is usually inserted between convolutional layers. Its main role is to reduce the computational complexity required to process the huge volume of data linked to an image by applying non-linear down-sampling to the convolved features. The results of this operation are usually called activation maps.
The pooling layer comes in two primary types: Max Pooling, which extracts the maximum value from the region of the image defined by the Pooling Kernel, and Average Pooling, which calculates the average of the values within that same area.
Max Pooling and Average Pooling differ primarily in how they handle information from pooled feature maps; Max Pooling discards less significant data, retaining only the maximum value, whereas Average Pooling preserves more information by calculating the average of the pooled values.
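A small NumPy sketch of both pooling types, using an assumed 4x4 activation map and a 2x2 non-overlapping kernel:

```python
import numpy as np

def pool2d(x: np.ndarray, size: int = 2, mode: str = "max") -> np.ndarray:
    """Non-overlapping pooling with a size x size kernel."""
    h, w = x.shape[0] // size, x.shape[1] // size
    # Reshape so each size x size block becomes its own pair of axes.
    blocks = x[:h * size, :w * size].reshape(h, size, w, size)
    return blocks.max(axis=(1, 3)) if mode == "max" else blocks.mean(axis=(1, 3))

x = np.array([[1., 3., 2., 4.],
              [5., 7., 6., 8.],
              [1., 2., 3., 4.],
              [5., 6., 7., 8.]])
print(pool2d(x, 2, "max"))   # [[7. 8.] [6. 8.]]  keeps only the strongest activations
print(pool2d(x, 2, "mean"))  # [[4.  5. ] [3.5 5.5]]  averages, preserving more information
```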
3.2 Data
3.2.1 Data overview
Over the last decade, affective computing has developed faster than ever. In addition, the growing number of cameras and the increasing power of hardware resources have led to the creation of several datasets for human emotion classification. Some famous ones are FER2013, EmotioNet, AffectNet, and CK+.
An emotion recognition dataset comprises images or videos of human faces, each labeled to reflect the corresponding emotional state. These datasets are primarily utilized for training, testing, or validating machine learning and deep learning models. Most of them are developed based on the foundational theories of Paul Ekman and Armindo Freitas-Magalhaes, which categorize basic emotions such as anger, fear, disgust, surprise, happiness, and sadness.
Name of dataset    Number of samples              Emotion
CK+                593 images                     6 basic emotions + neutral
JAFFE              213 images                     6 basic emotions + neutral
MMI                740 images, 2,900 videos       6 basic emotions + neutral
FER-2013           35,887 images                  6 basic emotions + neutral
AFEW 7.0           1,809 videos                   6 basic emotions + neutral
SFEW 2.0           1,766 images                   6 basic emotions + neutral
Oulu-CASIA         2,880 image sequences          6 basic emotions + neutral
AffectNet          450,000 images                 6 basic emotions + neutral
3.2.2 FER-2013
The main dataset in this thesis is FER-2013; it is used to train, test, and validate the model. This dataset was prepared by Pierre-Luc Carrier and Aaron Courville and published in Challenges in Representation Learning: Facial Expression Recognition Challenge [10], held by Kaggle.
The FER-2013 dataset comprises 35,887 images, each measuring 48x48 pixels, with 28,709 images designated for model training and 3,859 images reserved for testing. The images are categorized into seven emotional classes: Angry (0), Disgust (1), Fear (2), Happy (3), Sad (4), Surprise (5), and Neutral (6). A significant challenge of this dataset is the imbalanced distribution of images across the classes, with Happy being the most represented class at 7,215 images, while Disgust has the fewest at just 436 images, a disparity of roughly 1/16. Additionally, some images are misclassified, and others are unclear, showing only partial facial features.
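For reference, the class imbalance can be inspected directly from the dataset file. The sketch below assumes the fer2013.csv layout distributed with the Kaggle challenge (columns `emotion`, `pixels`, `Usage`); adjust the path and column names if your copy differs.

```python
import csv
from collections import Counter

# FER-2013 label order: 0=Angry, 1=Disgust, 2=Fear, 3=Happy, 4=Sad,
# 5=Surprise, 6=Neutral.
LABELS = ["Angry", "Disgust", "Fear", "Happy", "Sad", "Surprise", "Neutral"]

counts = Counter()
with open("fer2013.csv", newline="") as f:           # assumed file name/path
    for row in csv.DictReader(f):
        counts[LABELS[int(row["emotion"])]] += 1

for label in LABELS:
    # Expect a large spread, e.g. Happy (~7,215) vs Disgust (~436).
    print(f"{label:>8}: {counts[label]}")
```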
3.2.3 Vietnamese dataset
This dataset is sourced from well-known websites like Google Images using keywords such as "Vietnamese student", "Vietnamese smile", and "Vietnam TV show". Additionally, it includes data extracted from over 50 videos on YouTube and other video platforms. The data collection process is divided into two distinct phases:
- Collecting frames from videos: capture one frame out of every five to avoid duplicate data (see the sketch after this list).
- Labeling data: completed by two Vietnamese annotators, who looked at each face in the data and labeled it with one of seven classes (0 = Angry, 1 = Disgust, 2 = Fear, 3 = Happy, 4 = Sad, 5 = Surprise, 6 = Neutral). 5,359 images were labeled in this step.
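A minimal OpenCV sketch of the frame-sampling step referenced above; the video path is a hypothetical placeholder.

```python
import cv2

def sample_frames(video_path: str, step: int = 5):
    """Keep one frame out of every `step` frames to avoid near-duplicate data."""
    cap = cv2.VideoCapture(video_path)
    index, kept = 0, []
    while True:
        ok, frame = cap.read()
        if not ok:                      # end of video (or read error)
            break
        if index % step == 0:
            kept.append(frame)
        index += 1
    cap.release()
    return kept

frames = sample_frames("interview.mp4")   # hypothetical source video
print(f"kept {len(frames)} frames")
```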
3.3 System architecture overview
Fig 3.10: System diagram
This system can be separated into three layers:
The first layer, the basic emotion detection layer, involves capturing a 640x480 image from a camera, which is then processed by a deep learning model for face detection. This model identifies the location of the face within the frame. Subsequently, the detected face is analyzed by a second deep learning model that assesses the current emotional state, returning the identified affective state to the system's second layer.
The decision-making layer of the system is responsible for selecting the appropriate sound and light scripts based on the emotional state received from the first layer. There are four distinct scripts corresponding to four emotional states: activated unpleasant affect, unactivated unpleasant affect, unactivated pleasant affect, and activated pleasant affect.
Fig 3.11: Circumplex model for classes of emotion [1]
The actuator control layer serves as the final component of the system, functioning as the controller for both the speaker and the LED strip. It uses the script from the preceding layer to determine the color and intensity of the LED strip, as well as the music to be played through the speaker.
Fig 3.12: Diagram of the basic emotion detection layer
3.3.1 Basic emotion detection layer
The basic emotion detection layer is the initial component of the system, processing an image captured by the camera after a 30-second interval. It identifies and categorizes the emotional state into one of four classifications: activated unpleasant affect, unactivated unpleasant affect, unactivated pleasant affect, or activated pleasant affect.
This layer can be separated into two parts: face detection and emotion classification.
3.3.1.1 Face detection
Face detection is responsible for detecting the location of faces in an input image and is built on MediaPipe, a framework published and developed by Google for building machine learning pipelines that process time-series data such as video and audio. This cross-platform framework works on desktop/server, Android, iOS, and embedded devices like the Raspberry Pi and Jetson Nano.
The process begins with a 640x480 image captured by the camera. This image is fed into the MediaPipe deep learning model, which processes the input and returns the faces cropped from the original image, along the lines of the sketch below.
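The following sketch shows face detection and cropping with MediaPipe's legacy `solutions` API; the file name is a placeholder and the confidence threshold is an assumed value rather than the system's configured one.

```python
import cv2
import mediapipe as mp

# model_selection=0 targets short-range (within ~2 m) faces.
face_detection = mp.solutions.face_detection.FaceDetection(
    model_selection=0, min_detection_confidence=0.5)

frame = cv2.imread("frame.jpg")                      # a 640x480 BGR capture (placeholder)
results = face_detection.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))

faces = []
if results.detections:
    h, w = frame.shape[:2]
    for det in results.detections:
        box = det.location_data.relative_bounding_box   # normalized [0, 1] coords
        x, y = int(box.xmin * w), int(box.ymin * h)
        bw, bh = int(box.width * w), int(box.height * h)
        faces.append(frame[max(y, 0):y + bh, max(x, 0):x + bw])

print(f"{len(faces)} face(s) cropped")
```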
3.3.1.2 Emotion classification
Emotion classification is the second component of the initial layer, tasked with identifying emotions from the facial input generated by the face detection stage.
The diagram in Figure 3.13 illustrates the emotion classification model, derived from the paper "Facial expression and attributes recognition based on multi-task learning of lightweight neural networks". That research introduces a multi-task neural network designed to address various facial attribute recognition challenges. The model is constructed through the following steps.
The CNN was trained on the VGGFace2 dataset, which consists of 3,067,564 images from 9,131 subjects, with a testing set of 243,722 images. A new fully connected layer with 9,131 outputs and softmax activation was added to the ImageNet-pretrained network. Following this, the base network's weights were frozen while the head was trained for one epoch.
The categorical cross-entropy loss function was optimized using Sharpness-Aware Minimization (SAM) and the Adam optimizer, with an initial learning rate of 0.001. The entire convolutional neural network (CNN) was subsequently trained for 10 epochs with the learning rate reduced to 0.0001. The models achieved validation accuracies of 92.1% for MobileNet-v1, 95.4%/95.6% for EfficientNet-B0/B2, and 96.7% for RexNet.
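A simplified Keras sketch of this two-stage procedure (freeze the base, train the head, then fine-tune everything at a lower learning rate). The dataset pipeline `train_ds` is hypothetical, and Sharpness-Aware Minimization is omitted for brevity; plain Adam is used instead.

```python
import tensorflow as tf

# Stage 1: freeze the ImageNet-pretrained base and train only the new head.
base = tf.keras.applications.MobileNet(include_top=False, pooling="avg",
                                       input_shape=(224, 224, 3))
base.trainable = False
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(9131, activation="softmax"),  # one unit per VGGFace2 subject
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss="categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, epochs=1)        # train_ds: hypothetical tf.data pipeline

# Stage 2: unfreeze everything and fine-tune with a 10x smaller learning rate.
base.trainable = True
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, epochs=10)
```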
The network is then fine-tuned for emotion recognition using the FER-2013 and Vietnamese datasets, distinguishing this thesis from the foundational paper, which utilized the AffectNet dataset. AffectNet comprises approximately 400,000 images across eight emotional classes (Neutral, Happy, Sad, Surprise, Fear, Anger, Disgust, and Contempt), of which seven primary emotion expressions, excluding Contempt, are used.
3.3.2 Decision making layer
This is the second layer of the architecture. Based on the affective state from the first layer (basic emotion detection), this layer is responsible for determining which script of light and sound should be played. These scripts are created based on a series of papers from Spain, including: "Led strips for color- and illumination-based emotion regulation at home", in: Ambient Assisted Living [2]; "Evaluation of color preference for emotion regulation", in: Artificial Computation in Biology and Medicine, Springer, 2015, pp. 479-487 [16]; "A review on the role of color and light in affective computing" [17]; and "Does color say something about emotions?" [18]. The details of the scripts are described in chapter 1.4.2.
3.3.3 Actuator control
Actuator control is the final layer of the architecture; it is responsible for controlling the speaker and the LED strip. The input of this layer is the light and sound script from the second layer (the decision making layer).
This layer utilizes the Neopixel library, designed for controlling RGB LED strips across various platforms like Arduino and Raspberry Pi It employs the HSV (hue, saturation, value) color space for precise color and intensity calculations, converting these parameters to RGB before sending the values to the LED strip driver.
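A minimal control sketch using the Adafruit CircuitPython NeoPixel library and Python's standard `colorsys` module; the GPIO pin, pixel count, and example color are assumptions, not the system's actual configuration.

```python
import colorsys

import board      # Adafruit Blinka board definitions (Raspberry Pi)
import neopixel   # Adafruit CircuitPython NeoPixel driver

NUM_PIXELS = 30   # assumed strip length
pixels = neopixel.NeoPixel(board.D18, NUM_PIXELS, auto_write=False)

def show_hsv(hue: float, saturation: float, value: float) -> None:
    """Compute the color in HSV, convert to RGB, and push it to the strip."""
    r, g, b = colorsys.hsv_to_rgb(hue, saturation, value)   # all inputs in [0, 1]
    color = (int(r * 255), int(g * 255), int(b * 255))
    pixels.fill(color)
    pixels.show()

show_hsv(hue=60 / 360, saturation=1.0, value=0.8)   # an illustrative warm yellow
```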
The system algorithm is separated into three phases:
[Figure: the initializing and configuration phase: initializing the OpenCV objects, the face detector and emotion classification objects, and the RGB values for controlling the LED strip.]
- Initializing phase: this phase executes when the system starts. In this phase, the system's global variables and objects are declared, configured, and assigned default values.
- Detecting emotion phase: this phase starts after the initializing phase finishes. A 640x480 frame is captured from the camera and processed by two deep learning models: face detection and emotion classification. The face detection model extracts and crops the face from the captured image, and the emotion classification model analyzes the cropped face to determine the affective state. The output of this phase serves as input for the next phase.
Fig 3.16: Diagram of the detecting emotion phase
- Control sound and light phase: based on the affective state from the detecting emotion phase, sound and light are executed as described in section 2.9.2.
[Figure: diagram of the control light and sound phase: determine the light and sound script, create a sub-process for playing mp3 files, and stop on a signal from the keyboard.]
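A sketch of the sub-process playback step; `mpg123` is one common command-line player on Raspberry Pi OS, and the file name and stop mechanism here are placeholders for the system's actual script and keyboard handler.

```python
import subprocess

def play_script(mp3_path: str) -> subprocess.Popen:
    """Start playback in a child process so the main loop keeps running."""
    # -q suppresses mpg123's console output; swap in whichever player is installed.
    return subprocess.Popen(["mpg123", "-q", mp3_path])

player = play_script("calm_track.mp3")   # hypothetical file chosen by the script
try:
    input("Press Enter to stop playback...")   # stand-in for the keyboard stop signal
except KeyboardInterrupt:
    pass
finally:
    player.terminate()
    player.wait()
```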
Hardware resources
To implement this system, the following hardware resources are used:
The Raspberry Pi 4 Model B is a highly popular embedded board globally, developed by the Raspberry Pi Foundation in collaboration with Broadcom in the United Kingdom.
SD card support: Micro SD card slot for loading the operating system and data storage.
Input power: 5V DC via USB-C connector (minimum 3A); 5V DC via GPIO header (minimum 3A); Power over Ethernet (PoE) enabled.
EVALUATION
4.1 Model evaluation
4.1.1 Environment
The first model of this system is trained and tested in the Google Colab environment and runs on a Raspberry Pi 4 Model B:
- Broadcom BCM2711, quad-core Cortex-A72 (ARM v8) 64-bit SoC @ 1.5GHz
- 2.4 GHz and 5.0 GHz IEEE 802.11b/g/n/ac wireless LAN, Bluetooth 5.0, BLE
- Standard 40-pin GPIO header
4.1.2 Accuracy and confusion matrix
To evaluate the effectiveness of the first layer (the emotion detection layer), accuracy and the confusion matrix are used.
- Accuracy: this is a simple metric commonly used in classification problems. It is calculated with the following formula:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
where:
- TP: true positives
- TN: true negatives
- FP: false positives
- FN: false negatives
- Confusion matrix: a table commonly used to describe the performance of a classification model by breaking down the counts of correct and incorrect predictions; a computation sketch is shown below.
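For illustration, both metrics can be computed with scikit-learn; the label order matches the seven classes used in this thesis, and the predictions below are toy values, not the model's actual output.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

LABELS = ["Angry", "Disgust", "Fear", "Happy", "Neutral", "Sad", "Surprise"]

# Toy ground-truth labels and predictions standing in for the test split.
y_true = np.array([0, 3, 3, 4, 6, 2, 5, 3])
y_pred = np.array([0, 3, 4, 4, 6, 2, 5, 2])

print("accuracy:", accuracy_score(y_true, y_pred))          # 0.75 on this toy data
print(confusion_matrix(y_true, y_pred, labels=range(len(LABELS))))
```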
- Accuracy and confusion matrix of FER2013 dataset.
Table 4.1: Confusion matrix of the model trained on FER2013
Fig 4.2: Per-class accuracy of the model trained on FER2013 (Angry, Disgust, Fear, Happy, Neutral, Sad, Surprise)
The average accuracy of the model on this dataset is 50.458%.
- Accuracy and confusion matrix of the model trained on the FER2013 and Vietnamese datasets.
Fig 4.3: The Vietnamese and FER2013 datasets (classes: Angry, Disgust, Fear, Happy, Neutral, Sad, Surprise)
Table 4.2: Confusion matrix of the model trained on the Vietnamese and FER2013 datasets
[Figure: per-class accuracy of the model trained on the Vietnamese and FER2013 datasets (Angry, Disgust, Fear, Happy, Neutral, Sad, Surprise).]
4.2 Reality
The system is implemented in reality, including the following steps:
- After power is supplied to the Raspberry Pi, speakers, lights, and camera, the system enters an active state.
- The camera will recognize facial expressions from the user and output the emotions that need to be adjusted.
- The speaker plays music and the lights adjust the color to match the user's emotions recognized by the camera.
4.3 Product test scenario
To test the effectiveness of the product, we evaluated it in an ideal environment, measuring its accuracy in a 20 m² room:
- The system will recognize facial expressions from the user and classify emotions.
- Based on the recognized emotion, the music will be played for 30 seconds and the lights will be adjusted in sync with the music playback time.
- The emotional target is the normal (calm) emotional state.
SUMMARY
5.1 Validation
5.1.1 Main content
We researched and successfully applied the best open-source libraries, such as OpenCV and TensorFlow Lite. Together with the knowledge learned in the university environment, this allowed us to reach the original goals and build an emotion regulation system based on light and sound; the system can be operated, maintained, and developed further in the future.
5.1.2 The product and application
The light-based emotion regulation system has been built under experimental conditions with the following features:
- Perform emotion regulation based on light and sound from the input face.
- Detect facial emotion on the Raspberry Pi 4 board at 3-4 fps with 64.7% accuracy on 6 basic emotions.
5.2.1 Advantages
- The lecturers enthusiastically supported and oriented the research group.
- Received assistance and experience from predecessors.
- Applied the knowledge learned to the development of the product.
5.2.2 Disadvantages
Because the nature of emotions depends on many factors, recognizing and analyzing the facial features of emotions is not easy. Here, we can point out some factors that make the recognition problem difficult:
- Human emotions are not always shown on the outside.
- Each person will have a different facial expression for the same emotion.
- Factors such as environmental noise significantly affect the recognition efficiency.
- Because the available data is limited, the accuracy of the algorithm is constrained.
5.3 Future development
- Complete existing functions and increase the accuracy of the identification system as well as the user experience.
- Incorporate other emotion recognition mechanisms besides face recognition, such as heart rate.
- Perform testing in real life.
5.4 Conclusion
Throughout the project implementation, the team effectively utilized the knowledge acquired and the guidance provided by instructors, leading to the successful development of their thesis This experience not only enhanced their skills but also expanded their understanding of the subject matter While the research topic was completed, the team acknowledges certain limitations in their results Moving forward, they are committed to further development and skill enhancement for future projects.
REFERENCES
Nina Milosavljevic, "How Does Light Regulate Mood and Behavioral State?", published 12 July 2019.
V. Ortiz-Garcia-Cervigon, M.V. Sokolova, R. García-Muñoz, A. Fernández-Caballero, "Led strips for color- and illumination-based emotion regulation at home", in: Ambient Assisted Living. Development and Testing of ICT-based Solutions in Real Life Situations, Springer, 2015, pp. 277-287.
Levi, G., Hassner, T., "Age and gender classification using convolutional neural networks", in: Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, IEEE, 2015, pp. 34-42.
Bargal, S.A., Barsoum, E., Ferrer, C.C., Zhang, C., "Emotion recognition in the wild from videos using images", in: Proceedings of the 18th ACM International Conference on Multimodal Interaction (ICMI), 2016, pp. 433-436.
Liu, C., Jiang, W., Wang, M., Tang, T., "Group level audio-video emotion recognition using hybrid networks", in: Proceedings of the ACM International Conference on Multimodal Interaction (ICMI), 2020, pp. 807-812.
Andrey V. Savchenko, "Facial expression and attributes recognition based on multi-task learning of lightweight neural networks".
Patrick Lucey et al., "The Extended Cohn-Kanade Dataset (CK+): A complete dataset for action unit and emotion-specified expression", in: 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2010.
Michael J. Lyons et al., "The Japanese female facial expression (JAFFE) database", in: Proceedings of the Third International Conference on Automatic Face and Gesture Recognition, 1998, pp. 14-16.
Maja Pantic et al., "Web-based database for facial expression analysis", in: 2005 IEEE International Conference on Multimedia and Expo, 2005, 5 pp.
Ian J. Goodfellow et al., "Challenges in representation learning: A report on three machine learning contests", in: International Conference on Neural Information Processing, Springer, 2013, pp. 117-124.
Abhinav Dhall et al., "From individual to group-level emotion recognition: EmotiW 5.0", in: Proceedings of the 19th ACM International Conference on Multimodal Interaction, ACM, 2017, pp. 524-528.
Abhinav Dhall et al., "Video and image based emotion recognition challenges in the wild: EmotiW 2015", in: ICMI, 2015.