HIGH-LEVEL FUZZY LINGUISTIC FEATURES OF FACIAL COMPONENTS IN HUMAN EMOTION RECOGNITION

Emotion is an important element in an interaction since it conveys human perception of and response to an event. Unlike verbal words, which can be manipulated, emotion is brief, spontaneous, and provides more honest information. There are several classes of basic primary human emotions that differ from one another: happy, sad, fearful, surprised, disgusted, and angry. Psychologists have developed a set of rules to recognize emotions based on facial expressions. This research aims to develop an artificial intelligence model based on psychological knowledge to recognize emotions by analyzing facial expressions. The proposed model defines high-level fuzzy linguistic features of facial components, which distinguishes it from existing methods that commonly use low-level image features (e.g. color, intensity, histogram, texture). High-level linguistic features (e.g. opened eyes, wrinkled nose) are better at representing human minds than low-level features, which are only understood by machines. The model functions by first detecting facial points to locate important facial components; then extracting geometric facial component features; converting them into high-level fuzzy linguistic features through fuzzy inference; and finally classifying the image into basic emotion classes using psychological emotion rules.


INTRODUCTION
Automatic emotion recognition is an active research focus in the affective computing field. The objective is to design an intelligent agent that recognizes human emotions. This topic lies at the intersection of computer science and psychology. Many studies have addressed automatic emotion recognition using different approaches based on artificial intelligence and machine learning (Kumari, Rajesh, & Pooja, 2015).
In communication, emotion is a means to convey messages through nonverbal signals such as facial expression, prosody, gesture, and bodily expression (Pantic et al., 2011). Among these signals, facial expression is the central feature of emotion (Ekman, 1992). The problem in emotion recognition is the variability in facial expressions; this is a challenging task since there are various subjective ways for humans to express emotions. The psychologist Paul Ekman defined basic emotions as separate discrete emotions that differ from one another (Ekman, 1992). Basic emotions are universal and primary. The six categories of basic emotions are happy, sad, angry, disgusted, fearful, and surprised.
Facial expression is the movement of facial muscles in response to a given stimulus (Ekman, Friesen, & Hager, 2002). According to Ekman (2003), there are more than 10,000 combinations of facial muscle movements. For example, when surprised, people tend to raise their outer eyebrows with their mouths wide open, and when angry, people tend to tighten their lips with their inner eyebrows drawn close together. The task of emotion recognition is selecting the most prominent emotion from a combination of facial components. Existing methods such as Support Vector Machines, Neural Networks, and Deep Learning work by training on large numbers of images and learning extracted patterns to classify images into corresponding classes (Chen et al., 2012; Tang, Guo, Shen, & Du, 2018). However, this is not representative of what psychologists do in recognizing emotions. Psychologists have defined knowledge about human emotions, derived from years of studies and observations of various facial expressions across cultures (Ekman & Friesen, 1975). Ekman and Friesen (1975) developed a set of rules to interpret emotions from facial expressions. This knowledge has been utilized by people who work on emotion recognition such as psychotherapists, interrogators, nurses, lawyers, managers, salesmen, and actors.
In this paper, we propose a model of emotion recognition based on psychological knowledge. Our proposed model differs from existing models, which commonly use low-level feature representations (e.g. color, intensity, histogram, or texture values). Instead, we define high-level fuzzy linguistic features of facial components (e.g. opened mouth, closed eyes, wrinkled nose) as image features. This novel feature is understood by the human mind, since it uses human natural language representation. Moreover, the ability of fuzzy systems to deal with vague problems and ambiguous data makes them suitable for emotion recognition tasks, whose inputs are facial expressions with varying degrees and intensities.
The proposed model works as follows. First, we elicit psychological knowledge of emotions from facial component analysis and denote it as a set of emotion rules. For each input image, the model detects facial points and locates the coordinates of facial components (eyebrows, eyes, nose, and lips). Next, the model extracts geometric face feature parameters and applies these parameters in the fuzzy facial component inference system, resulting in high-level linguistic features of facial components. In the last step, we feed the respective fuzzy linguistic parameters into a fuzzy emotion recognition system, which classifies the facial image into basic emotion classes based on psychological rules.
Our contribution in the proposed model is adopting the natural way to recognize emotions automatically using psychological knowledge that focuses on facial component rules. Moreover, we enhance the feature extraction process by using high-level linguistic features of facial components, which are simple to compute yet powerful as facial expression descriptors.
Our model is useful in various fields, such as psychology, health, education, robotics, and entertainment. The next section explains related studies on emotion recognition and analysis. We discuss our proposed model in the methodology section and elaborate our findings from experiments in the results and discussion section. In the last section we provide our conclusion and suggest future work.

RELATED WORK
Research on facial component analysis is based on the assumption that decomposing the human face into facial components enables the exploration of local facial features, instead of processing the whole face directly or using global features (Li, Lian, & Lu, 2012). There are three types of feature processing techniques, based on template models, mathematical models, and deep learning methods. The first technique uses template matching to increase facial points detection accuracy via the Active Appearance Model (AAM) framework (Pratiwi, Widyanto, Basaruddin, & Liliana, 2017; Wang, Li, & Wang, 2014). The second technique calculates feature values using a mathematical model (Loconsole, Miranda, Augusto, Frisoli, & Orvalho, 2014). A recent trend is using deep learning methods for feature processing (Das & Chakrabarty, 2016; Pitaloka, Wulandari, Basaruddin, & Liliana, 2017), but this requires a large amount of data for training; and since deep learning is a black-box method, we cannot observe the process inside it.
A large number of approaches to facial expression analysis are based on appearance and geometric features. Appearance features relate to facial texture variations (e.g. furrows, wrinkles). These features can be obtained using filtering techniques such as the Gabor filter, which performs with high accuracy under different illuminations, poses, and expressions (Sudhakar & Nithyanandam, 2017); Principal Component Analysis (PCA), to reduce feature dimensionality (Chakrabarti & Dutta, 2013); and Local Binary Patterns (LBP), to extract various types of textures from face images (Lekdioui, Messoussi, Ruichek, Chaabi, & Touahni, 2017). Another appearance-based approach extracts skin color for facial component classification, which works fast at the pixel level (Mayer, Wimmer, & Radig, 2010). Generally, obtaining appearance feature descriptors requires more computational cost, in terms of memory and time.
Geometric features describe the geometric shape of facial components; they also represent facial component movements. Geometric features are known for their robustness in handling expression variability (Sadeghi, Raie, & Mohammadi, 2013). Geometric features of facial components are extracted from other meaningful descriptors using a mathematical model (Loconsole et al., 2014) or through deep learning methods (Tang, Guo, Shen, & Du, 2018). Loconsole et al. (2014) used geometric features extraction based on pixel coordinates and transformed them into a linear feature representation in a simple and fast way. Their work has inspired this proposed model; hence we explored more aspects of geometric features to enhance facial components analysis performance. Geometric-based features extraction has been employed by other researchers: Nicolai and Choi (2015) used threshold values for facial components, and Chaturvedi and Tripathi (2014) used the Euclidean distance between facial components as a geometric features descriptor. Other approaches combined geometric and appearance features and gained advantages from both sides, but suffered in memory and time computation as a consequence (Chen et al., 2012; Sadeghi et al., 2013).
In geometric-based features extraction, facial landmarks are a crucial starting point; therefore, facial points detection must be performed with high accuracy. The Active Appearance Model (AAM) is a robust framework for facial points detection which works on face images with varying positions and scales (Cootes, Edwards, & Taylor, 2001). AAM is a template-based model which combines shape and texture features to locate facial fiducial points (Wang et al., 2014). We applied AAM as an intermediate system to detect facial points (Liliana, Widyanto, & Basaruddin, 2017). We also utilized the AAM framework as an initial step for our fuzzy facial components analysis.
In contrast to existing fuzzy approaches for emotion recognition, our proposed model uses high-level fuzzy linguistic features of facial components as input for determining the emotion class as output. Hence, we do not force a mapping of six basic emotions onto a single-dimensional output, unlike other studies (Chaturvedi & Tripathi, 2014; Halder, Bhattacharjee, Nasipuri, Basu, & Kundu, 2010; Sujono & Gunawan, 2015); instead, we use each emotion class as a separate output. In addition, we employed simple geometric features extraction methods which work on a pixel basis to enhance the features extraction process. These geometric features serve as input for the fuzzy facial component analysis. We defined a separate fuzzy rule-based model for each facial component, which takes geometric feature values as input and performs high-level features extraction to produce high-level linguistic features of facial components as output.

FUZZY FACIAL COMPONENTS ANALYSIS FOR EMOTION RECOGNITION
Generally, there are three main steps in emotion recognition (Kumari et al., 2015). The first step is detecting the face among other objects and locating important facial fiducial points. The second step is extracting facial features that serve as facial expression descriptors using specific methods. The last step is performing classification by using facial features as input to recognize emotions. For the first step, Ekman (2003) created a face atlas of facial components which consists of three parts of the face: the upper part (eyebrows and forehead); the center part (eyes and nose); and the lower part (cheek, mouth, and chin). These components are involved in the change of facial expressions. For the second step, we utilized geometric facial features to produce high-level fuzzy linguistic features of facial components. In the last step, we performed fuzzy rule-based emotion classification.
We have developed a model of emotion recognition which consists of several processing steps. Initially, the input is a static face image of a single subject in frontal view without occlusion. Each image is processed in the following manner. First, the face region is detected, resulting in facial points. Next, the geometric facial components features extraction step transforms facial points into facial component features using a mathematical model, resulting in geometric facial feature parameters. After that, the fuzzy facial components inference system is performed to generate a set of linguistic conditions of facial components, with parameter values serving as input for the emotion recognition subsystem. Lastly, the fuzzy emotion inference system determines the emotion result. The proposed emotion recognition model is illustrated in Figure 1.
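The processing steps above can be sketched as a staged pipeline. This is an illustrative skeleton only: the function names and data shapes are ours, and each stage is a placeholder for the corresponding subsystem described in this paper (AAM-based detection, geometric extraction, FFCIS, and fuzzy emotion inference).

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (x, y) pixel coordinates of a facial point

def detect_facial_points(image) -> List[Point]:
    """Stage 1: locate facial landmarks (e.g. via an AAM fitter)."""
    raise NotImplementedError  # placeholder for an AAM-based detector

def extract_geometric_features(points: List[Point]) -> Dict[str, float]:
    """Stage 2: transform landmark coordinates into geometric parameters
    (eccentricities and distance ratios) per facial component."""
    raise NotImplementedError

def ffcis(geometric: Dict[str, float]) -> Dict[str, float]:
    """Stage 3: fuzzy facial components inference -> linguistic features yFCi."""
    raise NotImplementedError

def feis(y_fc: Dict[str, float]) -> str:
    """Stage 4: fuzzy emotion inference -> one of the six basic emotion labels."""
    raise NotImplementedError

def recognize_emotion(image) -> str:
    """Run the full pipeline on a single frontal, unoccluded face image."""
    points = detect_facial_points(image)
    geometric = extract_geometric_features(points)
    y_fc = ffcis(geometric)
    return feis(y_fc)
```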

Figure 1 shows the main subsystems of the model: geometric facial components features extraction; the Fuzzy Facial Components Inference System (FFCIS); and the Fuzzy Emotion Inference System (FEIS). The rectangles inside the dashed lines represent the features extraction phase. The process converts low-level image features (geometric features) into high-level fuzzy linguistic features using FFCIS. We divided our proposed model into five subsections. Subsection one discusses facial components analysis. Subsection two covers facial point detection techniques. Subsection three discusses the geometric facial components features extraction process. Subsection four covers the fuzzy facial components inference system. The last subsection discusses the emotion recognition system.

Facial Components Analysis
Initially, the face area is detected by the Active Appearance Model (AAM), which works rapidly in locating the face region and marking facial points (FPs) (Cootes et al., 2001). FPs, which indicate the locations of facial landmarks, are processed using our proposed geometric features extraction methods. Only FPs located on facial components are used, which makes our method fast. As a result, a set of geometric feature parameters is extracted from 10 facial components. Table 1 describes the geometric facial features for the corresponding facial components (FC). Geometric facial components feature parameters are fed into the fuzzy facial components inference system (FFCIS). FFCIS consists of 10 fuzzy rule-based engines, one for each facial component. The result of FFCIS is a set of fuzzy linguistic features of facial components. These fuzzy linguistic features become parameters for the fuzzy emotion recognition, resulting in emotion classification. Each facial component has a different number of feature descriptors, depending on the shape of the facial component. We utilized 10 facial components and labeled them as FCi, as in Table 1, where i refers to the facial component number. We defined geometric features for each component based on the geometric shape and characteristics of facial components. A total of 20 geometric features (GF) of facial components were designed and denoted as FCiGFj, where i refers to the facial component number and j refers to the order of the facial component geometric feature. There are two types of GF: distance ratio and ellipse eccentricity. We explain these geometric features extraction methods in the next subsection. The last column of Table 1 is the output of the fuzzy facial components analysis, denoted as yFCi, where i represents the corresponding FC number.

Facial Points Detection
AAM is utilized to perform facial points detection. AAM is effective and flexible for object tracking, especially for facial landmark detection (Wang et al., 2014). We applied the fast AAM framework by Tzimiropoulos and Pantic (2013) for facial points detection, as shown in Figure 2. The AAM generated 68 facial points, each carrying the coordinates of its position. The next step after FPs detection was geometric features extraction. We used FPs pixel coordinates as input and processed them through a simple calculation method which utilized the x-axis and y-axis values of FPs and transformed them into fuzzy facial component values.
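A 68-point landmark set is commonly partitioned by facial component before per-component processing. The index ranges below follow the widely used iBUG 68-point markup; the paper does not state its exact indexing, so treat this grouping as an assumption for illustration.

```python
# Index ranges of the common iBUG 68-point facial landmark markup
# (assumed here; the paper's exact indexing may differ).
LANDMARK_GROUPS = {
    "jaw":           range(0, 17),   # contour points, typically excluded
    "right_eyebrow": range(17, 22),
    "left_eyebrow":  range(22, 27),
    "nose":          range(27, 36),
    "right_eye":     range(36, 42),
    "left_eye":      range(42, 48),
    "outer_lip":     range(48, 60),
    "inner_lip":     range(60, 68),
}

def split_landmarks(points):
    """Group a list of 68 (x, y) facial points by facial component."""
    if len(points) != 68:
        raise ValueError("expected 68 facial points")
    return {name: [points[i] for i in idxs]
            for name, idxs in LANDMARK_GROUPS.items()}
```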

Geometric facial components features extraction
We registered the coordinates of all FPs, but excluded FPs located on the jaw area from the features extraction process because these FPs do not produce rapid signals for geometric features. We processed each facial component separately using the respective FPs, as explained in the previous subsection. Unlike other studies, we did not use any reference template for features extraction. Generally, we propose two types of geometric features: ellipse eccentricity and distance ratio. The first comes from the observation that several facial components have an elliptic or half-elliptic shape (e.g. eyes, eyebrows, lips, and mouth). We utilized the eccentricity parameter of the ellipse as the first type of geometric feature. Eccentricity represents the level of elliptical curvature; its value ranges from zero to one, where zero indicates a circle. The larger the value, the more elliptical the object. Relating this to facial components, wide-opened eyes or a wide-opened mouth imply a smaller eccentricity value, and vice versa. The second type of geometric feature is the distance ratio. We used ratios since they are independent of the measurement unit and indicate distance relative to the height or width of facial components.
FC1 and FC2 have the same geometric features, as listed in Table 1; thus, the feature formulation is the same for both FCs.

Figure 3 shows FC1 and FC2 and their FPs. The first feature of FC1 and FC2 is eyebrow eccentricity. Based on this study, we assume that FC1 and FC2 are half-ellipse-shaped objects (Loconsole et al., 2014); thus, we used the ellipse eccentricity property to extract FC1 and FC2 geometric features. FC1 has three parameters a, b, and c, and so has FC2, where b is the height of FC1, obtained by subtracting the FC1 y-axis minimum value from the FC1 y-axis maximum value, and a is the half width of FC1, obtained by subtracting the x-axis value of the FC1 innermost FP from the x-axis value of the FC1 outermost FP. We calculated the eccentricity parameter using Equation 1:

FC1GF1 = c / a, where c = √(a² − b²)     (1)

where FC1GF1 is the first geometric feature of FC1, which refers to the eccentricity feature. We also applied Equation 1 to the other FCs which use eccentricity features, as listed in Table 1. The same calculation was applied to FC2GF1 for the right eyebrow.

The second feature, FC1GF2 and FC2GF2, is the height ratio, defined as the ratio of eyebrow height to face height (denoted by Fheight) and obtained using Equation 2:

FC1GF2 = b / Fheight     (2)

FC3 geometric features are the distance ratio and height ratio. The distance ratio of the inner eyebrow (FC3) is calculated using Equation 3:

FC3GF1 = w / Fwidth     (3)

where FC3GF1 is the ratio of the distance between FC3 points to the face width (denoted by Fwidth) and w is the distance between the left and right inner eyebrows. Meanwhile, the height ratio of FC3 is calculated using Equation 4:

FC3GF2 = h / Fheight     (4)

where FC3GF2 is the height ratio of FC3 and h is the distance between the inner FP and the center of the eye.

Figure 4(a) shows the left eye (FC4) and right eye (FC5) and their FPs. Eyes have an elliptic shape, so we used ellipse eccentricity as a geometric feature for the eyes. The two geometric features for FC4 and FC5 are eye eccentricity and eye opening ratio. Equation 1 is applied to obtain FC4GF1 and FC5GF1, where a is the half width of FC4 or FC5 and b is the half height of FC4 or FC5. For the second feature, we applied Equation 5 to obtain the opening ratios FC4GF2 and FC5GF2.

Figure 4(b) shows the nose facial component (FC6) with two geometric features: wrinkle ratio and height ratio. The wrinkle ratio FC6GF1 indicates how shrunk the nose is and is measured using Equation 6:

FC6GF1 = x / y     (6)

where x is the FC6 width and y is the FC6 height. The larger the FC6GF1 value, the more wrinkled the nose, and vice versa. The height ratio FC6GF2 is obtained using Equation 7:

FC6GF2 = h / Fheight     (7)

where h is the distance between the top FP of the eyebrow and the lowest FP of the nose.

Figure 5 shows FC7, FC8, FC9, and FC10. These four FCs use eccentricity as the first geometric feature, labeled FCiGF1, where i = 7,…,10 refers to the FC number. We applied Equation 1 to obtain these feature values. The second geometric feature for FC7 and FC8 is the thickness ratio, while for FC9 and FC10 it is the opening ratio; these values are obtained using Equation 5.

We obtained geometric feature values for all facial components after the features extraction process. The next step is fuzzifying these facial component values and entering them into the Fuzzy Facial Components Inference System (FFCIS).
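As a sketch, the eccentricity and ratio features translate into a few lines of code. The helper names are ours; the formulas assume the standard ellipse eccentricity e = c/a with c = √(a² − b²) and plain distance ratios, consistent with the variable definitions given for Equations 1 through 4.

```python
import math

def eccentricity(a: float, b: float) -> float:
    """Ellipse eccentricity (Equation 1): e = c/a with c = sqrt(a^2 - b^2).
    a is the half width (semi-major axis), b the half height; a circle
    (a == b) gives 0, and flatter shapes approach 1 (e.g. narrowed eyes)."""
    if a <= 0 or b < 0 or b > a:
        raise ValueError("expected 0 <= b <= a with a > 0")
    return math.sqrt(a * a - b * b) / a

def height_ratio(h: float, face_height: float) -> float:
    """Equations 2 and 4: a vertical distance relative to face height."""
    return h / face_height

def distance_ratio(w: float, face_width: float) -> float:
    """Equation 3: a horizontal distance relative to face width."""
    return w / face_width
```

Because all three features are ratios of pixel distances, they are independent of image scale, which matches the motivation given above for using ratios rather than raw distances.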

Fuzzy Facial Components Inference System (FFCIS)
We have produced the relevant geometric feature parameters for fuzzy facial component analysis in the previous step. Using the resultant parameters as input, we designed an FFCIS for each facial component and utilized the Mamdani fuzzy inference method (Mamdani, 1974). Thus, from this phase we obtained 10 fuzzy facial components parameters yFCi, where i = 1,…,10 is the facial component number. The design of an FFCIS is shown in Figure 6. There are two inputs for each FFCIS: FCiGF1 and FCiGF2, which refer to the first and second geometric features of the i-th FC, respectively. A set of rules maps input into output using the Mamdani inference method. The output is the facial components parameter yFCi. In this study, psychological knowledge is involved in determining the fuzzy parameters. Three factors need to be considered in the fuzzy parameter definition: the type of membership function, the number of membership functions in each fuzzy variable, and the interval value for each linguistic variable. The first factor is determined based on the characteristics of the problem related to the fuzzy variables. The triangular membership function is widely used because it is simple and yields good results (Chaturvedi & Tripathi, 2014). The second factor, the number of membership functions for each fuzzy variable, is determined based on psychological knowledge regarding the various states of facial components linguistic conditions. The last factor, the interval value of the linguistic variables, is obtained through experiments on various facial expression images.
The flowchart in Figure 6 describes the FFCIS process. The four main steps are input fuzzification, rule evaluation, rule composition, and defuzzification. In the first step, each input linguistic variable is defined as low, medium, and high. Geometric feature values are fuzzified into fuzzy values using triangular membership functions. A fuzzy triangular membership function maps a crisp value on the x-axis into a fuzzy value on the y-axis. It uses the three parameters α1, α2, α3 of a triangular curve for fuzzification of any input x, as in Equation 8:

f(x) = 0 for x ≤ α1; (x − α1)/(α2 − α1) for α1 ≤ x ≤ α2; (α3 − x)/(α3 − α2) for α2 ≤ x ≤ α3; 0 for x ≥ α3,   (8)

where f(x) is the degree of membership or fuzzy value, and α1, α2, α3 are the left, middle, and right parameters of the fuzzy triangular membership function. For the output, the membership function is a triangular curve, while the linguistic variable depends on the type of facial component. Table 2 lists the output linguistic variables. As we can see in Table 2, our fuzzy facial components model has 10 separate FFCIS and 10 output parameters. Each output has different fuzzy linguistic variables. For example, the eyes have three linguistic variables: narrow, normal, and wide; while the mouth has three different linguistic variables: tight, normal, and wide. The output linguistic variables depend on facial component traits and psychological knowledge. The interval value is from zero to one.
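As a sketch of Equation 8, the triangular membership function can be implemented directly. The interval parameters below are illustrative placeholders, not the values tuned in our experiments:

```python
def triangular(x, a1, a2, a3):
    """Equation 8: degree of membership of crisp input x for a triangular
    curve with left (a1), middle (a2), and right (a3) parameters."""
    if x <= a1 or x >= a3:
        return 0.0
    if x <= a2:
        return (x - a1) / (a2 - a1)   # rising edge
    return (a3 - x) / (a3 - a2)       # falling edge

# fuzzify one geometric feature value into low/medium/high degrees
# (interval parameters here are placeholders, not the tuned values)
sets = {"low": (0.0, 0.2, 0.5), "medium": (0.2, 0.5, 0.8), "high": (0.5, 0.8, 1.0)}
degrees = {name: triangular(0.35, *p) for name, p in sets.items()}
assert abs(degrees["low"] - 0.5) < 1e-9
assert abs(degrees["medium"] - 0.5) < 1e-9
assert degrees["high"] == 0.0
```

Note that a crisp input can belong to two adjacent linguistic sets at once, which is what lets several rules fire simultaneously in the next step.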
The next step is rule evaluation. Each FFCIS owns separate rules. A set of rules is stored in each FFCIS' inference engine. We enumerated all possible input linguistic conditions in the antecedent part and related them to the output in the consequent part based on psychological knowledge. As an example, the left eyebrow FFCIS has nine rules. One of its rules is: IF eccentricity is low AND distance ratio is low THEN yFC 1 is lower.
Rule composition uses the fuzzified input to evaluate the FFCIS rules. This step yields an area of rule aggregation which matches the input conditions. In the last step, defuzzification is applied to the output area using the centroid of gravity method to obtain the resulting yFC i value. We implemented our FFCIS model using MATLAB 2014 software. The results were a set of fuzzy facial components (FFC) parameters, yFC i. Next, these output FFC parameters became the input to the emotion recognition inference system. Figure 7 shows the example of FFCIS membership functions for FC 1 with (a) eccentricity input; (b) distance ratio input; and (c) yFC 1 output.
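The four FFCIS steps can be sketched end to end. This is a minimal Mamdani-style evaluation under assumed set and rule definitions (AND as minimum, aggregation as maximum, centroid over a discretized output axis); it is not our MATLAB implementation, and the two rules shown are placeholders rather than the paper's rule base:

```python
import numpy as np

def triangular(x, a1, a2, a3):
    # Equation 8 membership degree
    if x <= a1 or x >= a3:
        return 0.0
    return (x - a1) / (a2 - a1) if x <= a2 else (a3 - x) / (a3 - a2)

def ffcis(gf1, gf2, in_sets, out_sets, rules, resolution=201):
    """One facial-component inference: fuzzify both geometric features,
    fire each rule (AND = min), clip its consequent set, aggregate by max,
    and defuzzify by centroid of gravity to get yFC_i in [0, 1]."""
    xs = np.linspace(0.0, 1.0, resolution)
    agg = np.zeros_like(xs)
    for lab1, lab2, out_lab in rules:
        strength = min(triangular(gf1, *in_sets[lab1]),
                       triangular(gf2, *in_sets[lab2]))
        clipped = np.minimum([triangular(x, *out_sets[out_lab]) for x in xs],
                             strength)
        agg = np.maximum(agg, clipped)
    return float((xs * agg).sum() / agg.sum()) if agg.sum() > 0 else 0.5

# illustrative sets and two rules (placeholders, not the paper's rule base)
in_sets = {"low": (0.0, 0.25, 0.5), "high": (0.5, 0.75, 1.0)}
out_sets = {"narrow": (0.0, 0.25, 0.5), "wide": (0.5, 0.75, 1.0)}
rules = [("low", "low", "narrow"), ("high", "high", "wide")]
y = ffcis(0.25, 0.25, in_sets, out_sets, rules)
assert abs(y - 0.25) < 0.01   # only the "narrow" consequent fires fully
```

With both inputs squarely in the "low" set, only the first rule fires and the centroid lands at the peak of the "narrow" output set.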

Emotion recognition
Emotion recognition is the last process in our proposed model. The objective is to classify a facial expression into one of six basic emotion classes (happy, sad, angry, disgusted, fearful, and surprised). We designed an inference engine for each basic emotion: six Fuzzy Inference Systems (FIS) using the Sugeno method for the six basic emotion classes. The output from each FIS is a value between 0 (low intensity) and 1 (high intensity). This value represents the degree of emotion displayed on a face image. We took the highest emotion FIS output value as the result of emotion classification.
The fuzzy emotion process is the same as the FFCIS process. A vector of facial components parameters yFC i becomes the input of the fuzzy emotion system. Each fuzzy emotion has its own rules which are stored in an inference engine. The IF-THEN rules are constructed with the facial components linguistic condition as the antecedent and the emotion as the consequent. An example of an emotion rule for surprised is: IF yFC 1 is raise AND yFC 2 is raise AND yFC 4 is wide AND yFC 5 is wide AND yFC 7 is thick AND yFC 8 is normal AND yFC 10 THEN Surprised Emotion is 1 (high).
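The AND in such a rule is typically evaluated as the minimum of the antecedent membership degrees. A minimal sketch, with hypothetical membership degrees for the yFC i conditions (the values are invented for illustration, not taken from the paper):

```python
def rule_strength(memberships):
    """Firing strength of a conjunctive IF-THEN rule: AND is taken as the
    minimum over the membership degrees of its antecedent conditions."""
    return min(memberships.values())

# hypothetical degrees to which each yFC_i satisfies the surprised-rule
# conditions (e.g. "yFC1 is raise"); not values from the paper
antecedent = {"yFC1 is raise": 0.9, "yFC2 is raise": 0.8,
              "yFC4 is wide": 0.7, "yFC5 is wide": 0.95,
              "yFC7 is thick": 0.6, "yFC8 is normal": 0.85}
assert rule_strength(antecedent) == 0.6
```

The weakest satisfied condition limits the rule, so a single facial component that contradicts an emotion keeps that emotion's rule from firing strongly.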
The rule evaluation is the process of scanning relevant rules for some input linguistic conditions.
Rules which are triggered by the input are composed to build the output area. The last step is the defuzzification process using the weighted average sum method, which determines the degree of emotion value as in Equation 9:

Z = Σ i (α i × z i ) / Σ i α i,   (9)

where Z is the defuzzification result or the intensity of fuzzy emotion, α is the membership degree of the input, and z is a real input value on the x-axis. The highest Z value determines the emotion classification result.
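The weighted average sum in Equation 9 matches standard Sugeno-style defuzzification; a sketch under that reading, followed by the final selection of the highest intensity (all numeric values are hypothetical):

```python
def weighted_average(alphas, zs):
    """Equation 9: Z = sum(alpha_i * z_i) / sum(alpha_i), where alpha_i is
    the membership degree of rule i and z_i its consequent value."""
    den = sum(alphas)
    return sum(a * z for a, z in zip(alphas, zs)) / den if den else 0.0

# hypothetical rule firings for one emotion FIS
assert abs(weighted_average([0.2, 0.8], [0.0, 1.0]) - 0.8) < 1e-9

# the highest Z across the six emotion FIS outputs gives the class
intensities = {"happy": 0.12, "sad": 0.05, "angry": 0.08,
               "disgusted": 0.02, "fearful": 0.10, "surprised": 0.91}
assert max(intensities, key=intensities.get) == "surprised"
```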

EXPERIMENTAL RESULTS
We carried out an experiment to investigate the performance of our proposed model. We monitored two important aspects: facial component linguistic analysis and emotion recognition performance. The first aspect was gained by testing the accuracy of our proposed fuzzy facial components inference system using the CK+ facial expression dataset (Lucey et al., 2010). The second aspect was obtained by testing the proposed emotion recognition system using four different datasets: CK+, JAFFE, DISFA, and our own facial expressions dataset. We also compared our model with other recognition methods: Fuzzy C-Means (FCM) clustering, a Fuzzy Inference System (FIS), and a Support Vector Machine (SVM).
In the first experiment, we tested the performance of our proposed fuzzy facial components inference system by observing the output: a set of linguistic conditions related to an input image which describes the states of the facial components. The objective was to measure the correct facial components linguistic conditions. An example of the correct FFCIS output, for the input image in Figure 1, is: left eyebrow is normal; right eyebrow is normal; inner eyebrow is normal; left eye is normal; right eye is normal; nose is normal; upper lip is thin; lower lip is thin; inner mouth is tight; outer mouth is normal.
The first experimental result is summarized in Table 3. The first column is emotion classes, while the rest of the columns are the normalized values of correct facial components identification. The last row shows the percentage of correct identification of facial components.

We analyzed the output by comparing the facial component linguistic features output with the input image states. We ran the experiment on the CK+ dataset consisting of 238 annotated images of six basic emotions: 45 angry, 50 disgusted, 25 fearful, 50 happy, 23 sad, and 45 surprised. From Table 3 we can see that the highest identification result is FC3 or inner eyebrow (99.63%) and the lowest identification result is FC6 or nose (95.96%). The average accuracy rate and standard deviation of facial components analysis are 98.15% and 1.07, respectively. Table 4
shows examples of the facial components linguistic features identification results. Six examples, one for each basic emotion input image, describe the high-level linguistic features of facial components in natural human language as generated by the system. The blue dots are the AAM facial points detection results. We validated these linguistic conditions based on psychological expert judgment.
The second experiment observed the performance of our proposed model using several facial expression datasets. The objective was to measure the recognition accuracy of our proposed model on different datasets. Figure 8 shows images and labels of six basic emotion recognition results using different facial expression datasets.
Our proposed model selected the highest intensity value amongst the six emotion intensity values as the result of the fuzzy emotion inference system and classified the input images into the six basic emotions. We used four datasets to test emotion recognition performance: CK+ (238 images), JAFFE (183 images), DISFA (149 images), and our own facial expression dataset, the Indonesian Mixed Emotion Dataset or IMED (270 images). The image size is 640x490 pixels for CK+; 256x256 pixels for JAFFE; 1280x720 pixels for DISFA; and 720x480 pixels for IMED. Each image contains a single face with frontal orientation.
Table 4 shows the precision and recall scores for each emotion class. Precision indicates the prediction correctness, while recall indicates how correct the recognition result is with respect to the actual data. Table 4(a) is the CK+ confusion matrix; its highest precision score is 0.98 for happy. This means that 49 images from the happy class were correctly classified as happy by the system, with only one image misclassified as surprised. The lowest precision score is 0.91 for sad, with two misclassified images: one angry and one disgusted. The highest recall score in Table 4(a) is 1 for happy, which implies that none of the happy images were misclassified. The lowest recall score is for sad, with one disgusted image and two fearful images misclassified as sad. Table 4(b) is the confusion matrix for the JAFFE dataset. It shows that the highest precision score is 0.93 for surprised and the lowest precision score is 0.81 for sad. Meanwhile, the highest recall score is 1 for disgusted and the lowest recall score is 0.69 for angry. The recall score is low for angry in JAFFE because some images from classes other than happy are misclassified as angry. Table 4(c) is the confusion matrix for the DISFA dataset. It shows that the highest precision score is 0.93 for disgusted and the lowest precision score is 0.85 for angry. Meanwhile, the highest recall score is 0.97 for
disgusted and the lowest recall score is 0.81 for angry and happy. Table 4(d) is the confusion matrix for the IMED dataset. It shows that the highest precision score is 0.91 for surprised and the lowest precision score is 0.87 for angry and fearful. The highest recall score is 1 for surprised and the lowest recall score is 0.80 for happy.
We summarized the confusion matrices in Table 5 using classification measurement tools: average precision, average recall, average accuracy, and the multiclass F1-score. We can see from each row that the performance of our proposed model is consistently high, indicated by the uniform values of average precision, recall, accuracy, and F1-score in each class.
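These summary measures can be derived from a confusion matrix whose rows are predictions and columns are actual classes. A sketch with a toy 2x2 matrix (invented numbers, not the paper's data):

```python
def summary_metrics(cm):
    """Macro-averaged precision, recall, accuracy, and F1 from a confusion
    matrix with rows = predicted class, columns = actual class."""
    n = len(cm)
    diag = [cm[i][i] for i in range(n)]
    pred_totals = [sum(cm[i]) for i in range(n)]                 # row sums
    actual_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    precision = sum(diag[i] / pred_totals[i] for i in range(n)) / n
    recall = sum(diag[j] / actual_totals[j] for j in range(n)) / n
    accuracy = sum(diag) / sum(pred_totals)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, accuracy, f1

# toy matrix: 45 of 50 correct in each of two classes
p, r, a, f = summary_metrics([[45, 5], [5, 45]])
assert abs(p - 0.9) < 1e-9 and abs(r - 0.9) < 1e-9
assert abs(a - 0.9) < 1e-9 and abs(f - 0.9) < 1e-9
```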
The last experiment compared the performance of the proposed model with other classifiers: Fuzzy C-Means (FCM), a Fuzzy Inference System (FIS), and Support Vector Machines (SVM). The experiment used the CK+ dataset (238 images) and classified the six classes of basic emotions. A comparison of the results is shown in Table 6, which shows that our proposed model obtained the highest accuracy, precision, recall, and F1-scores; followed by SVM, FCM, and lastly, FIS.
Figure 9(a) is a graphic representation of the results in Table 5; similarly, Figure 9(b) is a graphic representation of the results in Table 6. The bar graphs represent the different measurement tools (accuracy, precision, recall, and F1-score). In Figure 9(a), the bars have uniform height in each dataset (CK+, JAFFE, DISFA, and IMED), which implies that the proposed model performed well on different datasets. Meanwhile, in Figure 9(b), our proposed model (the yellow bar) outperformed the other classifiers in terms of average accuracy, precision, recall, and F1-scores.

DISCUSSION
From the results of the experiment we can observe two aspects: the performance of facial components linguistic analysis and the performance of emotion recognition using our proposed fuzzy linguistic facial features and fuzzy emotion inference system. The first aspect was reviewed in Table 3, where the average accuracy and standard deviation of facial components identification is 98.15% ± 1.07. This high value is reached because the geometric features extraction has given the best feature descriptors to the FFCIS subsystem. The FFCIS rule base is also powerful in analyzing facial components linguistic conditions. Here we demonstrated that the knowledge developed from psychological rules is the determining factor, resulting in the correct identification of facial components linguistic conditions.
The second aspect is the emotion recognition performance. Our proposed model gained a satisfactory recognition result with an accuracy rate of 0.958, surpassing the other classifiers. The strength of our proposed model in comparison to other methods is that we applied a fuzzy facial components inference and a fuzzy emotion inference, where knowledge is stored in the system. Thus, we did not require any training data for the classification process, unlike FCM and SVM. The difference between the proposed model and the other three classifiers (FCM, FIS, and SVM) is in the image features. We used high-level linguistic facial components features while the other classifiers processed whole face images as input. In our model, we processed only the important facial points and avoided processing whole images whose parts did not contribute emotion signals. Thus, we accelerated the features extraction process by using geometric facial components features.


Figure 1. Diagram of the proposed emotion recognition model.
Figure 1 shows the diagram of our proposed emotion recognition model. Four subsystems are involved: facial points detection; geometric facial components features extraction; the Fuzzy Facial Components Inference System; and emotion recognition. Figure 2(a) shows the screenshot image of the AAM process in locating facial points, while Figure 2(b) shows the FPs detection results.
Figure 3(a) shows FC 1 and FC 2, and Figure 3(b) shows FC 3 areas with their corresponding FPs.

Figure 3. Facial component of eyebrows showing (a) left eyebrow FC 1 and right eyebrow FC 2 (b) inner eyebrow FC 3.
The inner eyebrow FC 3 geometric features are distance ratio and height ratio. The distance ratio of the inner eyebrow (FC 3) in Figure 3(b) is formulated in Equation 3.

Figure 4. Facial components showing (a) left eye FC 4 and right eye FC 5 (b) nose FC 6.
Figure 4(a) shows the left eye (FC 4) and right eye (FC 5) and their FPs. Figure 4(b) shows the nose facial component (FC 6) with two geometric features: wrinkle ratio and height ratio. FC 6 GF 1 indicates how shrunken the nose is, measured using Equation 6.

Figure 6. Fuzzy Facial Components Inference System flowchart.

Furthermore, we displayed the confusion matrices which represent the emotion recognition results of our proposed model on the facial expression datasets: CK+, JAFFE, DISFA, and IMED. A confusion matrix represents the performance of the classification model. Rows display the number of prediction results, while columns indicate the actual classes.


Table 2, whose columns are FC Number, Facial Component, Feature Description, Feature Label, and FC Output, lists the output linguistic variables.

Table 3. Facial Components Identification Results

Table 4. Examples of FFCIS output: Facial components linguistic features

Table 4(a) represents the confusion matrix for the CK+ dataset; Table 4(b) is the confusion matrix for the JAFFE dataset; Table 4(c) is the confusion matrix for the DISFA dataset; and Table 4(d) is the confusion matrix for the IMED dataset.


Table 5. Recognition results from different datasets

Table 6. Recognition results using different methods