GENDER CLASSIFICATION ON SKELETAL REMAINS: EFFICIENCY OF METAHEURISTIC ALGORITHM METHOD AND OPTIMIZED BACK PROPAGATION NEURAL NETWORK

In forensic anthropology, gender classification is one of the crucial steps in developing the biological profile of skeletal remains. Skeletal remains comprise several different parts, and every part contains several features. However, not all features contribute to gender classification. Another limitation of previous research is the absence of parameter optimization for the classifier. Thus, this paper proposes feature selection based on metaheuristic algorithms, namely Particle Swarm Optimization, Ant Colony Algorithm and Harmony Search Algorithm, to identify the most significant features of skeletal remains. Once the set of significant features was obtained, the learning rate and momentum of the Back Propagation Neural Network (BPNN) were optimized to obtain a combination of parameters that produces better gender classification. This study used 1,538 data samples from the Goldman Osteometric Dataset, which consists of femur, humerus and tibia parts. Based on the feature selection results, the Optimized BPNN outperformed other methods for all datasets. The Ant Colony Algorithm-Optimized Back Propagation Neural Network produced the highest accuracy for all parts of the skeleton: 89.44% for the femur, 88.97% for the humerus and 87.52% for the tibia. Hence, it can be concluded that


INTRODUCTION
Skeletal remains convey information in the field of forensic anthropology from which anthropologists extract parameters for biological profiles. Classifying gender is one of the tasks in producing a true identification when developing the biological profile (Stanojevich, 2012). There are various parts of skeletal remains including femur, humerus, and tibia which can be analyzed to classify the gender of skeletal remains. In previous work, deoxyribonucleic acid (DNA) analysis was widely used by forensic anthropologists in the forensic laboratory. However, there are disadvantages in DNA analysis because the essential parameter of the biological profile cannot be extracted if the skeleton is burnt or in a damaged condition (Afrianty, Nasien, Kadir, & Haron, 2014).
There are two methods for gender classification from skeletal remains, namely the morphologic and the osteometric method. The morphologic method involves the observation of sexual traits on bones, while the osteometric method is based on measurements and statistical techniques (Akhlaghi, Sheikhazadi, Naghsh & Dorvashi, 2010). Morphologic methods were used in the gender classification process in previous work. However, due to the variable nature of skeletal remains, the osteometric method has been considered more reliable. Generally, osteometric datasets consist of a large amount of data but lack information, and cannot supply information regarding gender classification in forensic anthropology (Beniwal & Arora, 2012). As a result, in some situations, the classifier is not good enough and does not work well for datasets with many features. Hsu, Chang and Lin (2008) suggested that feature selection may be needed to improve classification accuracy. The researchers believed that a metaheuristic algorithm may work well for datasets with many features. Tu, Chuang, Chang, and Yang (2007) stated that feature selection can serve as a pre-processing tool which leads to an increase in classification accuracy. After a set of significant features is selected, the classifier is used to classify gender. Since gender classification is important in forensic anthropology, selecting the best classifier is important to ensure that a positive identification can be made. A variety of classifiers have been used by other researchers in the classification process, such as Support Vector Machine (SVM), Artificial Neural Network (ANN), J48 and Random Forest. Hence, based on the application of feature selection and classifiers, several frameworks have been proposed (Darmawan et al., 2015; Nazir et al., 2014). Both studies proposed a feature selection process to identify the best feature subset by applying a different type of classifier.
However, there is a limitation identified in the existing framework which is parameter optimization for the classifier. Parameter optimization is important to identify the best combination of parameters to make sure that the classifier is able to produce the best classification result.
The effectiveness of the classifier depends on the parameters used. Most classifiers require parameter settings before the classification process can proceed. If unsuitable parameters are used, the results obtained may be impermissible (Tsai & Lee, 2011). A good combination of parameters will produce desirable results. However, selecting the parameters is not a simple task, as the number of tunable parameter values to consider can be large. Hence, carrying out a parameter optimization technique will assist the classifier in producing a good performance in gender classification.
In this study, feature selection based on metaheuristic algorithms, namely Particle Swarm Optimization (PSO), Ant Colony Algorithm (ACO) and Harmony Search Algorithm (HSA), was integrated with an optimized Back Propagation Neural Network (BPNN) in order to provide an accurate technique for gender classification in forensic anthropology. With the best set of features and optimized BPNN parameters, it was assumed that the performance of gender classification in forensic anthropology could be improved. In previous work by Sivakumar and Chandrasekar (2014), PSO was applied to select features which could contribute to the classification of CT scan images, while k-Nearest Neighbor (kNN) was used as the classifier. The results of that experiment showed that the accuracy increased when the significant features were applied for lung classification. It can be concluded that PSO could serve as an ideal pre-processing tool to help optimize the feature selection process. Another related study was proposed by Nazir et al. (2014). That framework is closely related to the one proposed here, as it focused on gender classification using facial and clothing information. A PSO-GA algorithm was applied to perform feature selection, with Support Vector Machine (SVM) as the gender classifier.
The difference from the previous work lies in the applied classifier. BPNN was believed to improve the performance of gender classification as it has the advantage of dealing with high-dimensional data and continuous features. Besides that, there were several works by previous researchers which applied the BPNN classifier and metaheuristic algorithms in different areas. The algorithms included PSO (Liu et al., 2015; Lu et al., 2014), Cuckoo Search (Nazir et al., 2014) and ACO (Erguzel et al., 2014). All these algorithms worked in different ways in combination with BPNN. Lu et al. (2014) and Jin et al. (2012) applied a metaheuristic algorithm as the optimizer for BPNN, while Li et al. (2017) applied GA for feature selection in the classification of electrocardiograms (ECG). In that research, BPNN worked with general parameter settings without any optimization process. This is a limitation, as the parameter set may not be suitable for ECG classification. Besides this, there was another work proposed by Erguzel et al. (2014), in which the ACO algorithm was applied to select the best set of features while BPNN was used as a classifier for QEEG data classification. The combination of these two algorithms produced good classification accuracy compared to classification without any feature selection technique. However, the technique proposed in this paper differs in that it applies parameter optimization for BPNN, while previous studies used the direct classification process without parameter optimization.
Based on the discussion of existing work, the first gap that can be seen from the existing work is that the number of features to select is fixed and this would ignore other features that could contribute to better gender classification performance. Besides this, SVM classifier is used as it is without a parameter optimization process even though parameter optimization helps in optimizing the classifier to improve classification performance. Hence, based on the related studies, it can be concluded that the proposed technique for gender classification in this paper has shown some differences compared to other process designs.

MATERIALS AND METHODS
A total of 1,538 skeleton measurements from the Goldman Osteometric dataset were used in this study (Sorg, 2005). These measurements have been used in United States cases and all are in millimeters. A sample of the dataset can be obtained from https://web.utk.edu/~auerbach/GOLD.htm, and the dataset has been used by several researchers in their studies. This shows that the dataset is suitable for research on skeletal remains in forensic anthropology. All the measurements of skeletons were taken bilaterally from three long bones: femur, humerus, and tibia. Every part of the skeleton contains different features which contribute to gender classification. Table 1 shows the characteristics of the Goldman Osteometric dataset, while Table 2 shows the description of features involved in this study.

Journal of ICT, 19, No. 2 (April) 2020, pp: 251-277

Table 2
Description of features from Goldman Osteometric dataset (Sorg, 2005)

No. Feature Description
The HML diameter at the midpoint of the diaphysis parallel to the plane between the medial and lateral epicondyles.
5 Humerus 50% Diaphyseal Anteroposterior Diameter (HAPD) The measurement is taken at the midpoint of the diaphysis perpendicular to the plane between the medial and lateral epicondyles.
The measurement is taken from lateral condyle to the medial condyle.
10 Femur Head Anteroposterior Diameter (FHD) The measurement is taken with an orientation perpendicular to the long axis of the femoral diaphysis.
11 Femur 50% Diaphyseal Mediolateral Diameter (FMLD) The mediolateral diameter of FML at the 50% (midpoint) of the diaphysis is taken perpendicular to the long axis of the femoral diaphysis.
12 Femur 50% Diaphyseal Anteroposterior Diameter (FAPD) The measurement is taken at the midpoint of the diaphysis, perpendicular to the plane of the FMLD.
13 Tibia Maximum Length (TML) The measurement is taken from the intercondylar eminence to the most distal aspect of the medial malleolus.
14 Tibia Plateau Mediolateral (Bicondylar) Breadth (TPB) The measurement is taken at the tibial plateau of the condyles beyond the articular surfaces.
15 Tibia 50% Diaphyseal Mediolateral Diameter (TMLD) The diameter is taken perpendicular to the long axis of the femoral diaphysis.
16 Tibia 50% Diaphyseal Anteroposterior Diameter (TAPD) The measurement is taken at the midpoint of the diaphysis, perpendicular to the long axis of the femoral diaphysis.

This study highlights the application of feature selection based on the metaheuristic algorithms PSO, ACO and HSA, incorporated with an optimized BPNN, for gender classification. There are four main steps in this study: the first is the pre-processing step, followed by the feature selection process. The third step is classifying gender using the optimized BPNN and the final step is the validation process. Figure 1 shows the overall flowchart proposed in this study.

Data Pre-processing
Essentially, data pre-processing is needed in order to bring the data into an appropriate range for model development using machine learning (Jiawei, Kamber, Han, Kamber, & Pei, 2012). High-quality data is therefore an important prerequisite for good machine learning performance. The pre-processing phase included data cleaning and data normalization. According to Minakshi Vohra, & Gimpy (2014), most real-world databases cannot avoid problems related to missing information in the input values. The presence of missing values in the dataset causes problems in extracting important information. Data cleaning was therefore included as a step in the gender classification process to remove data with missing values. The next step in the pre-processing phase was normalization. Normalization has a good effect on classification algorithms related to neural networks (Jiawei et al., 2012; Han, 2012). The input or output of the dataset was scaled from 0 to 1 to minimize redundancy and dependency of the attributes. Normalizing the data makes the data model more informative and avoids the problem of a large range of attributes (Jiawei et al., 2012). Table 3 shows the data sample before pre-processing while Table 4 shows the processed data after cleaning and normalization.
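The two pre-processing steps can be illustrated with a minimal sketch, assuming missing values are encoded as None and that normalization is plain min-max scaling per feature. The function names and the sample measurements are hypothetical and are not taken from the Goldman dataset.

```python
from typing import List, Optional

Row = List[Optional[float]]


def drop_incomplete(rows: List[Row]) -> List[List[float]]:
    """Data cleaning: keep only rows that contain no missing (None) values."""
    return [list(row) for row in rows if all(v is not None for v in row)]


def min_max_normalize(rows: List[List[float]]) -> List[List[float]]:
    """Rescale every column (feature) to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    scaled = []
    for row in rows:
        scaled.append([
            (v - lo) / (hi - lo) if hi > lo else 0.0  # constant column maps to 0
            for v, lo, hi in zip(row, lows, highs)
        ])
    return scaled


# Hypothetical measurements in millimetres (illustration only).
raw = [
    [450.0, 46.0],
    [430.0, None],  # missing value: this row is removed by cleaning
    [410.0, 42.0],
    [470.0, 50.0],
]
clean = drop_incomplete(raw)
normalized = min_max_normalize(clean)
```

After this step every feature lies between 0 and 1, so no single measurement dominates the network inputs merely because of its scale.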

Metaheuristic Algorithm based Feature Selection
A feature selection technique was necessary for classification to reduce processing time and increase predictive accuracy. Irrelevant features do not provide any useful knowledge or information for predicting the target concept, so removing them improves predictive performance and helps to preclude over-fitting (Ali & Shahzad, 2012). Thus, feature selection is useful in reducing the number of insignificant, irrelevant, and noisy features that affect classification results. Different types of feature selection algorithms will produce a different set of relevant features and different predictive accuracy rates. Hence, this study applied the PSO, ACO and HSA algorithms in order to select the most significant features in gender classification.

Particle Swarm Optimization
PSO is an optimization technique from the group of evolutionary computation techniques. It is derived from research on swarms such as bird flocking and fish schooling (Fan & Jen, 2019). Instead of using evolutionary operators such as mutation and crossover, for a d-variable optimization problem the PSO algorithm places a flock of particles into the d-dimensional search space with randomly chosen velocities and positions, each particle knowing its best value so far (Ahmad, 2015). The velocity of each particle is adjusted according to its own flying experience and the other particles' flying experience (Ahmad, 2015). The i-th particle is represented as $x_i = (x_{i1}, x_{i2}, \ldots, x_{id})$. The best previous position of the i-th particle is recorded using Equation 1.

$$pbest_i = (pbest_{i1}, pbest_{i2}, \ldots, pbest_{id}) \tag{1}$$

where pbest denotes the personal best, gbest denotes the global best, i denotes the particle and d denotes the dimension of the search space. The index of the best particle among all of the particles in the group is $gbest_d$. The velocity of the i-th particle is recorded using Equation 2.

$$v_i = (v_{i1}, v_{i2}, \ldots, v_{id}) \tag{2}$$

where v denotes the velocity of the particle.
The modified velocity and position of each particle can be calculated using the current velocity and the distance from $pbest_{id}$ to $gbest_d$.
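The update formula itself is not reproduced in this copy of the text. As a hedged illustration, the sketch below uses the canonical PSO update with an inertia weight, v ← w·v + c1·r1·(pbest − x) + c2·r2·(gbest − x) followed by x ← x + v. The values of w, c1 and c2, the function names, and the thresholding of positions into a feature mask are all assumptions made for illustration and are not taken from the paper.

```python
import random
from typing import List, Optional, Sequence, Tuple


def pso_step(
    position: Sequence[float],
    velocity: Sequence[float],
    pbest: Sequence[float],
    gbest: Sequence[float],
    w: float = 0.7,   # inertia weight (assumed value)
    c1: float = 1.5,  # cognitive coefficient (assumed value)
    c2: float = 1.5,  # social coefficient (assumed value)
    rng: Optional[random.Random] = None,
) -> Tuple[List[float], List[float]]:
    """Apply one canonical PSO update to a single particle."""
    rng = rng or random.Random()
    new_position, new_velocity = [], []
    for x, v, pb, gb in zip(position, velocity, pbest, gbest):
        r1, r2 = rng.random(), rng.random()
        # Current velocity plus attraction towards the personal and global best.
        v_next = w * v + c1 * r1 * (pb - x) + c2 * r2 * (gb - x)
        new_velocity.append(v_next)
        new_position.append(x + v_next)
    return new_position, new_velocity


def to_feature_mask(position: Sequence[float], threshold: float = 0.5) -> List[bool]:
    """Assumed encoding: a feature is selected when its coordinate exceeds the threshold."""
    return [x > threshold for x in position]


rng = random.Random(0)
position, velocity = [0.0, 1.0], [0.0, 0.0]
pbest, gbest = [0.0, 1.0], [1.0, 1.0]
new_position, new_velocity = pso_step(position, velocity, pbest, gbest, rng=rng)
mask = to_feature_mask(new_position)
```

In this toy call the second coordinate already coincides with both pbest and gbest, so it does not move, while the first coordinate is pulled towards gbest.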


Ant Colony Algorithm
ACO is inspired by real ant colonies, in which ants move from vertex to vertex in order to exploit the pheromone information (Jaiswal & Aggarwal, 2011). Ants construct solutions from a finite set of n available solution components $C = \{c_{ij}\}$. Solution construction starts with an empty partial solution $s^p = \emptyset$. At each construction step, $s^p$ is extended by adding a feasible solution component from the set $N(s^p) \subseteq C \setminus s^p$. The probabilistic choice of solution components is shown in Equation 4.

$$p(c_{ij} \mid s^p) = \frac{\tau_{ij}^{\alpha} \cdot \eta(c_{ij})^{\beta}}{\sum_{c_{il} \in N(s^p)} \tau_{il}^{\alpha} \cdot \eta(c_{il})^{\beta}}, \quad \forall c_{ij} \in N(s^p) \tag{4}$$

where $\tau_{ij}$ is the pheromone value associated with component $c_{ij}$, $\eta(\cdot)$ is a weighting function that assigns at each construction step a heuristic value to each feasible solution component $c_{ij} \in N(s^p)$, and $\alpha$ and $\beta$ are positive parameters which determine the relation between pheromone information and heuristic information.
The pheromone value is updated in order to increase the pheromone level of good components, while the decrease of pheromone values takes place through pheromone evaporation, using Equation 5.

$$\tau_{ij} \leftarrow (1 - \rho)\,\tau_{ij} + \Delta\tau_{ij} \tag{5}$$

where $\rho$ is the evaporation rate and $\Delta\tau_{ij}$ is the pheromone deposited on component $c_{ij}$. Basically, the best decisions or solutions found earlier by the ants are used to update the pheromone rate to increase the probability in the searching space (Ahmad, 2015). Another component in the ACO algorithm aims to centralize actions which are performed by more than one ant. A collection of information can be used to initiate extra pheromone to enable the search process to seek better solutions.
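The two ACO ingredients described above, a selection probability proportional to pheromone raised to α times heuristic value raised to β, and pheromone evaporation at rate ρ followed by a deposit, can be sketched as follows. This is a minimal illustration under assumed parameter values; the function names, the toy pheromone and heuristic values, and the fixed deposit amount are hypothetical.

```python
from typing import List, Sequence, Set


def selection_probabilities(
    pheromone: Sequence[float],
    heuristic: Sequence[float],
    alpha: float = 1.0,  # weight of pheromone information (assumed value)
    beta: float = 2.0,   # weight of heuristic information (assumed value)
) -> List[float]:
    """Probability of choosing each feasible component, proportional to tau^alpha * eta^beta."""
    weights = [(t ** alpha) * (h ** beta) for t, h in zip(pheromone, heuristic)]
    total = sum(weights)
    return [w / total for w in weights]


def evaporate(pheromone: Sequence[float], rho: float = 0.1) -> List[float]:
    """Decrease every pheromone value by the evaporation rate rho."""
    return [(1.0 - rho) * t for t in pheromone]


def deposit(pheromone: Sequence[float], chosen: Set[int], amount: float) -> List[float]:
    """Add pheromone to the components that belong to a good solution."""
    return [t + amount if i in chosen else t for i, t in enumerate(pheromone)]


# Three candidate features with equal pheromone; the second has a higher heuristic value.
probs = selection_probabilities([1.0, 1.0, 1.0], [1.0, 2.0, 1.0])
after_evaporation = evaporate([1.0, 2.0], rho=0.5)
after_deposit = deposit(after_evaporation, {1}, 0.25)
```

With β = 2 the second feature receives a weight of 4 against 1 for the others, so it is chosen with probability 4/6 in this toy setting.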

Harmony Search Algorithm
HSA is a music-inspired optimization algorithm in which the musician's goal is to search for a perfect state of harmony (Dempster & Drake, 2016). In music improvisation, to produce one harmony vector, each player produces any pitch within a feasible range (Dempster & Drake, 2016). Once all the pitches produce a good solution, each variable's memory stores that experience, which increases the possibility of producing a good solution the next time (Diao & Shen, 2012). The parameters used in HSA include the harmony memory size (HMS), harmony memory (HM), harmony memory considering rate (HMCR), and pitch adjusting rate (PAR).
The procedure of harmony search starts with initialization of the problem and algorithm parameters. The optimization problem is specified as $\min f(x)$ subject to $g(x) > 0$ and $h(x) = 0$, where $g(x)$ is the inequality constraint function and $h(x)$ is the equality constraint function. The HM matrix is initialized randomly. A new harmony vector is generated based on three rules, which are memory consideration, pitch adjustment and random selection, using Equation 6.

$$x'_i \leftarrow \begin{cases} x'_i \in \{x_i^1, x_i^2, \ldots, x_i^{HMS}\} & \text{with probability } HMCR \\ x'_i \in X_i & \text{with probability } 1 - HMCR \end{cases} \tag{6}$$

where $X_i$ is the feasible range of the i-th variable. A component taken from memory is pitch-adjusted with probability PAR using Equation 7.

$$x'_i \leftarrow x'_i \pm rand() \times bw \tag{7}$$

where bw is an arbitrary distance bandwidth and rand() is a random number between 0 and 1. PAR and bw are adjusted using Equation 8.

$$PAR(gn) = PAR_{min} + \frac{PAR_{max} - PAR_{min}}{NI} \times gn \tag{8}$$

where gn = 1, 2, …, NI, PAR(gn) is the pitch adjusting rate for generation or improvisation gn, NI is the number of improvisations, $PAR_{min}$ is the minimum pitch adjusting rate and $PAR_{max}$ is the maximum pitch adjusting rate. If the new harmony is better than the worst harmony in HM, the new harmony is included in the HM and the existing worst harmony is excluded from HM.
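The improvisation loop can be sketched as below. This is a minimal illustration under stated assumptions rather than the authors' implementation: the cost function (a sum of squares), the bounds, the parameter values and all function names are hypothetical, and the pitch adjustment draws a signed random step of at most bw.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple

Harmony = List[float]


def improvise(memory: List[Harmony], bounds: Sequence[Tuple[float, float]],
              hmcr: float = 0.9, par: float = 0.3, bw: float = 0.05,
              rng: Optional[random.Random] = None) -> Harmony:
    """Build one new harmony using memory consideration, pitch adjustment and random selection."""
    rng = rng or random.Random()
    new = []
    for i, (lo, hi) in enumerate(bounds):
        if rng.random() < hmcr:
            value = rng.choice(memory)[i]             # memory consideration
            if rng.random() < par:                    # pitch adjustment
                value += rng.uniform(-1.0, 1.0) * bw  # signed step of at most bw
                value = min(hi, max(lo, value))       # stay inside the feasible range
        else:
            value = rng.uniform(lo, hi)               # random selection
        new.append(value)
    return new


def update_memory(memory: List[Harmony], candidate: Harmony,
                  cost: Callable[[Harmony], float]) -> None:
    """Replace the worst stored harmony when the candidate has a lower cost."""
    worst = max(memory, key=cost)
    if cost(candidate) < cost(worst):
        memory[memory.index(worst)] = candidate


def par_schedule(gn: int, ni: int, par_min: float = 0.1, par_max: float = 0.9) -> float:
    """Pitch adjusting rate that grows linearly with the improvisation number gn."""
    return par_min + (par_max - par_min) * gn / ni


rng = random.Random(42)
bounds = [(-1.0, 1.0), (-1.0, 1.0)]
cost = lambda h: sum(v * v for v in h)  # hypothetical objective to minimize
memory = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(5)]  # HMS = 5
initial_best = min(cost(h) for h in memory)
ni = 200
for gn in range(1, ni + 1):
    candidate = improvise(memory, bounds, par=par_schedule(gn, ni), rng=rng)
    update_memory(memory, candidate, cost)
final_best = min(cost(h) for h in memory)
```

Because a stored harmony is only ever replaced by a better one, the best cost in memory cannot get worse from one improvisation to the next.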


Optimized Back Propagation Neural Network
Back Propagation Neural Network (BPNN) contains three (3) types of layers, which are the input layer, hidden layer, and output layer (Jaswante, Khan, & Gour, 2014). Each layer contains neurons, which are connected to the neurons of the adjacent layer. In this experiment, the number of input neurons was based on the number of features in the dataset and the number of output neurons was set according to the gender classes, which were either male or female.
The architecture in this experiment was based on the work of previous researchers. The number of hidden layers used in this experiment was two and the number of hidden neurons was based on the rule of thumb by a previous researcher (Attoh-Okine, 1999). The second step in BPNN was to set its parameters. The parameters involved were momentum and learning rate. It was important to select the best combination of these two parameters. If unsuitable parameters were selected, the results obtained may be impermissible (Tsai & Lee, 2011). Since parameters affect the performance of the classifier, the aim of this paper was to find the best combination of momentum and learning rate by applying automated parameter tuning.
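To show how the two parameters enter training, the sketch below applies the standard gradient-descent weight update with momentum to a single weight: the step is the negative gradient scaled by the learning rate plus a fraction (the momentum) of the previous step. The one-dimensional error function and the default values are purely illustrative assumptions.

```python
from typing import Tuple


def momentum_update(weight: float, gradient: float, prev_delta: float,
                    learning_rate: float = 0.9, momentum: float = 0.1) -> Tuple[float, float]:
    """Return the updated weight and the step taken, using gradient descent with momentum."""
    delta = -learning_rate * gradient + momentum * prev_delta
    return weight + delta, delta


# Toy error surface E(w) = (w - 3)^2 with gradient 2 * (w - 3); the minimum is at w = 3.
w, delta = 0.0, 0.0
for _ in range(50):
    grad = 2.0 * (w - 3.0)
    w, delta = momentum_update(w, grad, delta)
```

A learning rate that is too large makes such updates overshoot and diverge, while one that is too small slows convergence, which is why the combination of the two values matters.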
In previous work, self-tuning parameter was used to find out the best parameter to be incorporated into the classification process. Based on the selftuning experiment, the value of momentum obtained was 0.1, while learning rate was 0.9. However, this type of parameter tuning had disadvantages as it consumed a lot of time to tune the parameter and was not applicable for a large range of parameter values. By taking into account these disadvantages in the self-tuning parameter, automated parameter tuning with cross-validation method incorporated with grid search algorithm was applied in this experiment to automatically determine the best combination of momentum and learning rate without the trial and error process. Figure 2 illustrates how automated parameter tuning works in order to pick the best set of learning rate and momentum.
Grid search assembles every possible combination of parameter values (Bergastra & Bengio, 2012). It trains the classifier with each pair of parameters and evaluates the performance on the validation set. The ranges of momentum and learning rate used in this experiment were [0.1-0.3] and [0.7-0.9], respectively. Grid search is simple to implement and reliable in low-dimensional spaces (Bergastra & Bengio, 2012). However, parameter optimization can lead to overfitting. This problem can be mitigated by using cross-validation during parameter optimization. Cross-validation works by training the classifier on one set of data and testing it on different sets of data. Grid search with 10-fold cross-validation was applied to evaluate the performance of each pair of momentum and learning rate, and the parameter combination with the highest BPNN accuracy was then selected. Once the best set of parameters had been selected, the data went through the process of gender classification. The dataset was split into training and testing sets using k-fold cross-validation. Five-fold cross-validation was used to prepare five different sets of training and testing data and was implemented for each dataset to determine the average accuracy of gender classification. The performance of gender classification was measured by the accuracy obtained. The formula for accuracy is shown in Equation 9.
$$\text{Accuracy} = \frac{\text{number of correctly classified samples}}{\text{total number of samples}} \times 100\% \qquad (9)$$
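The accuracy measure and the averaging over folds described above can be sketched as follows, assuming accuracy is the percentage of correctly classified samples. The function names and the example labels are hypothetical and serve only as an illustration.

```python
def accuracy(true_labels, predicted_labels):
    """Percentage of samples whose predicted gender matches the true gender."""
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * correct / len(true_labels)


def mean_accuracy(fold_accuracies):
    """Average accuracy over the cross-validation folds."""
    return sum(fold_accuracies) / len(fold_accuracies)


# Hypothetical labels for one small test fold (M = male, F = female).
fold_score = accuracy(["M", "F", "M", "F", "M"], ["M", "F", "F", "F", "M"])
```

In a five-fold setting, `accuracy` would be evaluated once per test fold and the five values passed to `mean_accuracy`.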
The process of gender classification was carried out using BPNN with two types of parameter setting. The first one was the standard parameter setting based on previous researchers (Tsai & Lee, 2011; Rene et al., 2013) and the second parameter setting was an Automated Parameter Tuning as shown in Figure 3. Automated parameter tuning was based on the parameter optimization incorporating grid search algorithm and cross-validation which was carried out in the experiment.
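The automated tuning procedure, grid search evaluated with k-fold cross-validation, can be sketched in plain Python as below. This is an illustrative outline under stated assumptions, not the implementation used in the experiment: the step of 0.1 within each range is assumed, `toy_scorer` is a hypothetical stand-in for training the BPNN, and the folds are contiguous without shuffling or stratification.

```python
from itertools import product


def k_fold_splits(n_samples, k):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, validation
        start += size


def grid_search(n_samples, learning_rates, momentums, train_and_score, k=10):
    """Return (best_params, best_mean_accuracy) over all parameter pairs.

    train_and_score(train_idx, val_idx, learning_rate, momentum) is a
    caller-supplied function that trains the classifier on the training
    indices and returns its accuracy on the validation indices.
    """
    best_params, best_score = None, float("-inf")
    for learning_rate, momentum in product(learning_rates, momentums):
        scores = [
            train_and_score(train, val, learning_rate, momentum)
            for train, val in k_fold_splits(n_samples, k)
        ]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = (learning_rate, momentum), mean_score
    return best_params, best_score


def toy_scorer(train_idx, val_idx, learning_rate, momentum):
    """Stand-in for BPNN training; peaks at an arbitrary parameter pair."""
    return 100.0 - 10.0 * (abs(learning_rate - 0.8) + abs(momentum - 0.2))


best_params, best_score = grid_search(
    n_samples=50,
    learning_rates=[0.7, 0.8, 0.9],
    momentums=[0.1, 0.2, 0.3],
    train_and_score=toy_scorer,
)
```

Replacing `toy_scorer` with a function that actually trains a BPNN on the training indices would turn the sketch into a working tuner. The cost grows with the number of parameter pairs multiplied by the number of folds, which is why grid search suits small grids such as this one.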

Result Validation
The last step is the validation of results using t-test statistical analysis. This step identifies whether the algorithm applied produced a significant or insignificant result. The t-test is a well-established method for showing the relationship between features and is widely used by researchers in the verification process. Cattaneo (2007) applied the sample t-test and reported it to be a powerful method that performed well across observers and features. In the t-test, significance is assessed by the p-value. If the p-value is below 0.05, the result is considered significant (Ye et al., 2016). The t-test can be calculated using Equation 10:

$$t = \frac{\text{comparison of mean value}}{\text{standard error}} \qquad (10)$$

where the standard error is computed from SD, the standard deviation, and N, the sample size.
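As an illustration of Equation 10, the sketch below computes a t statistic for paired observations, taking the standard error as SD divided by the square root of N. The paired form, the function name, and the per-fold accuracies are assumptions made for the example; they are not results from this study.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(first, second):
    """t = mean difference / (SD of differences / sqrt(N))."""
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    standard_error = stdev(differences) / sqrt(n)
    return mean(differences) / standard_error


# Hypothetical per-fold accuracies for two models (not results from the study).
baseline = [84.0, 85.0, 86.0, 85.0, 84.0]
optimized = [87.0, 89.0, 88.0, 90.0, 88.0]
t_value = paired_t_statistic(optimized, baseline)
```

Turning the t statistic into a p-value requires the t distribution with N - 1 degrees of freedom, which the standard library does not provide; a statistics package such as SciPy (`scipy.stats.ttest_rel`) can be used for that step.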

RESULTS AND DISCUSSION
This section presents the evaluation of gender classification with feature selection and optimized BPNN. The results are divided into three subsections: the results and analysis of selected features based on PSO, ACO, and HSA using BPNN; the results and analysis of gender classification using optimized BPNN; and the t-test verification based on selected features and accuracy. All the results obtained are compared in order to determine the effectiveness of feature selection and parameter optimization in the gender classification framework. Table 5 shows the features selected by the three feature selection techniques. The most significant features are identified as those selected at the highest fitness value, as they contain more information. Based on Table 5, PSO and HSA select the same eight (8) features for all datasets, while ACO selects a different set of features. However, for the tibia part, all three algorithms select the same seven (7) significant features. The selected features are then subjected to a statistical test to determine whether they are significant. The results are reported in the 'Verification of t-test' part. The selected features determine the performance of gender classification in forensic anthropology.

In order to find out the effect of feature selection on gender classification performance, BPNN classification was performed on the datasets with full features and with selected features to determine the accuracy of gender classification. The BPNN parameter setting was based on the self-tuning parameter results. The results obtained from BPNN with the full features dataset were compared with those from the datasets of features selected by PSO, ACO, and HSA. The comparison was performed to examine the usefulness of feature selection and to verify feature selection as a good solution for gender classification in forensic anthropology. Table 6 presents the results of BPNN with the full features dataset and the selected features datasets. Based on Table 6, feature selection had a positive impact on the accuracy of gender classification. Each classification result showed an increase in accuracy when the dataset with selected features was applied. For the femur part, ACO-BPNN produced the highest accuracy compared to PSO-BPNN and HSA-BPNN, increasing from 85.48% to 87.03%.
For the humerus part, PSO-BPNN and HSA-BPNN produced the same accuracy when the selected features datasets were applied, an increase of 1.41% in accuracy rate. For the tibia part, the accuracy of PSO-BPNN and HSA-BPNN increased by 0.66%, while that of ACO-BPNN increased by 0.44%. Hence, it can be concluded that different datasets produced different levels of classification accuracy as different feature selection techniques were applied.

Result and Analysis of Gender Classification using OBPNN
Automated parameter tuning, or parameter optimization, aims to find the optimal parameters for the BPNN classifier. The process finds the best combination of parameters using cross-validation incorporated with the grid search algorithm. The results of parameter optimization differ according to the dataset applied. Table 7 presents the optimized learning rate and momentum of BPNN for each dataset. The optimized parameters obtained from the tuning process were used to determine the performance of gender classification in forensic anthropology. BPNN classification was performed on each dataset with the optimized parameters to determine the accuracy of gender classification with the full features and selected features datasets. To gain insight into the usefulness of optimized BPNN, a comparison between BPNN and optimized BPNN was performed. The main objective was to verify that the proposed optimized BPNN was able to deliver better performance in gender classification in forensic anthropology. A comparative analysis between BPNN and optimized BPNN with selected features and full features is presented in Table 8. Based on Table 8, the optimized parameters had a positive impact on gender classification performance. Each dataset showed an increase in accuracy when optimized BPNN was used as the classifier instead of BPNN alone. The increase in accuracy on the full features dataset was 0.33% for the femur part, 0.17% for the humerus part and 0.16% for the tibia part. Among the selected features datasets, the humerus part based on ACO features showed the highest increase in accuracy, from 84.83% to 88.97%.
Hence, it can be concluded that optimized BPNN had a positive effect on gender classification performance in forensic anthropology.
The main contribution of this study is the analysis of gender classification performance when feature selection with optimized BPNN was applied in the gender classification process. Hence, the performance of gender classification using BPNN and feature selection with optimized BPNN was analyzed. For the femur part, BPNN produced 85.48% accuracy and it increased to 87.48% for PSO-OBPNN, 89.44% for ACO-OBPNN and 89.04% for HSA-OBPNN. For the humerus part, the accuracy obtained for BPNN was 83.58% and it increased to 85.73% for PSO-OBPNN, 88.97% for ACO-OBPNN and 88.81% for HSA-OBPNN. For the tibia part, BPNN produced 83.72% accuracy and it increased to 85.23% for PSO-OBPNN, 87.52% for ACO-OBPNN and 87.36% for HSA-OBPNN. The comparison between BPNN and feature selection with optimized BPNN indicates that the combination of feature selection and parameter optimization helped to produce good gender classification performance. In addition, a verification phase was conducted to determine whether the proposed gender classification framework was acceptable. The next section discusses the results of this verification phase.
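The improvements quoted above can be checked with a few lines of arithmetic. The sketch below uses only the accuracies reported in this paragraph; the dictionary layout and the function name are illustrative choices.

```python
# Accuracies (%) reported in the text for plain BPNN and each optimized model.
reported = {
    "femur": {"BPNN": 85.48, "PSO-OBPNN": 87.48,
              "ACO-OBPNN": 89.44, "HSA-OBPNN": 89.04},
    "humerus": {"BPNN": 83.58, "PSO-OBPNN": 85.73,
                "ACO-OBPNN": 88.97, "HSA-OBPNN": 88.81},
    "tibia": {"BPNN": 83.72, "PSO-OBPNN": 85.23,
              "ACO-OBPNN": 87.52, "HSA-OBPNN": 87.36},
}


def gains_over_baseline(scores, baseline="BPNN"):
    """Gain of each model over the baseline, in percentage points."""
    return {
        model: round(value - scores[baseline], 2)
        for model, value in scores.items()
        if model != baseline
    }


for part, scores in reported.items():
    gains = gains_over_baseline(scores)
    best = max(gains, key=gains.get)
    print(f"{part}: best = {best} (+{gains[best]:.2f} points)")
```

For every skeleton part the largest gain belongs to ACO-OBPNN, with the humerus showing the biggest improvement over plain BPNN.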

Verification of T-test based on List of Features and Accuracy
The t-test verification process was implemented in order to verify whether the proposed gender classification framework was significant or not. The verification was performed on both the selected features and the classification accuracy.

Verification of T-test based on List of Features
The verification process was conducted for all three parts of the skeleton, starting with the femur. Table 9 shows the mean, standard deviation (S.D), standard error (S.E) and p-value for the femur part for both genders. Based on Table 9, six (6) out of 14 features were not significant: LFBL, LFMLD, RFML, RFBL, RFMLD, and RFAPD. These insignificant features matched the results of the PSO and HSA based feature selection exactly, while one insignificant feature, RFAPD, was selected by the ACO feature selection. The next skeleton part is the humerus. Table 10 shows the mean, standard deviation (S.D), standard error (S.E) and p-value for the humerus part for both genders. Based on Table 10, three features produced a p-value of more than 0.05. All three features were therefore insignificant, and they were the same features excluded when PSO and HSA based feature selection were applied to the dataset. The features selected by ACO differed: the LHML feature was significant but was not selected by the ACO algorithm, whereas LHAPD, which was insignificant, was selected. The last part is the tibia. Table 11 shows the mean, standard deviation (S.D), standard error (S.E) and p-value for the tibia part for both genders. The t-test produced one insignificant feature, RTML, which matched the features selected by the PSO, ACO, and HSA algorithms exactly.
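The comparison between the t-test outcome and the selected feature sets can be expressed as simple set operations. In the sketch below the feature names, p-values, and selected set are hypothetical placeholders chosen only to show the two kinds of disagreement described above; they are not values from Tables 9 to 11.

```python
def split_by_significance(p_values, alpha=0.05):
    """Partition feature names into significant and insignificant sets."""
    significant = {name for name, p in p_values.items() if p < alpha}
    return significant, set(p_values) - significant


def selection_mismatches(selected, significant, insignificant):
    """Features where the selector and the t-test disagree."""
    return {
        "selected_but_insignificant": selected & insignificant,
        "significant_but_dropped": significant - selected,
    }


# Hypothetical p-values and selection, for illustration only.
p_values = {"f1": 0.001, "f2": 0.030, "f3": 0.200, "f4": 0.049}
selected = {"f1", "f3", "f4"}
significant, insignificant = split_by_significance(p_values)
report = selection_mismatches(selected, significant, insignificant)
```

An empty report corresponds to a perfect match between the t-test and the feature selection, while a non-empty one mirrors a case where an insignificant feature was selected or a significant one was left out.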

Verification of T-test based on Accuracy
The verification based on the accuracy of gender classification was conducted in order to find out whether the difference in accuracy between BPNN and PSO-OBPNN, ACO-OBPNN and HSA-OBPNN was significant or not. Tables 12 to 14 show the mean, standard deviation (S.D), standard error (S.E) and p-value used to test the significance of accuracy. The results for Tables 10 and 12 showed that the p-value produced from the analysis of all models was less than 0.05, which indicated that the differences between the models were significant. However, Table 12 showed that the p-values between the BPNN and ACO-BPNN models were insignificant. This was because of the insignificant feature selected by the ACO. Overall, it can be concluded that applying feature selection and parameter optimization has a significant effect on the gender classification process in forensic anthropology.

Journal of ICT, 19, No. 2 (April) 2020, pp: 251-277

CONCLUSION
Gender classification in forensic anthropology aims to determine whether skeletal remains are male or female. Two main issues can be highlighted. The first is that gender classification always deals with many features even though some of them are unnecessary in the classification process. The second is the parameter setting of the classifier, which is needed to produce good gender classification performance.
If an unsuitable parameter is applied, classification performance will be affected. Hence, to overcome these drawbacks, this paper proposed a gender classification framework that selects the most significant features by applying metaheuristic algorithm based feature selection and parameter optimization for the BPNN classifier.
Based on the proposed framework, the results showed that not all features are necessary to produce a better performance of gender classification in forensic anthropology. Besides that, the optimized BPNN also showed better results compared to BPNN. Hence, a combination of feature selection and parameter optimization for the classifier is a good solution in order to enhance accuracy of gender classification. The results of the feature selection-Optimized BPNN outperformed the results of BPNN and feature selection-BPNN. It can be concluded that feature selection and parameter optimization provided a good outcome in the gender classification process.
In future work, gender classification in forensic anthropology can be improved by applying other classifiers, such as the Spiking Neural Network (SNN), which could deliver better results. In addition, parameter optimization can be performed with metaheuristic algorithms instead of the grid search and cross-validation technique.