HIGH ACCURACY EEG BIOMETRICS IDENTIFICATION USING ICA AND AR MODEL

Modern biometric identification methods combine interdisciplinary approaches to enhance person identification and classification accuracy. One popular technique for this purpose is the Brain-Computer Interface (BCI). The signal obtained from a BCI is then processed by the Autoregressive (AR) model for feature extraction. Many studies in the area find that, for more accurate results, the signal must be cleaned before any useful feature information is extracted. This study proposes Independent Component Analysis (ICA), the k-NN classifier, and AR as combined techniques for electroencephalogram (EEG) biometrics to achieve the highest personal identification and classification accuracy. However, there is a classification gap between using ICA combined with the AR model and using the AR model alone. Therefore, this study goes one step further by modifying the feature extraction of AR and comparing the outcome with the proposed approach in light of prior research. The experiment based on four relevant locations shows that the combined ICA and AR can achieve higher accuracy than the modified AR. More combinations of channels and subjects are required in future research to explore the significance of channel effects and to enhance the identification accuracy.


INTRODUCTION
Biometrics is a person authentication and identification technique based on physical characteristics of the human body. The biometrics function can be divided into two steps: identification and authentication. The first step performs verification and validation of individuals in a database or a group of persons. The latter step performs acceptance of individuals. Conventional biometric traits are likely to be forged owing to advances in medicine and information technology such as plastic surgery, high resolution devices, or advanced digital tools. Electroencephalography (EEG) is considered a biometric trait because it is generated from the human brain by recording the electrical activity of millions of neurons from the same position (Teplan, 2002). Moreover, because EEG devices are noninvasive, inexpensive, and portable, many studies focus on using EEG as biometrics (Maiorana et al., 2016; Mu et al., 2016; Rocca et al., 2012; Tangkraingkij et al., 2013).
Generally, EEG can be used to diagnose the function of the brain. The functional area in the human brain is divided into four areas, namely the frontal, parietal, temporal, and occipital lobes. The frontal lobe is responsible for conscious thought. The parietal lobe combines sensory information from various senses. The temporal lobe handles the auditory sense. The sense of sight is processed by the occipital lobe. EEG can be grouped into five different rhythms based on their frequencies: Delta rhythm (1-4 Hz) is seen during deep sleep in adults and in infants as an unusual activity; Theta rhythm (4-8 Hz) occurs in the drowsiness of adults and in the waking up of children; Alpha rhythm (8-12 Hz) is seen normally while relaxing with eyes closed; Beta rhythm (12-30 Hz) is associated with anxious thinking and active concentration; and Gamma rhythm (30-100 Hz) is associated with specific cognitive states. Many studies have focused on person authentication based on these EEG rhythms (Campisi et al., 2011; Rocca et al., 2012). Furthermore, EEG biometrics can be used with Event-Related Potentials (ERP). ERP measures brain waves while each subject performs some activity such as looking at a picture, listening to audio, clenching a fist, and so on. Many studies on ERP using EEG biometrics obtain satisfactorily high accuracy (Kumari & Vaish, 2016; Ruiz-Blonded et al., 2016).
In neuroscience, Independent Component Analysis (ICA) is an important technique used to remove undesirable artifacts in EEG signals to obtain purer brain waves from each location of the brain (Albera et al., 2012). However, there are only a few studies concentrating on ICA in EEG signal analysis (He & Wang, 2009; Tangkraingkij et al., 2013), but none uses ICA to clean EEG signals. As a consequence, we set out to explore these combined techniques in this study to attain higher person identification accuracy.
The organization of this paper is as follows. Some related work is recounted in the next section. Then, background theories are discussed, followed by the methodology. Next, the experimental results are described, and some final thoughts are given in the conclusion and future work section.

RELATED WORK
In Brain-Computer Interface (BCI) research, it was noted that ICA could be used to decompose EEG signals so as to obtain the sources of brain signals (Albera et al., 2012; Delorme et al., 2012). In EEG biometrics, Tangkraingkij's studies (Tangkraingkij et al., 2010; Tangkraingkij et al., 2013) used ICA to clean EEG signals collected from 20 subjects in the resting state. They grouped the data into varied lengths (500, 1000, 1500, and 3000 data points) and used an artificial neural network for classification. Their results showed that 4 channels of EEG at 1,000 data points per channel achieved a high average identification accuracy of 98.51%. The limitation of their work was that too many data points were used for classification, namely 4,000 data points for 4 channels. Unfortunately, such a highly accurate result could only be obtained in offline processing. He and Wang (2009) used the AR model and 5 dominating Independent Components from 5 brain areas with 7 subjects in motion tasks. The identification accuracy was not as high as anticipated.
For this reason, more sophisticated feature extraction techniques were required in the identification process, such as spectral coherence (Del Pozo-Banos et al., 2014), power spectrum (Harshit et al., 2016), fuzzy entropy (Mu et al., 2016), and the AR model (Campisi et al., 2011; Maiorana et al., 2016; Rocca et al., 2012). Several studies have applied the AR model to EEG biometrics. For example, Poulos (Poulos et al., 1999) used the AR model to achieve 80 to 95 percent accuracy with 4 subjects. Paranjape (Paranjape et al., 2001) used AR and variance/covariance matrices with statistical tools to model the EEG signal from a single channel. The experiment employed 40 subjects in the resting state with eyes opened and eyes closed. The accuracy was only 80 percent. Yazdani (Yazdani et al., 2008) employed the AR model, the peak of power spectral density (PSD) for feature extraction, and LDA for dimension reduction to improve accuracy. The resulting accuracy reached 100 percent with 20 subjects. Riera (Riera et al., 2008) introduced a new feature extraction approach using combinations of five models, namely AR, Fourier Transform, Mutual Information, Coherence, and Cross Correlation. They applied Fisher's Discriminant Analysis as the classifier in their study. There were 51 subjects and 36 intruders. The accuracy ranged from 87.5 to 98.1 percent. Campisi (Campisi et al., 2011) used reflection coefficients of the AR model to yield 96.08% accuracy with 48 subjects. Rocca et al. (2012) and Maiorana et al. (2016) reported that using the reflection coefficients of the AR method gave a higher discriminating capability than using PSD and spectral coherence (COH) as feature extraction methods. The results yielded higher identification accuracy when compared to other features. Thus, the reflection coefficient was considered as the main feature in this work.

Autoregressive Model (AR)
The autoregressive (AR) model is a representation of a signal using its own previous values. It specifies that the present output is a linear combination of the previous z outputs combined with white noise. In general, AR(z) denotes an autoregressive model of order z, which is defined by Eq. (3):

$$ x_t = \sum_{i=1}^{z} \theta_i \, x_{t-i} + n_t \qquad (3) $$

where $x_t$ is the output signal at sample point t, $\theta_i$ is the i-th coefficient of AR, z is the order (less than t), and $n_t$ is a white noise input. The coefficients $\theta$ can be determined from the Yule-Walker equations and solved by the Levinson method to obtain the reflection coefficients of the AR model (Maiorana et al., 2016). However, AR with the Burg method (Bos et al., 2002) is a popular approach for evaluating the AR coefficients and reflection coefficients since it yields a high resolution for small numbers of data points. The Burg method evaluates the coefficients by first estimating the reflection coefficient, defined as the last autoregressive parameter estimate for each model order z. Details of the Burg method can be found in de Hoon et al. (1996). Since the Burg method can compute the reflection coefficients directly from the signal $x_t$, this study adopted this method to compute the reflection coefficients of each channel as the main feature for the proposed biometric identification.
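To make the idea of computing reflection coefficients directly from the signal concrete, the following is a minimal pure-Python sketch of Burg's recursion. The function name `burg_reflection_coefficients` and the sign convention (prediction-error form, in which a signal obeying x[t] = 0.5 x[t-1] gives a negative first coefficient) are assumptions made for this illustration; it is not claimed to be the implementation used in this study.

```python
import random


def burg_reflection_coefficients(x, order):
    """Estimate the first `order` reflection coefficients of `x` with Burg's recursion."""
    n = len(x)
    if not 0 < order < n:
        raise ValueError("order must satisfy 0 < order < len(x)")
    f = [float(v) for v in x]  # forward prediction errors
    b = [float(v) for v in x]  # backward prediction errors
    coeffs = []
    for m in range(order):
        # Stage m+1 pairs f[t] with b[t-1] for t = m+1 .. n-1.
        fwd = f[m + 1:]
        bwd = b[m:n - 1]
        num = -2.0 * sum(p * q for p, q in zip(fwd, bwd))
        den = sum(p * p for p in fwd) + sum(q * q for q in bwd)
        k = num / den if den > 0.0 else 0.0
        coeffs.append(k)
        # Lattice update of both error sequences with the new coefficient.
        new_f, new_b = f[:], b[:]
        for t in range(m + 1, n):
            new_f[t] = f[t] + k * b[t - 1]
            new_b[t] = b[t - 1] + k * f[t]
        f, b = new_f, new_b
    return coeffs


# Toy check: a decaying signal with x[t] = 0.5 * x[t-1].
toy = [0.5 ** t for t in range(20)]
print(burg_reflection_coefficients(toy, 1))  # first coefficient is close to -0.8

# A 128-point noise segment reduced to 10 features (hypothetical data).
rng = random.Random(0)
segment = [rng.gauss(0.0, 1.0) for _ in range(128)]
features = burg_reflection_coefficients(segment, 10)
print(len(features))
```

Every coefficient produced this way has magnitude at most 1, which is one reason reflection coefficients are convenient as bounded features. Other texts define the reflection coefficient with the opposite sign, so the convention should be checked before comparing values across tools.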

k-Nearest Neighbor (k-NN)
k-Nearest Neighbor is a simple classifier that calculates the distance between a test sample and every sample in the trained data set. Each test sample (V) is predicted as belonging to the class, or subject, whose trained samples have the shortest Euclidean distance to it. The decision is taken in accordance with the majority voting rule: the identity of a class depends on the number of occurrences among the k votes. The Euclidean distance can be expressed by Eq. (4):

$$ d_{\text{Euclid}} = \min_{1 \le s \le T} \sqrt{\sum_{j=1}^{n} \left( V_j - X_{s,j} \right)^2} \qquad (4) $$

where $d_{\text{Euclid}}$ denotes the minimum distance between the test data set and the trained data set, n is the number of features, V is a vector in the test data set, $X_s$ is the s-th vector in the trained data set, and T is the number of samples in the trained data set. The correct recognition rate (CRR) is used to measure the accuracy of classification. In essence, it represents the true positive numbers of each class. CRR is defined by Eq. (5) and Eq. (6) as follows:

$$ \mathrm{CRR}_j = \frac{\sum_{i=1}^{u} P(i)}{\text{number of test samples in fold } j} \times 100 \qquad (5) $$

where j indexes the folds, f denotes the number of folds for cross validation, u is the number of subjects (classes), and P(i) is the number of correct recognitions of each class or subject, that is, the number of true positives of each class. Finally, the $\mathrm{CRR}_j$ values of all folds are averaged to give the CRR, that is,

$$ \mathrm{CRR} = \frac{1}{f} \sum_{j=1}^{f} \mathrm{CRR}_j \qquad (6) $$
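The classification rule and the recognition rate described above can be sketched in a few lines of pure Python. The function names (`knn_predict`, `crr_percent`, `mean_crr`), the tie-breaking rule, and the toy data are assumptions introduced for illustration only.

```python
import math
from collections import Counter


def euclidean(v, x):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v, x)))


def knn_predict(train_x, train_y, v, k=1):
    """Predict the label of `v` by majority vote among its k nearest training samples."""
    ranked = sorted(
        ((euclidean(v, x), y) for x, y in zip(train_x, train_y)),
        key=lambda pair: pair[0],
    )[:k]
    votes = Counter(label for _, label in ranked)
    best = max(votes.values())
    # Assumed tie-break: among the most-voted labels, prefer the nearest neighbour.
    for _, label in ranked:
        if votes[label] == best:
            return label


def crr_percent(true_labels, predicted_labels):
    """Correct recognition rate of one fold, in percent."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * correct / len(true_labels)


def mean_crr(fold_rates):
    """Average the per-fold recognition rates."""
    return sum(fold_rates) / len(fold_rates)


# Toy data: two subjects, two training samples each.
train_x = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
train_y = ["A", "A", "B", "B"]
print(knn_predict(train_x, train_y, [0.2, 0.1], k=1))
print(knn_predict(train_x, train_y, [5.0, 4.0], k=3))
print(mean_crr([90.0, 100.0]))
```

With k = 1 the vote reduces to picking the single nearest training sample, which is the setting that gave the best results in the experiments reported later.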

METHODOLOGY

Data Collection
Data of the EEG signals were obtained from the prior research data set of Tangkraingkij. These EEG signals were recorded by the Chulalongkorn Comprehensive Epilepsy Program (CCEP) of King Chulalongkorn Memorial Hospital. There were 20 normal subjects, eight men and twelve women, tested in the resting state, motionless, and performing no task during recording. The ages of the subjects ranged from 12 to 40 years. Recording was based on the 10-20 international system defining the location of scalp electrodes.
During recording, the mastoid areas A1 and A2 were electrically linked and used as the reference with the mono-polar montage. The EEG amplifier was the Grass model 8 plus. The sampling rate was 200 Hertz. The EEG signals were digitized and notch filtered at 50 Hertz by the BMSI board using the Stellate Harmony EEG software. The digitized EEG data were exported as EDF files. For each channel of each subject, 3,000 data points were collected in 15 seconds. This study selected four channels, namely F4, C4, P4, and O2, as depicted in Figure 1, based on prior work (Tangkraingkij et al., 2013), which established these as the relevant locations for classification, having the highest average accuracy of 98.51%. Moreover, the selected channels were positions of the brain having biological functions that were essential for person identification.

Proposed Method and Experiment Setup
This study conducted 2 approaches. The first approach was called the modified AR method, consisting of 4 processes as shown in Figure 2. The first process filtered the EEG data of the four selected channels to the frequency range of 0.5-40 Hz. Subsequent processes were data segmentation, feature extraction with the AR model, and classification. The second approach was the proposed approach called ICA with the AR method, consisting of 5 processes as shown in Figure 3. The first process applied ICA using the SOBIRO algorithm to the EEG data set. The remaining four processes were the same as those of the modified AR method. The results of both approaches were compared with those of Tangkraingkij et al. (2013).

// Step 3: Generate matrix V u for segmentation

end for
The following setups were established to accommodate the experiment. Each process is as follows.

ICA
The objective of ICA is to clean and separate individual EEG signals from those obtained at other locations of the brain. All channels of each subject were processed with this method. This process is shown in line 4.

9.
  n is the number of data points per segment.
11. // Step 4: Generate order z of AR using Burg method
12. for z = 1 to 60 do
13.

In Figure 5, we compared the highest accuracy obtained with each segment length. Given the limited number of data points per segment, 128 and 256 data points were used instead of 100, 200, and 300 data points so that the number of experiments would be reduced while still maintaining relatively equivalent identification coverage. Moreover, with 128 and 256 data points, the highest accuracy was obtained at a lower order of the AR model, as shown in Tables 1 and 4.

The value of k for the k-NN classifier was set to 1, 3, and 5 according to the experimental parameter setting. Cross validation ensured reliable generalization to the independent data set. All results from each fold were averaged to obtain the final results, which in turn were compared with Tangkraingkij's results (2013).

EXPERIMENTAL RESULTS
The experiments for the authentication process in this work comprised the following. Four channels were used for classifying 20 subjects. The k-NN classifier was set at k = 1, 3, and 5. The classifier was validated with 5-fold and 10-fold cross validation. The number of data points per segment (100, 200, 300, 128, and 256) was evaluated with both the modified AR method and ICA with the AR method. The results are shown in Figure 5.
Table 1 shows the accuracy of ICA with the AR method using 5-fold and 10-fold cross validation. The highest accuracies for 5-fold and 10-fold cross validation both appear at the 39th order of the AR model, at 97.22 and 97.66 percent, respectively, with k equal to 1. It is apparent that the 10-fold cross validation reaches higher accuracy than the 5-fold. Table 2 shows the accuracy of the AR method with 5-fold and 10-fold cross validation. The highest accuracies for 5-fold and 10-fold cross validation were 95.77 percent at the 32nd order and 96.79 percent at the 30th order, respectively. It can be seen that the 10-fold cross validation with k=1 still reaches higher accuracy than other values.
In Figure 6, the results of the two approaches differ at the lower orders. Apparently, ICA and AR seem to filter out unwanted noise better than AR without ICA. The experimental results of Tangkraingkij showed that four channels, F4, C4, P4, and O2, with 1,000 data points could identify 20 subjects with 98.50 percent accuracy. The comparisons are shown in Table 3. The results of both proposed methods in Table 3 are slightly lower than those of Tangkraingkij. While Tangkraingkij used 1,000 data points, we used only 128 data points, which yielded results nearly as good as those of Tangkraingkij. For 256 data points, the results are shown in Table 4.
Table 4 shows the accuracy comparison of ICA with the AR method and the AR method alone for 10-fold cross validation. The highest accuracy of each method was 99.78 and 99.29 percent, respectively, with k equal to 1. It is apparent that ICA with the AR method yields higher accuracy than the AR method alone. As the order of AR increased, the two methods tended to become consistently accurate. The higher the order, the higher the accuracy became. The overall results are depicted in Figure 7.

Filter
Brain waves are divided into five frequency bands corresponding to different brain wave activities. Therefore, all five groups were merged and filtered in the frequency range of 0.5 to 40 Hz. This process is shown in line 8.

Data Segmentation
The aim of this process is to find a faster personal identification process by using a short EEG data length. Each EEG signal from every channel of the 20 subjects is segmented into windows of n data points, namely 100, 200, 300, 128, and 256 data points (the window size). To improve identification accuracy, each window was placed to overlap its neighbor by 50% without differentiating patterns. Such a provision turned out to require fewer data segment lengths for the experiment. In other words, the data lengths of 128 and 256 data points provided similar coverage to what the combined 100, 200, and 300 data points could do. Furthermore, the overlapping provision increased the number of feature vectors available for classification. This process is shown between lines 14 and 15. When n = 128 data points with 50 percent overlap, there were 45 segments per channel. After processing with the AR model, the number of data points per segment was reduced according to the order of the AR model. All segments with the same order from the 4 channels were concatenated, resulting in 45 samples per subject. If n = 256 data points with the same 50 percent overlap, there would be 29 segments per channel, which would result in 29 samples per subject.
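The windowing step can be illustrated with a short pure-Python sketch. The function name `segment_signal` and the decision to keep only full windows (discarding any trailing samples that do not fill a window) are assumptions for this example; a different treatment of the signal edges would change the number of segments.

```python
def segment_signal(signal, window, overlap=0.5):
    """Split `signal` into full windows of `window` points with fractional `overlap`."""
    step = int(window * (1.0 - overlap))
    if step <= 0:
        raise ValueError("overlap must leave a positive step between windows")
    last_start = len(signal) - window
    return [signal[s:s + window] for s in range(0, last_start + 1, step)]


# One channel of 3,000 samples (15 seconds at 200 Hz); values are placeholders.
channel = list(range(3000))
segments = segment_signal(channel, window=128, overlap=0.5)
print(len(segments))     # 45 windows of 128 points
print(segments[1][0])    # second window starts 64 samples after the first
```

For a window of 128 points the step is 64 samples, so the window starts run from 0 to 2816, which gives the 45 segments per channel stated above.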

Feature Extraction
This process extracts the features using the reflection coefficients of the AR model with the Burg method (Maiorana et al., 2016). The process was conducted by increasing the order of the AR model from 1 to 60 for each segment with n data points. In the AR model, the number of reflection coefficients depends on the order of the AR model. This process is shown between lines 17 and 20. For example, AR(10) yielded ten reflection coefficients from each segment. The reflection coefficients of the segments with the same order from the 4 selected channels were concatenated to form a new sample, as shown in Figure 4.
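The concatenation of per-channel coefficients into one sample can be sketched as follows. The function name `build_samples`, the dictionary layout, and the placeholder coefficient values are assumptions for illustration; the coefficients themselves would come from an AR estimator such as the Burg method.

```python
CHANNELS = ("F4", "C4", "P4", "O2")


def build_samples(coeffs_by_channel, channels=CHANNELS):
    """Concatenate, segment by segment, the reflection coefficients of all channels.

    coeffs_by_channel maps a channel name to a list of segments, each segment
    being the list of reflection coefficients estimated for that segment.
    """
    counts = {len(coeffs_by_channel[c]) for c in channels}
    if len(counts) != 1:
        raise ValueError("all channels must have the same number of segments")
    n_segments = counts.pop()
    return [
        [k for c in channels for k in coeffs_by_channel[c][s]]
        for s in range(n_segments)
    ]


# Placeholder coefficients: 2 segments per channel, AR order 3.
toy = {
    c: [[10 * i + j for j in range(3)] for _ in range(2)]
    for i, c in enumerate(CHANNELS)
}
samples = build_samples(toy)
print(len(samples), len(samples[0]))  # 2 samples, each with 4 * 3 = 12 features
```

Under this layout, AR(10) on four channels gives samples of 40 features, and the number of samples per subject equals the number of segments per channel.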

Classification
In this study, k-NN was chosen as the main classifier, with 5-fold and 10-fold cross validation used to evaluate the classification. k was set to 1, 3, and 5.

The AR model order determines the number of features used in the classification: the higher the order, the more features are used. Having more features increased the complexity of the system but also tended to produce more accurate results. All in all, a low-order system that still maintained high accuracy was ideal. It can be seen in Figure 7 that ICA with the AR method yielded excellent classification accuracy at orders 11 and 44, with 99.32 and 99.78 percent, respectively. The first of these can be considered the practical choice for implementation because the AR order of each channel can be reduced to 11. The experiment revealed that there were artifacts in the EEG signals which perturbed the classification accuracy. By using the ICA method, we can remove the artifacts embedded in the EEG signals, thereby obtaining higher accuracy. Table 5 compares the highest accuracy of the three comparable methods. The highest accuracy of Tangkraingkij's results was 98.50 percent, based on ICA utilizing 1,000 data points in each of the four selected channels. Our AR-only method outperformed the prior method using only 256 data points with AR order 42, reducing from 256 to 42 features for each of the four selected channels. Furthermore, the proposed ICA with the AR method at AR order 11, reducing from 256 to 11 features for each of the four selected channels, yielded 99.32 percent accuracy. Higher accuracy, 99.78 percent, can be reached at AR order 44. The proposed ICA with the AR method has a lower standard deviation than the other methods at the same order. The lower the standard deviation, the more reliable the accuracy.

CONCLUSION AND FUTURE WORK
The study focused on applying two approaches, namely ICA with the AR method and the modified AR method, to four channels of EEG (F4, C4, P4, and O2) to identify 20 subjects in the resting state in order to improve person identification accuracy. The experimental results showed that ICA with the AR method yielded higher accuracy and lower standard deviation than the modified AR method alone with 5-fold and 10-fold cross validation. The experiment also revealed that both methods reached their highest accuracy with 256 data points, fewer than the 1,000 data points used by Tangkraingkij. Moreover, the number of features was reduced from 256 data points to 11 features per channel at 99.32% accuracy, while the highest accuracy (99.78%) used 44 features, still fewer than in Tangkraingkij's results. It was found that the accuracy of ICA with the AR method was higher than that of the modified AR method at low order. The results confirmed that ICA preprocessing combined with the AR model could clean EEG signals and enhance classification accuracy. Future work can explore 16-channel combinations to discover the significance of channel effects on subject classification accuracy.
Independent Component Analysis (ICA) is a technique of blind source separation (BSS) which divides multivariate signals into individual source signals or components that are statistically independent. ICA can be expressed by Eq. (1):

$$ o = A b \qquad (1) $$

where o denotes the mixture vector obtained by mixing the source signals b with the unknown mixing matrix A. This technique can be set up in EEG as follows: let o be the EEG recorded from the n electrodes on the scalp, i.e. $o = [o_1\ o_2\ o_3\ \dots\ o_n]^T$, and let $b = [b_1\ b_2\ b_3\ \dots\ b_n]^T$ be the source EEG signals. Both constitute an EEG vector of n signals at time t. Since the value of each $b_i$ and the mixing matrix A are unknown, the known value o is multiplied by the inverse of A, denoted by D, to obtain an estimate y of the source EEG signals b. The derived EEG signal becomes

$$ y = D o \qquad (2) $$
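The relationship between mixing and unmixing can be checked numerically with a two-signal toy case. In this sketch the mixing matrix A is assumed to be known so that D can be obtained by direct inversion; in practice A is unknown and an ICA algorithm has to estimate D from the recordings alone. The helper names `matvec` and `inverse_2x2` and all numeric values are hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def inverse_2x2(matrix):
    """Invert a 2 x 2 matrix given as two rows."""
    (a, b), (c, d) = matrix
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


A = [[1.0, 0.5], [0.3, 1.0]]   # assumed mixing matrix (unknown in real recordings)
b = [2.0, -1.0]                # source signals at one time instant
o = matvec(A, b)               # observed mixture, o = A b
D = inverse_2x2(A)             # unmixing matrix, D = A^-1
y = matvec(D, o)               # recovered sources, y = D o
print(o, y)
```

Because D is the exact inverse here, y reproduces b up to rounding. An estimated D generally recovers the sources only up to scaling and ordering.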

Figure 5. Results of the highest accuracy based on each varied length of data points per segment (a) 100 data points (b) 200 data points (c) 300 data points (d) 128 data points and (e) 256 data points.

Figure 4. Data segmentation of EEG signal and feature extraction based on AR model.

Figure 6. Comparison of classification accuracy between ICA with AR and only AR model of 128 data points, 4 channels, and 10-fold cross validation.
used three electrodes with sub-bands of EEG signals to classify 45 persons in the resting state with closed eyes. The AR stochastic and polynomial approaches were used for feature extraction in this study. Recognition rate was about 98.73 percent with 45 persons in the resting state. Maiorana

Table 3
Comparison

Table 4
Classification Accuracy of 20 Subjects with 256 Data Point Length Performed on the ICA with the AR Method and Only the AR Method for 10-fold Cross Validation with k-NN having k=1

Figure 7. Comparison of classification accuracy between ICA with AR and AR method with 256 data points, 4 channels, and 10-fold cross validation.

Table 5
Comparison of the Highest Accuracy for 4 Channels at 256 Data Points with Tangkraingkij's Results