METHOD

Improving eye-brain-computer interface performance by using EEG frequency components

Shishkin SL1, Kozyrskiy BL1,3, Trofimov AG1,3, Nuzhdin YO1, Fedorova AA1, Svirin EP1, Velichkovsky BM2
About authors

1 Department of Neurocognitive Technologies, Kurchatov Complex of NBICS Technologies,
National Research Centre Kurchatov Institute, Moscow, Russia

2 Kurchatov Complex of NBICS Technologies,
National Research Centre Kurchatov Institute, Moscow, Russia

3 Faculty of Cybernetics and Information Security,
National Research Nuclear University MEPhI, Moscow, Russia

Correspondence should be addressed: Sergey Shishkin
pl. Akademika Kurchatova, d. 1, Moscow, Russia, 123182; ur.liam@nikghsihsgres

About paper

Funding: this work was partially supported by the Russian Science Foundation, grant no. 14-28-00234 (acquisition and preprocessing of experimental data), and the Russian Foundation for Basic Research, grant no. 15-29-01344 (evaluation of wavelet features significance for classification).

Received: 2016-04-08 Accepted: 2016-04-15 Published online: 2017-01-05

Brain-computer interfaces (BCIs) are systems for operating computers and other devices connected to them by detecting brain activity patterns associated with control commands. They have been designed primarily to assist paralyzed patients [1, 2, 3]. However, the accuracy and operating speed of the vast majority of BCIs remain low. It is unclear whether BCIs can be used beyond the range of tasks where very simple commands suffice but it is important that these commands be given “straight from the brain” (for example, in post-stroke rehabilitation [4]). A satisfactory BCI spelling rate (50 characters per minute in healthy individuals) was achieved only in a recent study [5] that used rhythmic visual stimulation, and it is still unclear whether the use of such stimulation in a BCI is safe.

Interestingly, all non-invasive BCIs with high accuracy and high speed utilize the EEG response to visual stimuli at which the user directs his or her gaze. This means they can be used only if the patient has no serious vision impairment or eye movement disorder and retains the ability to voluntarily direct the gaze towards specific screen areas associated with control commands (i.e., to fixate the gaze on virtual “buttons”). When this is the case, however, computers and the devices connected to them can equally be controlled by detecting gaze direction with eye tracking (video-oculography).

Current methods of gaze-based control demonstrate relatively good accuracy, speed and usability when used for text entry [6]. However, attempts to apply them to a wider range of tasks are hampered by the so-called “Midas touch” problem [7]. Just as King Midas of the ancient Greek myth turned everything into gold by touching it, technical devices are non-selective in translating gaze fixations or eye movements into commands: the user issues commands even without intending to. This is because eye movements are a crucial component of visual function and are largely spontaneous, easily escaping conscious control even when attention is focused on them. Current solutions to this problem either make the control process very slow and tiring or can be used only for a limited range of tasks.

As early as 1996, it was proposed to solve the Midas touch problem and create a high-performance universal interface by combining “eye-mouse” control with a BCI [8]. For many years, however, the combination of these two technologies [9] remained rather mechanical and did not yield systems with fast response and good ergonomics. An innovative solution was suggested by Thorsten Zander’s group, who turned to the idea of a natural combination of eye tracking and BCI [8] within the framework of a new trend, the development of so-called “passive BCIs”. This name was given to BCIs that respond to patterns of brain activity unrelated to deliberate efforts to issue a command through the BCI [10]. Zander and his colleagues showed that eye fixations used for control (“control” fixations) can be differentiated from spontaneous (visual) fixations using the electroencephalogram (EEG) recorded during the fixations, even when the control-related EEG markers were not evoked intentionally (the subjects were given no additional tasks and no stimuli were presented at the “control” position) [11]. However, in their study, control could be implemented only by a long (1,000 ms) gaze fixation on a single screen target.

Our group has developed a method for an eye-brain-computer interface (EBCI) that allows EEG-based classification of shorter fixations, with a duration of 500 ms. In our experiment, subjects played the computer game Lines using their gaze only. Each move was made by fixating the gaze on one of 50 elements on the board. The classifier was trained to differentiate between the EEG recorded during those fixations and the EEG recorded during fixations on the same elements but with control switched off, i.e., during presumably spontaneous fixations [12; Shishkin et al., in prep.]. Owing to the shorter fixation duration, subjects perceived control as natural and comfortable. The number and location of control-sensitive visual elements in our method are limited only by the capabilities of the eye tracker. However, the fixation-related amplitude features of the EEG (used in our earlier work) did not provide sufficient control detection accuracy for practical application of the technology.

In this study, we examine whether the accuracy of the EBCI classifier that automatically differentiates control gaze fixations from spontaneous ones can be improved by using features of oscillatory EEG components in addition to EEG amplitude features. Because only short EEG intervals are available, within which both amplitude and frequency components may vary over time, and because time-frequency data are high-dimensional and differ substantially from amplitude data in other respects, a special scheme had to be developed for extracting quantitative parameters of the EEG components recorded during gaze fixations.

METHODS

The experiment

We used EEG recordings obtained in an earlier experimental study. Its results, along with a detailed description of the experimental methods, will be presented in a separate article [Shishkin et al., in prep.].

Our study was conducted in compliance with the guidelines of the Declaration of Helsinki. It enrolled 8 healthy individuals (7 male and 1 female) aged 21–48 (mean age 29), all of whom gave informed consent. Gaze was recorded with an EyeLink 1000 Plus eye tracker (SR Research, Canada); fixations were detected online using a dispersion (variance) criterion. Synchronously, EEG from 19 electrodes (Fz, F3, F4, Cz, C3, C4, Pz, P1, P2, P3, P4, POz, PO3, PO4, PO7, PO8, Oz, O1, O2) and the electrooculogram (EOG) were recorded using an actiCHamp system (Brain Products, Germany). The EOG was used to monitor EEG artifacts. Gaze direction, EEG and EOG were all sampled at 500 Hz.
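For illustration, a dispersion-based criterion of the kind used for online fixation detection can be sketched as follows. This is a minimal example from the I-DT family of algorithms; the dispersion threshold and minimum duration are hypothetical, not the values used by the EyeLink software in the experiment.

```python
import numpy as np

def detect_fixations(gaze_xy, fs=500, max_disp_px=40, min_dur_ms=100):
    """gaze_xy: (n_samples, 2) screen coordinates; returns (start, end) pairs."""
    min_len = int(min_dur_ms * fs / 1000)
    fixations, start = [], 0
    while start + min_len <= len(gaze_xy):
        end = start + min_len
        # dispersion = (max - min) summed over x and y within the window
        if np.ptp(gaze_xy[start:end], axis=0).sum() <= max_disp_px:
            # grow the window while the gaze stays within the dispersion limit
            while end < len(gaze_xy) and \
                    np.ptp(gaze_xy[start:end + 1], axis=0).sum() <= max_disp_px:
                end += 1
            fixations.append((start, end))
            start = end
        else:
            start += 1
    return fixations
```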

The gaze-based control algorithms and the task the subjects performed were exactly the same as described in our preliminary study [12]; here, only the most important details are given. The subjects played Lines, a computer game modified so that every move was performed by a sequence of three fixations, each exceeding a 500 ms duration threshold. Each sequence started with a fixation on a particular screen area, where a special “control on” indicator appeared once the threshold had been reached. EEG recorded during these fixations constituted the first class of data (control fixations). The other class (non-control fixations) comprised EEG recorded during fixations that also exceeded the threshold but, according to the game rules, did not result in a move. Fixation-based game control, EEG/EOG synchronization and the recording of gaze fixation data were performed using custom in-house software.

On average, 155 control fixations (range: 120–184) and 159 non-control fixations (range: 114–208) were recorded per subject.

Feature extraction

To extract EEG wavelet features, we chose the interval 50–500 ms after fixation onset, because the preceding interval contained artifacts related to gaze shifts, and the subsequent interval could not be used for detecting the intention to issue a command in the online mode. The analyzed interval contained almost no artifacts, so we did not apply any procedures for their correction or removal. In earlier work [12; Shishkin et al., in prep.] we showed that, in our EBCI paradigm, a considerable difference in EEG amplitudes between control and non-control fixations was typical only for the second half of the fixation interval. Therefore, in the current study, the interval 200–500 ms after fixation onset was used to obtain amplitude features.

Amplitude features were obtained by averaging amplitude values in overlapping 50 ms windows, separately for each EEG channel. To reduce the influence of slow oscillations and the direct-current component, the baseline was corrected by subtracting from these averages the mean value over the interval 200–300 ms after fixation onset. The resulting “raw” amplitude features formed a feature vector characterizing the trial corresponding to a single fixation.
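A minimal sketch of this computation is shown below. The array shapes and epoch layout are assumptions, and consecutive rather than overlapping windows are used for brevity.

```python
import numpy as np

FS = 500                       # sampling rate, Hz
WIN = int(0.05 * FS)           # 50 ms window = 25 samples

def amplitude_features(epoch):
    """epoch: (n_channels, n_samples) EEG, 200-500 ms after fixation onset.

    Averages the signal in consecutive 50 ms windows (the study used
    overlapping windows) and subtracts the 200-300 ms mean (the first
    100 ms of this epoch) as baseline.
    """
    n_ch, n_samp = epoch.shape
    n_win = n_samp // WIN
    means = epoch[:, :n_win * WIN].reshape(n_ch, n_win, WIN).mean(axis=2)
    baseline = epoch[:, :int(0.1 * FS)].mean(axis=1, keepdims=True)
    return (means - baseline).ravel()
```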

Wavelet features were obtained using the Morlet wavelet transform, with the scale range corresponding to the frequency range 5–30 Hz. The higher the frequency corresponding to a scale, the more wavelet coefficients were used to describe the trial. To reduce noise from irrelevant features, only the 30% of time-frequency wavelet features that differed most between spontaneous and control fixations (i.e., had the highest coefficient of determination, R2) were retained.
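The following sketch illustrates both steps under stated assumptions: a hand-rolled complex Morlet decomposition over a 1 Hz frequency grid (the actual scale grid and wavelet width were not specified here) and selection of the 30% of feature columns with the highest squared correlation with the class label.

```python
import numpy as np

FS = 500
FREQS = np.arange(5.0, 31.0)   # 5-30 Hz; the 1 Hz grid is an assumption

def morlet_power(signal, fs=FS, w0=5.0):
    """|CWT| of one EEG channel with complex Morlet wavelets.

    Returns an (n_freqs, n_samples) array of coefficient magnitudes.
    """
    out = np.empty((len(FREQS), len(signal)))
    for i, f in enumerate(FREQS):
        sigma_t = w0 / (2 * np.pi * f)          # temporal SD of the Gaussian
        # truncate support so the wavelet never exceeds the epoch length
        half = min(int(3 * sigma_t * fs), (len(signal) - 1) // 2)
        t = np.arange(-half, half + 1) / fs
        wavelet = np.exp(2j * np.pi * f * t) * np.exp(-t**2 / (2 * sigma_t**2))
        wavelet /= np.sqrt(np.sum(np.abs(wavelet)**2))   # unit energy
        out[i] = np.abs(np.convolve(signal, wavelet, mode="same"))
    return out

def select_by_r2(X, y, keep=0.30):
    """Indices of the `keep` fraction of columns of X with the highest
    squared correlation (R^2) with the binary labels y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    r = (Xc * yc[:, None]).sum(axis=0) / np.sqrt(
        (Xc**2).sum(axis=0) * (yc**2).sum() + 1e-12)
    return np.argsort(r**2)[::-1][:int(keep * X.shape[1])]
```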

The selected features were then processed with Principal Component Analysis (PCA), applied separately to the amplitude and the wavelet features. For each feature type, the 80 components with the highest variance were selected and constituted the new feature sets. Before and after PCA, z-score normalization was applied either to all values of each feature (across all trials) or to all features within a single trial (separately for amplitude and wavelet features). Within-trial normalization was regarded as a way of adapting to the local feature level, which could vary gradually over time.
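In scikit-learn terms, this stage could look like the sketch below. The split into fit and transform steps reflects the requirement, described under cross-validation below, that all statistics be estimated on training data only; the function names are illustrative.

```python
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def fit_reduction(X_train, n_components=80):
    """Fit per-feature z-scoring and PCA on training data only."""
    scaler = StandardScaler().fit(X_train)
    pca = PCA(n_components=n_components).fit(scaler.transform(X_train))
    return scaler, pca

def apply_reduction(X, scaler, pca, trialwise=False):
    """Project new trials; optionally re-normalize within each trial."""
    Z = pca.transform(scaler.transform(X))
    if trialwise:
        Z = (Z - Z.mean(axis=1, keepdims=True)) / Z.std(axis=1, keepdims=True)
    return Z
```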

EEG-based classification of control and non-control fixations

For classification, linear discriminant analysis (LDA) with shrinkage regularization was used. It allows effective training on small training sets (such as the one available in this study) even when feature dimensionality is relatively high, and it has proved highly effective in BCIs based on event-related potentials [13, 14].
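Shrinkage LDA is readily available in scikit-learn; the sketch below shows the standard way to instantiate it, not necessarily the exact implementation used in the study.

```python
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# 'lsqr' + shrinkage='auto' gives LDA with a Ledoit-Wolf-regularized
# covariance estimate, suitable for small, high-dimensional samples.
clf = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto")
# clf.fit(Z_train, y_train); scores = clf.decision_function(Z_test)
```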

Classification quality was assessed using 5-fold cross-validation. Classifier training, feature selection, calculation of the means and standard deviations used for feature normalization (in case it was applied trialwise), and dimensionality reduction were all carried out on the training set only. The derived feature selection rule, the means and standard deviations for the corresponding value sets, the weight matrix of the selected components, and the weights of the trained classifier were then applied to the remaining data, treated as the test set. This arrangement of the cross-validation emulated the real situation in which a classifier is used online in a BCI.
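One compact way to reproduce such an arrangement is to wrap all trainable steps in a scikit-learn Pipeline, so that cross-validation fits the scaler, PCA and classifier on each training fold only and applies them unchanged to the test fold. X and y below stand for the feature matrix and class labels from the previous steps; this is a sketch, not the authors' original code.

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import StratifiedKFold, cross_val_score

pipe = Pipeline([
    ("zscore", StandardScaler()),   # fitted on the training fold only
    ("pca", PCA(n_components=80)),
    ("lda", LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto")),
])
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
fold_auc = cross_val_score(pipe, X, y, cv=cv, scoring="roc_auc")
print(fold_auc.mean(), fold_auc.std())
```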

As the classification quality metric, we used the AUC (area under the receiver operating characteristic, ROC, curve), an integral performance index widely used in similar studies. It shows how far classification results depart from chance across the range of classifier threshold values, which can be chosen to trade off different types of errors against each other depending on the specific purpose of the classifier. If classification is at chance level, the AUC is close to 0.5; a classifier that makes no errors has an AUC of 1. To compare AUC values across feature sets, multivariate analysis of variance (MANOVA) and the Bonferroni post hoc test were applied using Statistica 7.0 software (StatSoft, USA).

RESULTS

With all feature extraction methods, individual classification accuracy (AUC) values were above 0.5 and the group mean was at least 0.66; however, mean AUC values differed considerably between methods (fig. 1).

A three-way MANOVA (see table below; all three factors with repeated measures) applied to individual AUC values showed that classification accuracy depended on the feature set factor (λ = 0.06, F(2,6) = 49, p = 0.0002), while the effects of the other factors and of all their interactions were not statistically significant. The Bonferroni post hoc test showed a statistically significant difference between the amplitude and the amplitude-wavelet feature sets (p = 0.006); no statistically significant differences were found between the amplitude and wavelet sets (p = 0.34) or between the wavelet and amplitude-wavelet sets (p = 0.16). The set consisting of amplitude features only yielded the lowest classification accuracy; the best results were obtained with the combined set (amplitude and wavelet features together). With the combined EEG feature set, the group mean AUC increased by 0.05–0.08 (depending on the normalization method) relative to the amplitude set. The group mean AUC was 0.75 ± 0.04 (M ± SD) with features normalized both before and after PCA, and 0.75 ± 0.06 with features normalized before PCA and within trials after PCA.

Fig. 2 shows individual results for the feature extraction method that yielded the highest group mean AUC. The individual curves show the rates of the different error types that would be observed at different classification thresholds. Of particular importance is the EBCI classifier's sensitivity, i.e., the proportion of correctly identified control fixations, at a low false positive rate. As fig. 2 shows, with the false positive rate fixed at 0.1 (achievable by selecting the corresponding classifier threshold on a separate dataset), only one subject demonstrated sensitivity below 0.2, one subject had sensitivity above 0.5, and the remaining subjects fell between these two values.
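Reading the sensitivity at a fixed false positive rate off the ROC curve can be done as in the following sketch, where y_true and y_score are assumed to be the test-fold labels and the LDA decision values.

```python
from sklearn.metrics import roc_curve

def sensitivity_at_fpr(y_true, y_score, target_fpr=0.1):
    """Largest true positive rate achievable without exceeding target_fpr."""
    fpr, tpr, _ = roc_curve(y_true, y_score)
    return tpr[fpr <= target_fpr].max()
```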

DISCUSSION

Improving classifier performance is the key factor in developing an EBCI that can detect relatively short control gaze fixations from the EEG recorded during them, since in this BCI paradigm only single signal intervals lasting a few hundred milliseconds are available for analysis.

Classification quality at a low false alarm rate deserves separate discussion. In an EBCI, it is easy to provide a safety net for the case when a control fixation is not identified: if the interface does not respond once the 500 ms threshold has been reached, the user can simply continue fixating, and the system will respond after an additional threshold (for example, 1,000 ms) has been reached, even without a response from the EEG classifier. We may suppose that, with an EBCI equipped with such a safety net, the brain of a user interested in speeding up interface activation could learn to produce the EEG pattern that accompanies control fixations and thereby ensure considerably more frequent classifier responses. However, this requires a minimum entry level of control. As fig. 2 demonstrates, the signal preprocessing and feature extraction scheme developed in this work would already enable some subjects to evoke a faster interface response in half of their control fixations at a relatively low false alarm rate (0.1).
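The decision logic of such a safety net can be summarized in a few lines. This is an illustrative sketch: only the 500 ms and 1,000 ms thresholds are taken from the text; everything else is hypothetical.

```python
EEG_THRESHOLD_MS = 500   # dwell time at which the EEG classifier is consulted
FALLBACK_MS = 1000       # example fallback dwell time from the text

def should_trigger(fixation_ms, eeg_says_control):
    """True when the interface should respond to the current fixation."""
    if fixation_ms >= EEG_THRESHOLD_MS and eeg_says_control:
        return True                        # fast, EEG-confirmed activation
    return fixation_ms >= FALLBACK_MS      # dwell-time safety net
```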

While we can already speculate about the nature of the amplitude features usable for classification in our EBCI, assuming they may reflect a negative potential associated with the expectation of feedback from the interface [Shishkin et al., in prep.], the nature of the wavelet features still requires elucidation. It should be noted that the patterns of EEG frequency components typical of various brain states are highly individual, and their specifics can be only partially observed at the group level. They can nevertheless be successfully classified if the classifier is trained on individual data, in particular in the BCI paradigm [15, 16, 17, 18]. Still, the high dimensionality of such data demands an especially careful approach at all stages of analysis, with larger numbers of subjects involved in such studies whenever possible. We have made only the first steps in this direction, but the similar results obtained with different data normalization methods may indicate that the proposed scheme for data preprocessing and informative feature extraction is relatively robust and holds good promise for EBCI development.

CONCLUSIONS

In this work we made the first attempt to use a time-frequency EEG representation, i.e., a representation of EEG frequency components as a function of time from fixation onset, in the EBCI paradigm. These features allowed us to achieve classification accuracy at least as good as that based on the amplitude features used in our previous work. Moreover, combining both feature sets improved classification accuracy. We believe that further improvement of the computational methods will bring us close to practical applications of eye-brain-computer interfaces, which combine the main advantages of conventional BCIs and of control systems based on eye tracking.
