What it is:

Deep Learning-based Representational Similarity Analysis for decoding numerosity representations from EEG.

What we did:

  • Trained compact CNN (EEGNeX) decoders on every numerosity pair (1-6) to measure brain representational distances.
  • Generated Representational Dissimilarity Matrices (RDMs) that reveal the PI-to-ANS transition and divisibility effects.
  • Controlled for visual confounds (pixel area) via RSA-style partial correlations and deterministic LOSO splits.
  • Nested, subject-aware CV (outer LOSO, inner GroupKFold) keeps Optuna tuning, early stopping, and refits leak-free; outer-fold models are ensembled (see the sketch after this list).
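
A minimal sketch of this validation scheme, assuming scikit-learn's LeaveOneGroupOut and GroupKFold; the array shapes, group IDs, and stubbed training step are illustrative placeholders, not the repository's code:

```python
# Nested, subject-aware cross-validation: LOSO outer loop, GroupKFold inner loop.
# Shapes, group IDs, and the training step are toy placeholders.
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut, GroupKFold

X = np.random.randn(240, 128, 256)        # trials x channels x timepoints (toy)
y = np.random.randint(0, 2, 240)          # labels for one numerosity pair
subjects = np.repeat(np.arange(24), 10)   # one group ID per participant

outer = LeaveOneGroupOut()      # each outer test fold = one held-out subject
inner = GroupKFold(n_splits=5)  # inner folds for tuning / early stopping

for train_idx, test_idx in outer.split(X, y, groups=subjects):
    X_tr, y_tr, g_tr = X[train_idx], y[train_idx], subjects[train_idx]
    # The held-out subject never enters the inner splits, so hyperparameter
    # search and early stopping cannot see outer-test data.
    for fit_idx, val_idx in inner.split(X_tr, y_tr, groups=g_tr):
        pass  # train a candidate model on fit_idx, validate on val_idx
    # Refit on all outer-train subjects, then evaluate once on test_idx.
```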

EEG Deep Learning for Numerical Cognition

A deep learning pipeline for decoding numerical representations from 128-channel EEG data. This project implements Deep Learning-based Representational Similarity Analysis (RSA) to map the neural geometry of the Parallel Individuation (PI) and Approximate Number System (ANS).

Scientific Motivation

The Two-Systems Hypothesis

Cognitive neuroscience proposes two distinct systems for processing quantities:

  1. Parallel Individuation (PI): Processes small numbers via rapid, precise “object files.”
  2. Approximate Number System (ANS): Processes larger numbers via magnitude estimation, where precision follows Weber’s law.

This project asks:

Can we use Convolutional Neural Networks (CNNs) to map the representational state space of numerosity?

Specifically:

  • The Boundary: Can we reveal patterns that suggest a clean divide between ‘small’ and ‘large’ numbers?
  • The Structure: Is there a unique representational geometry within the small-number range?
  • Grouping: Does the brain use grouping mechanisms to represent composite numbers?

Study Background

This project analyzes data from a numerical oddball task (N=24 adults, 6,480 trials). Participants viewed dot arrays while EEG was recorded. We apply Deep Learning RSA to decode fine-grained spatiotemporal patterns from raw, single-trial data.

What This Pipeline Does

This repository implements a pipeline that uses a compact CNN (EEGNeX) as a distance metric for Representational Similarity Analysis.

Representational Similarity Analysis (RSA):

  1. Train neural networks to distinguish every possible pair of numerosities.
  2. Use decoding accuracy as the measure of representational dissimilarity.
  3. Construct a Representational Dissimilarity Matrix (RDM) to visualize the neural geometry (see the sketch after this list).
  4. Control for visual confounds (pixel area) using partial correlation analysis.
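
As a sketch of steps 1–3, the loop below fills a 6×6 RDM from pairwise decoding accuracies; the decode_pair stub and its return values are placeholders for illustration, not the pipeline's actual decoder or results:

```python
# Illustrative sketch: turning pairwise decoding accuracies into an RDM.
import numpy as np

numerosities = [1, 2, 3, 4, 5, 6]
n = len(numerosities)
rdm = np.zeros((n, n))

# decode_pair(a, b) would train and evaluate an EEGNeX classifier on the pair
# (a, b) and return mean cross-validated accuracy; stubbed here with fake values.
def decode_pair(a, b):
    return 0.5 + 0.05 * abs(a - b)  # placeholder, NOT real results

for i in range(n):
    for j in range(i + 1, n):
        acc = decode_pair(numerosities[i], numerosities[j])
        # Higher decoding accuracy = more dissimilar neural patterns, so
        # accuracy serves directly as the representational distance.
        rdm[i, j] = rdm[j, i] = acc
```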

Findings

Uncovering the Neural State Space

By projecting the Deep Learning RDM into 2D with multidimensional scaling (MDS), we uncovered a non-linear architecture of number processing in the adult brain.
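
A minimal sketch of this projection step, assuming scikit-learn's MDS with a precomputed dissimilarity matrix; the toy RDM values below are placeholders, not the measured decoding accuracies:

```python
# Project a precomputed RDM into 2D with metric MDS.
import numpy as np
from sklearn.manifold import MDS

# Toy 6x6 dissimilarity matrix standing in for the decoding-accuracy RDM.
rdm = 0.5 + 0.05 * np.abs(np.subtract.outer(np.arange(6), np.arange(6))).astype(float)
np.fill_diagonal(rdm, 0.0)

mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(rdm)  # one (x, y) point per numerosity 1-6
```

With dissimilarity="precomputed", MDS arranges the six numerosities so that their 2D distances approximate the decoding-based distances.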

1. System Distinctness & Boundary:
The RDM geometry reveals the PI-to-ANS transition, with a representational boundary separating small and large numerosities.

2. Structure Within Object Tracking:
The subitizing range is not composed of uniformly distinct slots. Numerosities 2-4 form a similarity cluster.

3. The Divisibility Effect:
Decoders show high confusion between numerosities 5 and 6.

Robustness to Visual Confounds

We performed a partial-correlation control analysis, regressing out pixel-area dissimilarity, to ensure these results were not driven by low-level visual features.

Supported Workflows

The pipeline supports end-to-end RSA execution:

  • sa_binary: Trains pairwise classifiers across 10 random seeds using Leave-One-Subject-Out (LOSO) cross-validation.
  • sa_pixel_control: Performs post-hoc statistical analysis to regress out pixel confounds from the brain RDM (see the sketch after this list).
  • generate_rsa_tables: Produces publication-ready LaTeX tables of pairwise decoding statistics (Holm-corrected).
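
A hedged sketch of what the confound regression in sa_pixel_control could look like: a partial Spearman correlation between the brain RDM and a numerosity-model RDM, controlling for a pixel-area RDM. The vectors, helper function, and model RDM here are illustrative assumptions, not the repository's implementation:

```python
# Partial Spearman correlation between brain and model RDMs, controlling for
# a pixel-area RDM. Vectors are upper-triangle entries of each 6x6 RDM (toy data).
import numpy as np
from scipy.stats import rankdata, pearsonr

def residualize(v, covar):
    """Remove the linear effect of `covar` from `v` via least squares."""
    A = np.column_stack([covar, np.ones_like(covar)])
    beta, *_ = np.linalg.lstsq(A, v, rcond=None)
    return v - A @ beta

rng = np.random.default_rng(0)
brain = rng.random(15)   # 15 = 6*5/2 pairwise dissimilarities (toy values)
model = rng.random(15)   # hypothesized numerosity-distance RDM (assumed)
pixel = rng.random(15)   # pixel-area dissimilarity RDM (the confound)

# Rank-transform first so Pearson r on the residuals is a partial Spearman rho.
r_brain, r_model, r_pixel = rankdata(brain), rankdata(model), rankdata(pixel)
rho, p = pearsonr(residualize(r_brain, r_pixel), residualize(r_model, r_pixel))
```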

Features

  • Leak-Free Validation: Subject-aware splits ensure no participant data appears in both train and test.
  • Constitutional Rigor: All parameters must be explicitly specified via YAML.
  • Automated RSA: End-to-end scripts for training, RDM generation, and Multidimensional Scaling (MDS) visualization.
  • Explainable AI: Integrated Gradients highlight spatiotemporal feature importance (see the sketch after this list).
  • Statistical Rigor: Deterministic seeding, permutation testing, and partial correlation analysis for confounds.
  • Full Provenance: Every run logs model class, library versions, hardware, and seeds.
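
As an illustration of the Explainable AI step, the sketch below applies Captum's IntegratedGradients to a stand-in PyTorch model; the TinyEEGNet architecture, tensor shapes, and use of Captum are assumptions for demonstration, not the repository's EEGNeX code:

```python
# Integrated Gradients attribution over channel x time inputs (illustrative).
import torch
import torch.nn as nn
from captum.attr import IntegratedGradients

class TinyEEGNet(nn.Module):
    """Toy stand-in for EEGNeX: one temporal conv plus a linear head."""
    def __init__(self, n_channels=128, n_times=256, n_classes=2):
        super().__init__()
        self.conv = nn.Conv1d(n_channels, 8, kernel_size=7, padding=3)
        self.head = nn.Linear(8 * n_times, n_classes)

    def forward(self, x):  # x: (batch, channels, time)
        return self.head(torch.relu(self.conv(x)).flatten(1))

model = TinyEEGNet().eval()
epochs = torch.randn(4, 128, 256)  # toy single-trial EEG epochs

# Attribute each trial's predicted class back to its channel x time inputs.
ig = IntegratedGradients(model)
attributions = ig.attribute(epochs, target=model(epochs).argmax(dim=1))
```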
