BCI.jl

This module implements machine learning for EEG data and, particularly, brain-computer interface data, using Riemannian geometry. Data encoding is achieved by estimating a form of covariance matrix for the EEG epochs or BCI trials.

For tutorials, see Machine Learning.

Resources for Covariance Matrices

PackageDescription
PosDefManifold.jllow-level manipulation of covariance matrices using Riemannian geometry
Diagonalizations.jlmethods based on the diagonalization of one or more covariance matrices
FourierAnalysis.jlFourier cross-spectra and coherence matrices (special forms of covariance matrices)

Methods

FunctionDescription
Eegle.BCI.covmatmany covariance matrix estimators (2 methods)
Eegle.BCI.encodeencode all trials in a given EEG recording using Riemannian geometry
Eegle.BCI.crvalperform a cross-validation of a Riemannian machine learning model on a BCI session

📖

Eegle.BCI.covmatFunction
(1) function covmat(X::AbstractMatrix{T}, τ::Int = 0;
        covtype = LShrLW,
        prototype::Union{AbstractMatrix, Nothing} = nothing,
        standardize::Bool = false,
        lags::Int = 0,
        useBLAS::Bool = true,
        threaded::Bool = true,
        reg::Symbol = :rmt,
        tol::Real = real(T)(1e-6),
        maxiter::Int = 200,
        verbose::Bool = false) 
    where T<:Union{Real, Complex}

(2)  function covmat(𝐗::AbstractVector{<:AbstractArray{T}};
        < same arguments as method (1) > ...)

Covariance matrix estimation(s) of:

  • (1): a single data matrix (e.g., a trial) X
  • (2): a vector of $K$ data matrices 𝐗.

Arguments

  • (1) X: $N×T$ real data matrix, where $N$ and $T$ denotes the number of samples and channels, respectively
  • (2) 𝐗: a vector holding $k$ such matrices.

** Arguments ** If τ is a positive number, the non-centered lagged (auto)covariance is computed. In this case only the prototype, standardize and threaded keyword arguments here below apply. The default τ is 0.

Optional Keyword Arguments

  • covtype: covariance estimator (default = LShrLW):
  • prototype: optional matrix to be stacked to the data matrix (or matrices) to form a super-trial — see Appendix I in (Congedo et al., 2017). Default = nothing
  • standardize: if true, standardize the data matrix(ces) (global mean = 0 and global sd = 1 for each matrix) before estimating the covariance. Default = false
  • lags : if > 0, embed lags lags by calling the Eegle.Preprocessing.embedLags function. No lags are embedded by default.
  • useBLAS: optimize the SCM covariance computations using BLAS. Default = true
  • threaded: enable multi-threading across the $k$ matrices. For (2) only. Default = true
  • Only for M-estimators:
    • tol: the tolerance for the stopping criterion of the iterative algorithm (default = 1e-6)
    • maxiter: the maximum number of iterations allowed (default = 200)
    • verbose: if true, information about convergence is printed in the REPL (default = false).
  • Only for the normalized regularized Tyler's M-Estimator. Default = :rmt:
    • reg: if it is :rmt, the random matrix theory shrinkage is used, otherwise the Ledoit and Wolf linear shrinkage is used.

Return

If τ > 0:

  • (1): The covariance matrix estimation as a generic Julia Matrix object.
  • (2): A vector of $K$ covariance matrix estimations as a vector of generic Julia Matrix objects.

If τ = 0 (default):

  • (1): The covariance matrix estimation as a Julia Hermitian matrix
  • (2): A vector of $K$ covariance matrix estimations as an HermitianVector type.

Examples

using Eegle # or using Eegle.BCI

# Method (1)

C = covmat(randn(128, 19)) # linear shrinkage estimator by default

C = covmat(randn(128, 19); covtype=SCM)

C = covmat(randn(128, 19); covtype=:Tyler)

C = covmat(randn(128, 19), 1) # lagged autocovariance matrix

# not to be confused with the covariance matrix of lag-embedded data:
C = covmat(randn(128, 19); covtype=SCM, lags = 1) 

# Suppose you want the SCM of the forward first-order differences.
# We can then use the standard `diff` Julia function:
C = covmat(diff(randn(128, 19); dims=1); covtype=SCM) 

# et cetera.

# Method (2)

𝐗 = [randn(128, 19) for k=1:10]

𝐂 = covmat(𝐗)

C = covmat(𝐗; covtype=SCM)

C = covmat(𝐗; covtype=:Tyler)

## using an example file in NY format provided with Eegle:
## read a P300 BCI session, extract the trials and
## compute the covariance matrices of all using 
## standard settings for P300 BCI data.
C = covmat(readNY(EXAMPLE_P300_1; bandPass=(1, 24), upperLimit=1.2).trials)
source
Eegle.BCI.encodeFunction
    function encode(o::EEG;
        paradigm::Symbol = o.paradigm,
        covtype = LShrLW,
        targetLabel::String = "target",
        overlapping::Bool = false,
        weights = :a,
        pcadim::Int = 8,
        standardize::Bool = false,
        lags::Int = 0,
        tikh :: Union{Real, Int} = 0,
        useBLAS :: Bool = true,
        threaded = true,
        reg::Symbol = :rmt,
        tol::Real = 1e-6,
        maxiter::Int = 200,
        verbose::Bool = false)

Encode all trials in an EEG data structure as covariance matrices according to a given BCI paradigm. This is used in Riemannian geometry machine learning. The supported BCI paradigms are Motor Imagery (MI), Event-Related Potentials (ERP) and P300. For details, see Appendix I in (Congedo et al., 2017).

Arguments

  • o: an instance of the EEG data structure containing trials and metadata.

Optional Keyword Arguments

  • paradigm: BCI paradigm, either :ERP, :P300, or :MI. By default it is used the paradigm stored in o.
    • for :ERP, prototypes for all classes are stacked and covariance is computed on super-trials
    • for :P300, only the target class prototype is stacked
    • for :MI, no prototype is used; covariance is computed on the trial as it is.
  • covtype, standardize, lags, useBLAS, reg, tol, maxiter and verbose — see Eegle.BCI.covmat, to which they are passed.
  • targetLabel: label of the target class (for P300 paradigm only). By default is "target", following the conventions of the FII BCI corpus.
  • overlapping: for prototype mean ERP estimations (ERP/P300 only). Default = false:
    • if true, use multivariate regression
    • if false, use the arithmetic average — see mean.
  • weights: weights for prototype mean ERP estimations (ERP/P300 only). Default = :a — see mean
  • pcadim: number of PCA components of the prototype. They replace the prototype (ERP/P300 paradigms only, default = 0, which does not apply PCA)
  • standardize: standardize trials and prototype (global mean 0 and sd 1) before covariance estimation (default: false)
  • tikh: Tikhonov regularization parameter (0, the default, does not apply regularization). It is applied after covariance estimation
  • threaded: enable multi-threaded covariance estimations across trials (default: true).

Return A vector of $k$ covariance matrix estimations as a HermitianVector type.

Examples

using Eegle # or using Eegle.BCI

# EXAMPLE_P300_1 is an example file provided by Eegle

o = readNY(EXAMPLE_P300_1; bandPass=(1, 24), upperLimit=1.2)

C = encode(o)
source
PosDefManifoldML.crvalFunction
    function crval( filename    :: AbstractString, 
                    model       :: MLmodel = MDM(Fisher);
            # Arguments passed to both encode and crval
            verbose     :: Bool = true,
            threaded    :: Bool = true,
            # Arguments passed to readNY
            toFloat64   :: Bool = true,
            bandStop    :: Tuple = (),
            bandPass    :: Tuple = (),
            bsDesign    :: DSP.ZeroPoleGain = Butterworth(8),
            bpDesign    :: DSP.ZeroPoleGain = Butterworth(4),
            rate        :: Union{Real, Rational, Int} = 1,
            upperLimit  :: Union{Real, Int} = 0,
            classes     :: Union{Bool, Vector{String}} = true, 
            stdClass    :: Bool = true, 
            # Arguments passed to encode
            covtype = LShrLW,
            targetLabel :: String = "target",
            overlapping :: Bool = false,
            weights = :a,
            pcadim      :: Int = 8,
            standardize :: Bool = false,
            lags        :: Int = 0,
            tikh        :: Union{Real, Int} = 0,
            useBLAS     :: Bool = true,
            reg         :: Symbol = :rmt,
            tol         :: Real = 1e-6,
            maxiter     :: Int = 200,
            # Arguments passed to crval
            pipeline    :: Union{Pipeline, Nothing} = nothing,
            nFolds      :: Int = 8,
            seed        :: Int = 0,
            scoring     :: Symbol = :b,
            hypTest     :: Union{Symbol, Nothing} = :Bayle,
            outModels   :: Bool = false,
            fitArgs...)

Perform cross-validations of a BCI session stored in NY format.

This function runs in sequence the following three functions:

  1. Eegle.InOut.readNY
  2. encode (in this module)
  3. crval (in PosDefManifoldML.jl)

Arguments

  • filename: the complete path of either the .npz or the .yml file of the session to be used
  • model : any classifier of type MLmodel. Default: the default MDM classifier.
BCI paradigm

The BCI paradigm is assumed to be the one stored in the metadata of file filename (either :ERP, :P300, or :MI)

Optional Keyword Arguments

A reminder only is given here. For details, see the function each kwarg is passed to.

  • The following kwargs are passed to Eegle.InOut.readNY for reading and pre-processing the data:
    • toFloat64: conversion of data to Float64
    • bandStop, bandPass, bsDesign, bpDesign: filter settings
    • rate: resampling
    • upperLimit: artifact rejection
    • classes: classes of the trials to be read from the file. Default: all available classes
    • stdClass: standardization of class labels according to Eegle's conventions
  • the following kwargs are passed to encode to encode the trials as covariance matrices:
    • covtype: type of covariance matrix estimation
    • targetLabel: label of the target class (for the P300 paradigm only). Default: target
    • overlapping: type of mean target ERP estimator used as a prototype (ERP and P300 only)
    • weights: adaptive weighted mean target ERP estimation (ERP and P300 only)
    • pcadim: dimensionality reduction of the prototype by PCA (ERP and P300 only)
    • standardize: standardization the trials before estimating the covariance matrices
    • lags: lags embedding
    • tikh: Tikhonov regularization of the covariance matrices
    • useBLAS: use BLAS for computing the SCM covariance estimator
    • reg: , tol, maxiter, verbose: options for covariance M-Estimators.
  • the following kwargs are passed to crval:
    • pipeline: pre-conditioners for hastening the computations
    • nFolds: number of cross-validation stratified folds
    • scoring: performance index to be computed
    • seed: generation of the folds
    • hypTest: statistical test of the performance against the chance level
    • outModels: modulation of the output
    • fitArgs...: additional arguments handed to the fit function of the model.
  • the following are passed to both encode and crval:
    • verbose: print informations about some computations
    • threaded: run the functions in multithreaded mode (in crval it is named with unicode character ⏩).
`fitArgs...`

Function crval hands any additional kwargs to the fit function of the model. See crval for details. If you pass an invalid arguments, an error will be raised.

`classes`

It is good practice to inform the classes argument when comparing the performance obtained running this function across several different sessions. Only this ensures that the performance estimated on the same classes is considered in all sessions.

Return

If outModels is false (default), a CVres structure with the results of the cross-validation, otherwise a 2-tuple holding this CVres structure and a vector of the nFolds models fitted for each fold (of type MLmodel).

Examples

using Eegle

# Using the example files provided by Eegle.
# Averege accuracy is reported in square brackets

# a) P300 data: standard pipeline (MDM classifier)
crval(EXAMPLE_P300_1; bandPass = (1, 24)) # [0.711]

# a) with a random shuffling to generate folds
crval(EXAMPLE_P300_1; bandPass = (1, 24), seed = 1234) # [0.685]

## b) with artifact rejection
args = (bandPass = (1, 24), upperLimit = 1)
crval(EXAMPLE_P300_1; args...) # [0.723]

## b) using a 5-fold cross-validation (instead of 8-fold default)
crval(EXAMPLE_P300_1, MDM(); nFolds=5, args...) # [0.702]

## b) using the Log-Euclidean metric for the MDM classifier
crval(EXAMPLE_P300_1, MDM(logEuclidean); args...) # [0.663]

## b) with artifact rejection and pre-conditioning
pipeline = @→ Recenter(; eVar=0.999) → Compress → Shrink(Fisher)
crval(EXAMPLE_P300_1, MDM(Euclidean); pipeline, args...) # [0.719]

## b) using a Ridge logistic regression model in the tangent space (TS)
crval(EXAMPLE_P300_1, ENLR(; alpha = 0); args...) # [non-deterministic]

## b) using a LASSO logistic regression model in the TS
crval(EXAMPLE_P300_1, ENLR(); args...) # [non-deterministic]

## b) with Recentering and projecting the data onto the TS at the
## identity matrix (this avoids the computation of the barycenter)
crval(EXAMPLE_P300_1, ENLR(); meanISR=I, args...) # [non-deterministic]

# ====================================

# c) Motor Imagery data: standard pipeline (MDM classifier) with 
# artifact rejection, using classes "feet" and "right_hand"
args = (bandPass = (8, 32), upperLimit = 1, classes=["feet", "right_hand"])
crval(EXAMPLE_MI_1; args...) # [0.74]

## c) with a very fast pre-conditioning
pipeline = @→ Recenter → Equalize
crval(EXAMPLE_MI_1, MDM(Euclidean); pipeline, args...) # [0.729]

...
source