BCI.jl
This module implements machine learning for EEG data and, particularly, brain-computer interface data, using Riemannian geometry. Data encoding is achieved by estimating a form of covariance matrix for the EEG epochs or BCI trials.
For tutorials, see Machine Learning.
Resources for Covariance Matrices
| Package | Description |
|---|---|
| PosDefManifold.jl | low-level manipulation of covariance matrices using Riemannian geometry |
| Diagonalizations.jl | methods based on the diagonalization of one or more covariance matrices |
| FourierAnalysis.jl | Fourier cross-spectra and coherence matrices (special forms of covariance matrices) |
Methods
| Function | Description |
|---|---|
Eegle.BCI.covmat | many covariance matrix estimators (2 methods) |
Eegle.BCI.encode | encode all trials in a given EEG recording using Riemannian geometry |
Eegle.BCI.crval | perform a cross-validation of a Riemannian machine learning model on a BCI session |
📖
Eegle.BCI.covmat — Function
(1) function covmat(X::AbstractMatrix{T}, τ::Int = 0;
covtype = LShrLW,
prototype::Union{AbstractMatrix, Nothing} = nothing,
standardize::Bool = false,
lags::Int = 0,
useBLAS::Bool = true,
threaded::Bool = true,
reg::Symbol = :rmt,
tol::Real = real(T)(1e-6),
maxiter::Int = 200,
verbose::Bool = false)
where T<:Union{Real, Complex}
(2) function covmat(𝐗::AbstractVector{<:AbstractArray{T}};
< same arguments as method (1) > ...)Covariance matrix estimation(s) of:
- (1): a single data matrix (e.g., a trial)
X - (2): a vector of $K$ data matrices
𝐗.
Arguments
- (1)
X: $N×T$ real data matrix, where $N$ and $T$ denotes the number of samples and channels, respectively - (2)
𝐗: a vector holding $k$ such matrices.
** Arguments ** If τ is a positive number, the non-centered lagged (auto)covariance is computed. In this case only the prototype, standardize and threaded keyword arguments here below apply. The default τ is 0.
Optional Keyword Arguments
covtype: covariance estimator (default =LShrLW):SCM: non-centered sample covariance matrix (maximum likelihood)LShrLW: linear shrinkage estimator of (Ledoit and Wolf, 2004)NShrLW: non-linear shrinkage estimator of (Ledoit and Wolf, 2020):Tyler: Tyler's M-estimator (Tyler, 1987):nrTyler: normalized regularized Tyler's M-Estimator (Zhang and Wiesel, 2016)- or any estimator from the CovarianceEstimation.jl package.
prototype: optional matrix to be stacked to the data matrix (or matrices) to form a super-trial — see Appendix I in (Congedo et al., 2017). Default =nothingstandardize: if true, standardize the data matrix(ces) (global mean = 0 and global sd = 1 for each matrix) before estimating the covariance. Default =falselags: if > 0, embedlagslags by calling theEegle.Preprocessing.embedLagsfunction. No lags are embedded by default.useBLAS: optimize the SCM covariance computations using BLAS. Default =truethreaded: enable multi-threading across the $k$ matrices. For (2) only. Default =true- Only for M-estimators:
tol: the tolerance for the stopping criterion of the iterative algorithm (default =1e-6)maxiter: the maximum number of iterations allowed (default =200)verbose: if true, information about convergence is printed in the REPL (default =false).
- Only for the normalized regularized Tyler's M-Estimator. Default =
:rmt:reg: if it is:rmt, the random matrix theory shrinkage is used, otherwise the Ledoit and Wolf linear shrinkage is used.
Return
If τ > 0:
- (1): The covariance matrix estimation as a generic Julia
Matrixobject. - (2): A vector of $K$ covariance matrix estimations as a vector of generic Julia
Matrixobjects.
If τ = 0 (default):
- (1): The covariance matrix estimation as a Julia Hermitian matrix
- (2): A vector of $K$ covariance matrix estimations as an HermitianVector type.
Examples
using Eegle # or using Eegle.BCI
# Method (1)
C = covmat(randn(128, 19)) # linear shrinkage estimator by default
C = covmat(randn(128, 19); covtype=SCM)
C = covmat(randn(128, 19); covtype=:Tyler)
C = covmat(randn(128, 19), 1) # lagged autocovariance matrix
# not to be confused with the covariance matrix of lag-embedded data:
C = covmat(randn(128, 19); covtype=SCM, lags = 1)
# Suppose you want the SCM of the forward first-order differences.
# We can then use the standard `diff` Julia function:
C = covmat(diff(randn(128, 19); dims=1); covtype=SCM)
# et cetera.
# Method (2)
𝐗 = [randn(128, 19) for k=1:10]
𝐂 = covmat(𝐗)
C = covmat(𝐗; covtype=SCM)
C = covmat(𝐗; covtype=:Tyler)
## using an example file in NY format provided with Eegle:
## read a P300 BCI session, extract the trials and
## compute the covariance matrices of all using
## standard settings for P300 BCI data.
C = covmat(readNY(EXAMPLE_P300_1; bandPass=(1, 24), upperLimit=1.2).trials)Eegle.BCI.encode — Function
function encode(o::EEG;
paradigm::Symbol = o.paradigm,
covtype = LShrLW,
targetLabel::String = "target",
overlapping::Bool = false,
weights = :a,
pcadim::Int = 8,
standardize::Bool = false,
lags::Int = 0,
tikh :: Union{Real, Int} = 0,
useBLAS :: Bool = true,
threaded = true,
reg::Symbol = :rmt,
tol::Real = 1e-6,
maxiter::Int = 200,
verbose::Bool = false)Encode all trials in an EEG data structure as covariance matrices according to a given BCI paradigm. This is used in Riemannian geometry machine learning. The supported BCI paradigms are Motor Imagery (MI), Event-Related Potentials (ERP) and P300. For details, see Appendix I in (Congedo et al., 2017).
Arguments
o: an instance of theEEGdata structure containing trials and metadata.
Optional Keyword Arguments
paradigm: BCI paradigm, either:ERP,:P300, or:MI. By default it is used the paradigm stored ino.- for
:ERP, prototypes for all classes are stacked and covariance is computed on super-trials - for
:P300, only the target class prototype is stacked - for
:MI, no prototype is used; covariance is computed on the trial as it is.
- for
covtype,standardize,lags,useBLAS,reg,tol,maxiterandverbose— seeEegle.BCI.covmat, to which they are passed.targetLabel: label of the target class (for P300 paradigm only). By default is "target", following the conventions of the FII BCI corpus.overlapping: for prototype mean ERP estimations (ERP/P300 only). Default = false:- if true, use multivariate regression
- if false, use the arithmetic average — see
mean.
weights: weights for prototype mean ERP estimations (ERP/P300 only). Default =:a— seemeanpcadim: number of PCA components of the prototype. They replace the prototype (ERP/P300 paradigms only, default = 0, which does not apply PCA)standardize: standardize trials and prototype (global mean 0 and sd 1) before covariance estimation (default: false)tikh: Tikhonov regularization parameter (0, the default, does not apply regularization). It is applied after covariance estimationthreaded: enable multi-threaded covariance estimations across trials (default: true).
Return A vector of $k$ covariance matrix estimations as a HermitianVector type.
Examples
using Eegle # or using Eegle.BCI
# EXAMPLE_P300_1 is an example file provided by Eegle
o = readNY(EXAMPLE_P300_1; bandPass=(1, 24), upperLimit=1.2)
C = encode(o)PosDefManifoldML.crval — Function
function crval( filename :: AbstractString,
model :: MLmodel = MDM(Fisher);
# Arguments passed to both encode and crval
verbose :: Bool = true,
threaded :: Bool = true,
# Arguments passed to readNY
toFloat64 :: Bool = true,
bandStop :: Tuple = (),
bandPass :: Tuple = (),
bsDesign :: DSP.ZeroPoleGain = Butterworth(8),
bpDesign :: DSP.ZeroPoleGain = Butterworth(4),
rate :: Union{Real, Rational, Int} = 1,
upperLimit :: Union{Real, Int} = 0,
classes :: Union{Bool, Vector{String}} = true,
stdClass :: Bool = true,
# Arguments passed to encode
covtype = LShrLW,
targetLabel :: String = "target",
overlapping :: Bool = false,
weights = :a,
pcadim :: Int = 8,
standardize :: Bool = false,
lags :: Int = 0,
tikh :: Union{Real, Int} = 0,
useBLAS :: Bool = true,
reg :: Symbol = :rmt,
tol :: Real = 1e-6,
maxiter :: Int = 200,
# Arguments passed to crval
pipeline :: Union{Pipeline, Nothing} = nothing,
nFolds :: Int = 8,
seed :: Int = 0,
scoring :: Symbol = :b,
hypTest :: Union{Symbol, Nothing} = :Bayle,
outModels :: Bool = false,
fitArgs...)Perform cross-validations of a BCI session stored in NY format.
This function runs in sequence the following three functions:
Eegle.InOut.readNYencode(in this module)- crval (in PosDefManifoldML.jl)
Arguments
filename: the complete path of either the .npz or the .yml file of the session to be usedmodel: any classifier of type MLmodel. Default: the default MDM classifier.
The BCI paradigm is assumed to be the one stored in the metadata of file filename (either :ERP, :P300, or :MI)
Optional Keyword Arguments
A reminder only is given here. For details, see the function each kwarg is passed to.
- The following kwargs are passed to
Eegle.InOut.readNYfor reading and pre-processing the data:toFloat64: conversion of data toFloat64bandStop,bandPass,bsDesign,bpDesign: filter settingsrate: resamplingupperLimit: artifact rejectionclasses: classes of the trials to be read from the file. Default: all available classesstdClass: standardization of class labels according to Eegle's conventions
- the following kwargs are passed to
encodeto encode the trials as covariance matrices:covtype: type of covariance matrix estimationtargetLabel: label of the target class (for the P300 paradigm only). Default:targetoverlapping: type of mean target ERP estimator used as a prototype (ERP and P300 only)weights: adaptive weighted mean target ERP estimation (ERP and P300 only)pcadim: dimensionality reduction of the prototype by PCA (ERP and P300 only)standardize: standardization the trials before estimating the covariance matriceslags: lags embeddingtikh: Tikhonov regularization of the covariance matricesuseBLAS: use BLAS for computing the SCM covariance estimatorreg: ,tol,maxiter,verbose: options for covariance M-Estimators.
- the following kwargs are passed to crval:
pipeline: pre-conditioners for hastening the computationsnFolds: number of cross-validation stratified foldsscoring: performance index to be computedseed: generation of the foldshypTest: statistical test of the performance against the chance leveloutModels: modulation of the outputfitArgs...: additional arguments handed to thefitfunction of themodel.
- the following are passed to both
encodeandcrval:verbose: print informations about some computationsthreaded: run the functions in multithreaded mode (incrvalit is named with unicode character ⏩).
Function crval hands any additional kwargs to the fit function of the model. See crval for details. If you pass an invalid arguments, an error will be raised.
It is good practice to inform the classes argument when comparing the performance obtained running this function across several different sessions. Only this ensures that the performance estimated on the same classes is considered in all sessions.
Return
If outModels is false (default), a CVres structure with the results of the cross-validation, otherwise a 2-tuple holding this CVres structure and a vector of the nFolds models fitted for each fold (of type MLmodel).
Examples
using Eegle
# Using the example files provided by Eegle.
# Averege accuracy is reported in square brackets
# a) P300 data: standard pipeline (MDM classifier)
crval(EXAMPLE_P300_1; bandPass = (1, 24)) # [0.711]
# a) with a random shuffling to generate folds
crval(EXAMPLE_P300_1; bandPass = (1, 24), seed = 1234) # [0.685]
## b) with artifact rejection
args = (bandPass = (1, 24), upperLimit = 1)
crval(EXAMPLE_P300_1; args...) # [0.723]
## b) using a 5-fold cross-validation (instead of 8-fold default)
crval(EXAMPLE_P300_1, MDM(); nFolds=5, args...) # [0.702]
## b) using the Log-Euclidean metric for the MDM classifier
crval(EXAMPLE_P300_1, MDM(logEuclidean); args...) # [0.663]
## b) with artifact rejection and pre-conditioning
pipeline = @→ Recenter(; eVar=0.999) → Compress → Shrink(Fisher)
crval(EXAMPLE_P300_1, MDM(Euclidean); pipeline, args...) # [0.719]
## b) using a Ridge logistic regression model in the tangent space (TS)
crval(EXAMPLE_P300_1, ENLR(; alpha = 0); args...) # [non-deterministic]
## b) using a LASSO logistic regression model in the TS
crval(EXAMPLE_P300_1, ENLR(); args...) # [non-deterministic]
## b) with Recentering and projecting the data onto the TS at the
## identity matrix (this avoids the computation of the barycenter)
crval(EXAMPLE_P300_1, ENLR(); meanISR=I, args...) # [non-deterministic]
# ====================================
# c) Motor Imagery data: standard pipeline (MDM classifier) with
# artifact rejection, using classes "feet" and "right_hand"
args = (bandPass = (8, 32), upperLimit = 1, classes=["feet", "right_hand"])
crval(EXAMPLE_MI_1; args...) # [0.74]
## c) with a very fast pre-conditioning
pipeline = @→ Recenter → Equalize
crval(EXAMPLE_MI_1, MDM(Euclidean); pipeline, args...) # [0.729]
...