cv.jl

This unit implements cross-validation procedures for estimating the accuracy and balanced accuracy of machine learning models. It also holds the documentation of the fit and predict functions, since these are common to all models.

Content

struct      description
------      -----------
CVacc       encapsulates the result of cross-validation procedures for estimating accuracy

function    description
--------    -----------
fit         fit a model with training data, or create and fit it
predict     predict labels, probabilities or scoring functions on test data
cvAcc       estimate the accuracy of a model by cross-validation
cvSetup     generate indices for performing cross-validations
PosDefManifoldML.CVacc — Type
struct CVacc
    cvType    :: String
    scoring   :: Union{String, Nothing}
    modelType :: Union{String, Nothing}
    cnfs      :: Union{Vector{Matrix{T}}, Nothing} where T<:Real
    avgCnf    :: Union{Matrix{T}, Nothing} where T<:Real
    accs      :: Union{Vector{T}, Nothing} where T<:Real
    avgAcc    :: Union{Real, Nothing}
    stdAcc    :: Union{Real, Nothing}
end

A call to cvAcc results in an instance of this structure. Fields:

.cvType is the type of cross-validation technique, given as a string (e.g., "10-kfold")

.scoring is the type of accuracy that is computed, given as a string. This has been passed as argument to cvAcc. Currently accuracy and balanced accuracy are supported.

.modelType is the type of the machine learning model used for performing the cross-validation, given as a string.

.cnfs is a vector of matrices holding the confusion matrices obtained at each fold of the cross-validation.

.avgCnf is the average confusion matrix across the folds of the cross-validation.

.accs is a vector of real numbers holding the accuracies obtained at each fold of the cross-validation.

.avgAcc is the average accuracy across the folds of the cross-validation.

.stdAcc is the standard deviation of the accuracy across the folds of the cross-validation.
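For example, a quick way to run a cross-validation and inspect the resulting CVacc structure (a minimal sketch; the data are toy data generated as in the Examples further below):

using PosDefManifoldML, PosDefManifold

# generate some data and run a 10-fold cross-validation
PTr, PTe, yTr, yTe=gen2ClassData(10, 30, 40, 60, 80)
cv=cvAcc(MDM(Fisher), PTr, yTr)

# inspect the fields
cv.avgAcc    # average accuracy across folds
cv.stdAcc    # standard deviation of the accuracy
cv.accs      # accuracy at each fold
cv.avgCnf    # average confusion matrix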

source
StatsBase.fit — Function
function fit(model :: MDMmodel,
              𝐏Tr   :: ℍVector,
              yTr   :: IntVector;
       w        :: Vector = [],
       ✓w       :: Bool  = true,
       meanInit :: Union{ℍVector, Nothing} = nothing,
       tol      :: Real  = 1e-5,
       verbose  :: Bool  = true,
       ⏩       :: Bool  = true)

Fit an MDM machine learning model, with training data 𝐏Tr, of type ℍVector, and corresponding labels yTr, of type IntVector. Return the fitted model.

Labels must be provided using the natural numbers, i.e., 1 for the first class, 2 for the second class, etc.

Fitting an MDM model involves only computing a mean of all the matrices in each class. Those class means are computed according to the metric specified by the MDM constructor.

Optional keyword argument w is a vector of non-negative weights associated with the matrices in 𝐏Tr. These weights are used to compute the mean for each class. See method (3) of the mean function for the meaning of the arguments w, ✓w and ⏩, to which they are passed. Keep in mind that here the weights should sum up to 1 separately for each class, which is what is ensured by this function if ✓w is true.

Optional keyword argument tol is the tolerance required for those algorithms that compute the mean iteratively (those adopting the Fisher, logdet0 or Wasserstein metric). It defaults to 1e-5. For details on this argument, see the functions that are called for computing the means.

For those algorithms an initialization can be provided with optional keyword argument meanInit. If provided, this must be a vector of Hermitian matrices of the ℍVector type and must contain as many initializations as classes, in the natural order corresponding to the class labels (see above).

If verbose is true (default), information is printed in the REPL. This option is included to allow repeated calls to this function without crowding the REPL.

See: notation & nomenclature, the ℍVector type.

See also: predict, cvAcc.

Examples

using PosDefManifoldML, PosDefManifold

# generate some data
PTr, PTe, yTr, yTe=gen2ClassData(10, 30, 40, 60, 80, 0.25)

# create and fit a model:
m=fit(MDM(Fisher), PTr, yTr)
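
# As a sketch of the w argument discussed above, non-uniform weights
# can be given to the training matrices (arbitrary values, for illustration only):
m=fit(MDM(Fisher), PTr, yTr; w=[rand() for i=1:length(yTr)])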
source
function fit(model	:: ENLRmodel,
             𝐏Tr	 :: Union{ℍVector, Matrix{Float64}},
             yTr	:: IntVector;
	# parameters for projection onto the tangent space
	w		:: Union{Symbol, Tuple, Vector} = [],
	meanISR	:: Union{ℍ, Nothing} = nothing,
	meanInit:: Union{ℍ, Nothing} = nothing,
	vecRange:: UnitRange = 𝐏Tr isa ℍVector ? (1:size(𝐏Tr[1], 2)) : (1:size(𝐏Tr, 2)),
	fitType	:: Symbol = :best,
	verbose	:: Bool = true,
	⏩	   :: Bool = true,
	# arguments for `GLMNet.glmnet` function
	alpha			:: Real = model.alpha,
	weights			:: Vector{Float64} = ones(Float64, length(yTr)),
	intercept		:: Bool = true,
	standardize		:: Bool = true,
	penalty_factor	:: Vector{Float64} = ones(Float64, _getDim(𝐏Tr, vecRange)),
	constraints		:: Matrix{Float64} = [x for x in (-Inf, Inf), y in 1:_getDim(𝐏Tr, vecRange)],
	offsets			:: Union{Vector{Float64}, Nothing} = nothing,
	dfmax			:: Int = _getDim(𝐏Tr, vecRange),
	pmax			:: Int = min(dfmax*2+20, _getDim(𝐏Tr, vecRange)),
	nlambda			:: Int = 100,
	lambda_min_ratio:: Real = (length(yTr)*2 < _getDim(𝐏Tr, vecRange) ? 1e-2 : 1e-4),
	lambda			:: Vector{Float64} = Float64[],
	tol				:: Real = 1e-5,
	maxit			:: Int = 1000000,
	algorithm		:: Symbol = :newtonraphson,
	# selection method
	λSelMeth	:: Symbol = :sd1,
	# arguments for `GLMNet.glmnetcv` function
	nfolds		:: Int = min(10, div(size(yTr, 1), 3)),
	folds		:: Vector{Int} =
	begin
		n, r = divrem(size(yTr, 1), nfolds)
		shuffle!([repeat(1:nfolds, outer=n); 1:r])
	end,
	parallel 	:: Bool=true)

Create and fit an ENLR machine learning model, with training data 𝐏Tr, of type ℍVector, and corresponding labels yTr, of type IntVector. Return the fitted model(s) as an instance of the ENLR structure.

As for all ML models acting in the tangent space, fitting an ENLR model involves computing a mean of all the matrices in 𝐏Tr, mapping all matrices onto the tangent space after parallel transporting them at the identity matrix and vectorizing them using the vecP operation. Once this is done, the elastic net logistic regression is fitted.

The mean is computed according to the .metric field of the model, with optional weights w. The .metric field of the model is passed to the tsMap function. By default the metric is the Fisher metric. See the examples here below for how to change the metric. See mdm.jl or check out directly the documentation of PosDefManifold.jl for the available metrics.

Optional keyword arguments

By default, uniform weights will be given to all observations for computing the mean that is used to project onto the tangent space. This is equivalent to passing as argument w=:uniform (or w=:u). You can also pass as argument (a sketch of all these options follows the list below):

  • w=:balanced (or simply w=:b). If the two classes are unbalanced, the weights should be inversely proportional to the number of examples for each class, in such a way that each class contributes equally to the computation of the mean. This is equivalent to passing w=tsWeights(yTr). See the tsWeights function for details.
  • w=v, where v is a user defined vector of non-negative weights for the observations, thus, v must contain the same number of elements as yTr. For example, w=[1.0, 1.0, 2.0, 2.0, ...., 1.0]
  • w=t, where t is a 2-tuple of real weights, one weight for each class, for example w=(0.5, 1.5). This is equivalent to passing w=tsWeights(yTr; classWeights=collect(t)), see the tsWeights function for details.
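
A minimal sketch of these ways of passing w (data generated as in the Examples further below):

# uniform weights (the default, same as omitting w)
m=fit(ENLR(), PTr, yTr; w=:uniform)

# balancing weights, equivalent to w=tsWeights(yTr)
m=fit(ENLR(), PTr, yTr; w=:b)

# a user-defined vector of non-negative weights (arbitrary values, for illustration only)
m=fit(ENLR(), PTr, yTr; w=[rand() for i=1:length(yTr)])

# a 2-tuple of class weights, equivalent to w=tsWeights(yTr; classWeights=[0.5, 1.5])
m=fit(ENLR(), PTr, yTr; w=(0.5, 1.5))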

If meanISR is passed as argument, the mean is not computed; instead, this matrix is used as the inverse square root (ISR) of the mean for projecting the matrices onto the tangent space (see tsMap). Passed or computed, the inverse square root (ISR) of the mean will be written in the .meanISR field of the created ENLR structure. If meanISR is not provided and the .metric field of the model is Fisher, logdet0 or Wasserstein, the tolerance of the iterative algorithm used to compute the mean is set to the argument passed as tol (default 1e-5). Also, in this case a particular initialization for those iterative algorithms can be provided as an Hermitian matrix with argument meanInit.

This function also allows one to fit a model passing as training data 𝐏Tr directly a matrix of feature vectors, where each feature vector is a row of the matrix. In this case the metric of the ENLR model and argument meanISR are not used. Therefore, the .meanISR field of the created ENLR structure will be set to nothing.
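
As a sketch of these two options (assuming the mean and invsqrt functions of PosDefManifold for computing the base point and its ISR; check that package's documentation):

# pre-compute the Fisher mean of the training set and pass its ISR
G=mean(Fisher, PTr)                                 # `mean` from PosDefManifold (assumption)
m=fit(ENLR(Fisher), PTr, yTr; meanISR=invsqrt(G))   # `invsqrt` from PosDefManifold (assumption)

# fit directly on a matrix of hypothetical feature vectors (one per row);
# the metric and meanISR are not used in this case
X=randn(100, 15)     # 100 hypothetical feature vectors of dimension 15
y=rand(1:2, 100)     # hypothetical labels
m=fit(ENLR(), X, y)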

If a UnitRange is passed with optional keyword argument vecRange, then: if 𝐏Tr is a vector of Hermitian matrices, the vectorization of those matrices once they are projected onto the tangent space concerns only the rows (or columns) given in the specified range; if instead 𝐏Tr is a matrix with feature vectors arranged in its rows, only the columns of 𝐏Tr given in the specified range will be used.

If fitType = :best (default), a cross-validation procedure is run to find the best lambda hyperparameter for the given training data. This finds a single model that is written into the .best field of the ENLR structure that will be created.

If fitType = :path, the regularization path for several values of the lambda hyperparameter is found for the given training data. This creates several models, which are written into the .path field of the ENLR structure that will be created, none of which is optimal, in the cross-validation sense, for the given training data.

If fitType = :all, both the above fits are performed and all fields of the ENLR structure that will be created will be filled in.

If verbose is true (default), information is printed in the REPL. This option is included to allow repeated calls to this function without crowding the REPL.

The ⏩ argument (true by default) is passed to the tsMap function for projecting the matrices in 𝐏Tr onto the tangent space and to the GLMNet.glmnetcv function to run the inner cross-validation finding the best model using multi-threading.

The remaining optional keyword arguments are:

  • the arguments passed to the GLMNet.glmnet function for fitting the models. Those are always used.

  • the λSelMeth argument and the arguments passed to the GLMNet.glmnetcv function for finding the best lambda hyperparameter by cross-validation. Those are used only if fitType = :path or = :all.

Optional keyword arguments for fitting the model(s) using GLMNet

alpha: the hyperparameter in [0, 1] trading off the elastic-net model. α=0 requests a pure ridge model and α=1 a pure lasso model. This defaults to 1.0, which specifies a lasso model, unless the input ENLR model has another value in the alpha field, in which case that value is used. If argument alpha is passed here, it will overwrite the alpha field of the input model.

weights: a vector of weights for each matrix (or feature vector), of the same size as yTr. It defaults to 1 for all matrices.

intercept: whether to fit an intercept term. The intercept is always unpenalized. Defaults to true.

standardize: if true (default), GLMNet standardizes the predictors (presumably, this amounts to transforming them to unit variance) so that they are in the same units. This is a common choice for regularized regression models.

penalty_factor: a vector of length n(n+1)/2, where n is the dimension of the original PD matrices on which the model is applied, of penalties for each predictor in the tangent vectors. This defaults to all ones, which weights each predictor equally. To specify that a predictor should be unpenalized, set the corresponding entry to zero.

constraints: a 2 x [n(n+1)/2] matrix specifying lower bounds (first row) and upper bounds (second row) on each predictor. By default, this is [-Inf, Inf] for each predictor (each element of tangent vectors).
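
As a sketch, for input PD matrices of dimension n the tangent vectors have n(n+1)/2 elements, so these two arguments can be built as follows (hypothetical values, for illustration only):

n=10                      # dimension of the input PD matrices
d=n*(n+1)÷2               # dimension of the tangent vectors

pf=ones(d); pf[1]=0.0     # leave the first predictor unpenalized (illustration only)
cstr=[x for x in (-Inf, Inf), y in 1:d]   # no constraints (same layout as the default above)

m=fit(ENLR(), PTr, yTr; penalty_factor=pf, constraints=cstr)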

offsets: see the documentation of the original GLMNet package 🎓.

dfmax: The maximum number of predictors in the largest model.

pmax: The maximum number of predictors in any model.

nlambda: The number of values of λ along the path to consider.

lambda_min_ratio: The smallest λ value to consider, as a ratio of the value of λ that gives the null model (i.e., the model with only an intercept). If the number of observations exceeds the number of variables, this defaults to 0.0001, otherwise 0.01.

lambda: The λ values to consider for fitting. By default, this is determined from nlambda and lambda_min_ratio.

tol: the convergence criterion for both the computation of a mean for projecting onto the tangent space (if the metric requires an iterative algorithm) and for the GLMNet fitting algorithm. It defaults to 1e-5. In order to speed up computations, you may try setting a higher (less strict) tol; the convergence will be faster but coarser, with a possible drop of classification accuracy, depending on the signal-to-noise ratio of the input features.

maxit: The maximum number of iterations of the cyclic coordinate descent algorithm. If convergence is not achieved, a warning is returned.

algorithm: the algorithm used to find the regularization path. Possible values are :newtonraphson (default) and :modifiednewtonraphson.

For further information on those arguments, refer to the resources on the GLMNet package 🎓.

Optional keyword arguments for finding the best model by cv

If λSelMeth = :sd1 (default), the best model is defined as the one allowing the highest cvλ.meanloss within one standard deviation of the minimum, otherwise it is defined as the one allowing the minimum cvλ.meanloss. Note that in selecting a model, the model with only the intercept term, if it exists, is ignored. See ENLRmodel for a description of the .cvλ field of the model structure.

Arguments nfolds and folds are passed to the GLMNet.glmnetcv function along with the ⏩ argument. Please refer to the resources on GLMNet for details 🎓.

See: notation & nomenclature, the ℍVector type.

See also: predict, cvAcc.

Tutorial: Example using the ENLR model.

Examples

using PosDefManifoldML, PosDefManifold

# generate some data
PTr, PTe, yTr, yTe=gen2ClassData(10, 30, 40, 60, 80, 0.1)

# Fit an ENLR lasso model and find the best model by cross-validation:
m=fit(ENLR(), PTr, yTr)

# ... balancing the weights for tangent space mapping
m=fit(ENLR(), PTr, yTr; w=tsWeights(yTr))

# ... using the log-Euclidean metric for tangent space projection
m=fit(ENLR(logEuclidean), PTr, yTr)

# Fit an ENLR ridge model and find the best model by cv:
m=fit(ENLR(Fisher), PTr, yTr; alpha=0)

# Fit an ENLR elastic-net model (α=0.9) and find the best model by cv:
m=fit(ENLR(Fisher), PTr, yTr; alpha=0.9)

# Fit an ENLR lasso model and its regularization path:
m=fit(ENLR(), PTr, yTr; fitType=:path)

# Fit an ENLR lasso model, its regularization path
# and the best model found by cv:
m=fit(ENLR(), PTr, yTr; fitType=:all)
source
function fit(model     :: SVMmodel,
               𝐏Tr     :: Union{ℍVector, Matrix{Float64}},
               yTr     :: IntVector=[];
	# parameters for projection onto the tangent space
	w :: Union{Symbol, Tuple, Vector} = [],
	meanISR :: Union{ℍ, Nothing} = nothing,
	meanInit :: Union{ℍ, Nothing} = nothing,
	vecRange	:: UnitRange = 𝐏Tr isa ℍVector ? (1:size(𝐏Tr[1], 2)) : (1:size(𝐏Tr, 2)),
	# SVM parameters
	svmType :: Type = SVC,
	kernel :: Kernel.KERNEL = RadialBasis,
	epsilon :: Float64 = 0.1,
	cost	:: Float64 = 1.0,
	gamma :: Float64	= 1/_getDim(𝐏Tr, vecRange),
	degree :: Int64	= 3,
	coef0 :: Float64	= 0.,
	nu :: Float64 = 0.5,
	shrinking :: Bool = true,
	probability :: Bool = false,
	weights :: Union{Dict{Int, Float64}, Nothing} = nothing,
	cachesize :: Float64	= 200.0,
	# Generic and common parameters
	tol :: Real = 1e-5,
	rescale :: Tuple	= (-1, 1),
	verbose :: Bool = true,
	⏩ :: Bool = true)

Create and fit an SVM machine learning model, with training data 𝐏Tr, of type ℍVector, and corresponding labels yTr, of type IntVector. The label vector can be omitted if the svmType is OneClassSVM (see SVM). Return the fitted model as an instance of the SVM structure.

As for all ML models acting in the tangent space, fitting an SVM model involves computing a mean of all the matrices in 𝐏Tr, mapping all matrices onto the tangent space after parallel transporting them at the identity matrix and vectorizing them using the vecP operation. Once this is done, the support-vector machine is fitted.

Arguments w, meanISR, meanInit and vecRange allow tuning the projection onto the tangent space. See the documentation of the fit function for the ENLR model here above for their meaning.

svmType and kernel allow choosing among several available SVM models. See the documentation of the SVM structure.

epsilon, with default 0.1, is the ε in the loss function of the epsilonSVR SVM model.

cost, with default 1.0, is the cost parameter C of SVC, epsilonSVR, and nuSVR SVM models.

gamma, defaulting to 1 divided by the length of the feature vectors, is the γ parameter for RadialBasis, Polynomial and Sigmoid kernels.

degree, with default 3, is the degree for Polynomial kernels.

coef0, zero by default, is a parameter for the Sigmoid and Polynomial kernels.

nu, with default 0.5, is the parameter ν of nuSVC, OneClassSVM, and nuSVR SVM models. It should be in the interval (0, 1].

shrinking, true by default, sets whether to use the shrinking heuristics.

probability, false by default, sets whether to train an SVC or SVR model allowing probability estimates.

If a Dict{Int, Float64} is passed as the weights argument, it will be used to give weights to the classes. By default it is equal to nothing, implying equal weights for all classes.

cachesize for the kernel, 200.0 by default (in MB), can be increased for very large problems.

tol is the convergence criterion for both the computation of a mean for projecting onto the tangent space (if the metric requires an iterative algorithm) and for the LIBSVM fitting algorithm. It defaults to 1e-5.

rescale is a 2-tuple holding the lower and upper limits within which the feature vectors are rescaled. The default is (-1, 1), since tangent vectors of PD matrices have positive and negative elements. If 𝐏Tr is a feature matrix and the features are only positive, use (0, 1) instead (see the sketch at the end of the Examples below). In order not to rescale the feature vectors, use ().

If verbose is true (default), information is printed in the REPL. This option is included to allow repeated calls to this function without crowding the REPL. It may not work properly in a multi-threaded context (see the ⏩ argument here below).

The ⏩ argument (true by default) is passed to the tsMap function for projecting the matrices in 𝐏Tr onto the tangent space and to the LIBSVM function that performs the fit, in order to run them in multi-threaded mode.

For further information on the LIBSVM arguments, refer to the resources on the LIBSVM package 🎓.

See: notation & nomenclature, the ℍVector type.

See also: predict, cvAcc.

Tutorial: Example using SVM models.

Examples

using PosDefManifoldML, PosDefManifold

# generate some data
PTr, PTe, yTr, yTe=gen2ClassData(10, 30, 40, 60, 80, 0.1);

# Fit an SVC SVM model and find the best model by cross-validation:
m=fit(SVM(), PTr, yTr)

# ... balancing the weights for tangent space mapping
m=fit(SVM(), PTr, yTr; w=:b)

# ... using the log-Euclidean metric for tangent space projection
m=fit(SVM(logEuclidean), PTr, yTr)

# ... using the linear kernel
m=fit(SVM(logEuclidean), PTr, yTr, kernel=Linear)

# or

m=fit(SVM(logEuclidean; kernel=Linear), PTr, yTr)

# ... using the Nu-Support Vector Classification
m=fit(SVM(logEuclidean), PTr, yTr, kernel=Linear, svmType=NuSVC)

# or

m=fit(SVM(logEuclidean; kernel=Linear, svmType=NuSVC), PTr, yTr)

# N.B. all other keyword arguments must be passed to the fit function
# and not to the SVM constructor.
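
# As a sketch of the rescale argument discussed above, fitting on a
# matrix of hypothetical non-negative feature vectors:
X=abs.(randn(100, 15))   # hypothetical non-negative feature vectors (one per row)
y=rand(1:2, 100)         # hypothetical labels
m=fit(SVM(), X, y; rescale=(0, 1))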

source
StatsBase.predict — Function
function predict(model  :: MDMmodel,
                 𝐏Te    :: ℍVector,
                 what   :: Symbol = :labels;
        verbose :: Bool = true,
        ⏩     :: Bool = true)

Given an MDM model trained (fitted) on z classes and a testing set of k positive definite matrices 𝐏Te of type ℍVector,

if what is :labels or :l (default), return the predicted class labels for each matrix in 𝐏Te, as an IntVector. For MDM models, the predicted class 'label' of an unlabeled matrix is the serial number of the class whose mean is the closest to the matrix (minimum distance to mean). The labels are '1' for class 1, '2' for class 2, etc;

if what is :probabilities or :p, return the predicted probabilities for each matrix in 𝐏Te to belong to each class, as a k-vector of z vectors holding reals in [0, 1] (probabilities). The 'probabilities' are obtained by passing the negative squared distances of each unlabeled matrix to all class means through a softmax function;

if what is :f or :functions, return the output functions of the model. The 'functions' are given by the ratio of the squared distances to all class means to their geometric mean.
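
As a sketch of how the 'probabilities' are obtained, given hypothetical squared distances of an unlabeled matrix to z=2 class means:

d²=[1.2, 0.3]    # hypothetical squared distances to the 2 class means
e=exp.(-d²)      # softmax of the negative squared distances...
p=e ./ sum(e)    # ...yields probabilities summing up to 1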

If verbose is true (default), information is printed in the REPL. This option is included to allow repeated calls to this function without crowding the REPL.

If ⏩ is true (default), the computation of distances is multi-threaded.

See: notation & nomenclature, the ℍVector type.

See also: fit, cvAcc, predictErr.

Examples

using PosDefManifoldML, PosDefManifold

# generate some data
PTr, PTe, yTr, yTe=gen2ClassData(10, 30, 40, 60, 80)

# create and fit an MDM model
m=fit(MDM(Fisher), PTr, yTr)

# predict labels
yPred=predict(m, PTe, :l)

# prediction error
predErr=predictErr(yTe, yPred)

# predict probabilities
predict(m, PTe, :p)

# output functions
predict(m, PTe, :f)
source
function predict(model   :: ENLRmodel,
		𝐏Te		:: Union{ℍVector, Matrix{Float64}},
		what		:: Symbol = :labels,
		fitType		:: Symbol = :best,
		onWhich		:: Int    = Int(fitType==:best);
	transfer	:: Union{ℍ, Nothing} = nothing,
	verbose		:: Bool = true,
	⏩		:: Bool = true)

Given an ENLR model trained (fitted) on 2 classes and a testing set of k positive definite matrices 𝐏Te of type ℍVector,

if what is :labels or :l (default), return the predicted class labels for each matrix in 𝐏Te, as an IntVector. Those labels are '1' for class 1 and '2' for class 2;

if what is :probabilities or :p, return the predicted probabilities for each matrix in 𝐏Te to belong to each class, as a k-vector of z vectors holding reals in [0, 1] (probabilities). The 'probabilities' are obtained by passing the output of the ENLR model and zero through a softmax function;

if what is :f or :functions, return the output function of the model, which is the raw output of the ENLR model.

If fitType = :best (default), the best model that has been found by cross-validation is used for prediction.

If fitType = :path,

  • if onWhich is a valid serial number for a model in the model.path, then this model is used for prediction;
  • if onWhich is zero, all models in the model.path will be used for prediction, thus the output will be multiplied by the number of models in model.path.

Argument onWhich has no effect if fitType = :best.

Nota Bene

By default, the fit function fits only the best model. If you want to use the fitType = :path option you need to invoke the fit function with optional keyword argument fitType=:path or fitType=:all. See the fit function for details.

Optional keyword argument transfer can be used to specify the principal inverse square root (ISR) of a new mean to be used as base point for projecting the matrices in 𝐏Te onto the tangent space. By default transfer is equal to nothing, implying that the base point will be the mean used to fit the model. Passing a new mean ISR allows the adaptation first described in Barachant et al. (2013). Typically transfer is the ISR of the mean of the matrices in 𝐏Te or of a subset of them. Notice that this actually performs transfer learning by parallel transporting both the training and test data to the identity matrix as defined in Zanini et al. (2018) and later taken up in Rodrigues et al. (2019) 🎓.
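
A sketch of this adaptation, using the ISR of the mean of the test set as the new base point (assuming the mean and invsqrt functions of PosDefManifold):

G=mean(Fisher, PTe)                              # `mean` from PosDefManifold (assumption)
yPred=predict(m, PTe, :l; transfer=invsqrt(G))   # `invsqrt` from PosDefManifold (assumption)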

If verbose is true (default), information is printed in the REPL. This option is included to allow repeated calls to this function without crowding the REPL.

If ⏩ = true (default) and 𝐏Te is an ℍVector type, the projection onto the tangent space is multi-threaded.

See: notation & nomenclature, the ℍVector type.

See also: fit, cvAcc, predictErr.

Examples

using PosDefManifoldML, PosDefManifold

# generate some data
PTr, PTe, yTr, yTe=gen2ClassData(10, 30, 40, 60, 80)

# fit an ENLR lasso model and find the best model by cv
m=fit(ENLR(Fisher), PTr, yTr)

# predict labels from the best model
yPred=predict(m, PTe, :l)
# prediction error
predErr=predictErr(yTe, yPred)

# predict probabilities from the best model
predict(m, PTe, :p)

# output functions from the best model
predict(m, PTe, :f)

# fit a regularization path for an ENLR lasso model
m=fit(ENLR(Fisher), PTr, yTr; fitType=:path)

# predict labels using a specific model
yPred=predict(m, PTe, :l, :path, 10)

# predict labels for all models
yPred=predict(m, PTe, :l, :path, 0)
# prediction error for all models
predErr=[predictErr(yTe, yPred[:, i]) for i=1:size(yPred, 2)]

# predict probabilities from a specific model
predict(m, PTe, :p, :path, 12)

# predict probabilities from all models
predict(m, PTe, :p, :path, 0)

# output functions from specific model
predict(m, PTe, :f, :path, 3)

# output functions for all models
predict(m, PTe, :f, :path, 0)
source
function predict(model :: SVMmodel,
				𝐏Te	:: Union{ℍVector, Matrix{Float64}},
				what	:: Symbol = :labels;
	transfer :: Union{ℍ, Nothing} = nothing,
	verbose	 :: Bool = true,
	⏩		:: Bool = true)

Given an SVM model trained (fitted) on 2 classes and a testing set of k positive definite matrices 𝐏Te of type ℍVector, return the predicted labels, probabilities or output functions, depending on the what argument.

For the meaning of the arguments what, transfer and verbose, see the documentation of the predict function for the ENLR model.

If ⏩ = true (default) and 𝐏Te is an ℍVector type, the projection onto the tangent space will be multi-threaded. Also, the prediction of the LIBSVM function will be multi-threaded.

See: notation & nomenclature, the ℍVector type.

See also: fit, cvAcc, predictErr.

Examples

using PosDefManifoldML, PosDefManifold

# generate some data
PTr, PTe, yTr, yTe=gen2ClassData(10, 30, 40, 60, 80)

# fit an SVM model
m=fit(SVM(Fisher), PTr, yTr)

# predict labels
yPred=predict(m, PTe, :l)
# prediction error
predErr=predictErr(yTe, yPred)

# predict probabilities
predict(m, PTe, :p)

# output functions
predict(m, PTe, :f)
source
PosDefManifoldML.cvAcc — Function
function cvAcc(model    :: MLmodel,
               𝐏Tr     :: ℍVector,
               yTr      :: IntVector;
           nFolds       :: Int      = min(10, length(yTr)÷3),
           scoring      :: Symbol   = :b,
           shuffle      :: Bool     = false,
           verbose      :: Bool     = true,
           outModels    :: Bool     = false,
           ⏩           :: Bool     = true,
           fitArgs...)

Cross-validation accuracy for a machine learning model: given an ℍVector 𝐏Tr holding k Hermitian matrices, an IntVector yTr holding the k labels for these matrices and the number of folds nFolds, return a CVacc structure.

Optional keyword arguments

nFolds by default is set to the minimum between 10 and the number of observations ÷ 3 (integer division).

If scoring=:b (default), the balanced accuracy is computed. Any other value will make the function return the regular accuracy. Balanced accuracy is to be preferred for unbalanced classes. For balanced classes the balanced accuracy reduces to the regular accuracy, therefore there is no point in using the regular accuracy, except to avoid a few unnecessary computations when the classes are balanced. A sketch of the two scorings follows.
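
For reference, a sketch of the two scorings computed from a hypothetical confusion matrix (balanced accuracy as the mean of the per-class recalls; this is the standard definition, with rows assumed here to hold the true classes):

C=[20 5; 10 15]                                 # hypothetical confusion matrix
acc=sum(C[i, i] for i=1:2) / sum(C)             # regular accuracy
bacc=sum(C[i, i] / sum(C[i, :]) for i=1:2) / 2  # balanced accuracy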

For the meaning of the shuffle argument (false by default), see function cvSetup, to which this argument is passed.

If verbose is true (default), information is printed in the REPL. This option is included to allow repeated calls to this function without crowding the REPL.

If outModels is true, return a 2-tuple holding a CVacc structure and an nFolds-vector of the models fitted for each fold; otherwise (default), return only a CVacc structure.

If ⏩ is true (default), the folds and some other computations are multi-threaded. Set it to false if there are problems in running this function.

fitArgs are optional keyword arguments that are passed to the fit function called for each fold of the cross-validation. For each machine learning model, all optional keyword arguments of its fit method are eligible to be passed here; however, the arguments listed in the following table for each model should not be passed. Note that if they are passed, they will be disabled:

MDM/MDMF    ENLR        SVM
--------    --------    --------
verbose     verbose     verbose
            meanISR     meanISR
            meanInit    meanInit
            fitType
            offsets
            lambda
            folds

Also, if you pass a w (weights for tangent space projection) argument, do not pass a vector of weights, just pass a symbol, e.g., w=:b for balancing weights.

See: notation & nomenclature, the ℍVector type.

See also: fit, predict.

Examples

using PosDefManifoldML, PosDefManifold

# generate some data
PTr, PTe, yTr, yTe=gen2ClassData(10, 30, 40, 60, 80)

# perform 10-fold cross-validation using the minimum distance to mean classifier
cv=cvAcc(MDM(Fisher), PTr, yTr)

# ...using the lasso logistic regression classifier
cv=cvAcc(ENLR(Fisher), PTr, yTr)

# ...using the support-vector machine classifier
cv=cvAcc(SVM(Fisher), PTr, yTr)

# ...with a Polynomial kernel of degree 3 (the default degree)
cv=cvAcc(SVM(Fisher), PTr, yTr; kernel=Kernel.Polynomial)

# perform 8-fold cross-validation instead
# (and see that you can go pretty fast if your PC has 8 threads)
cv=cvAcc(SVM(Fisher), PTr, yTr; nFolds=8)

# ...balance the weights for tangent space projection
cv=cvAcc(ENLR(Fisher), PTr, yTr; nFolds=8, w=:b)

# perform another cross-validation shuffling the folds
cv=cvAcc(ENLR(Fisher), PTr, yTr; shuffle=true, nFolds=8, w=:b)
source
PosDefManifoldML.cvSetup — Function
function cvSetup(k       :: Int,
                 nCV     :: Int;
                 shuffle :: Bool = false)

Given k elements and a parameter nCV, an nCV-fold cross-validation is obtained by defining nCV partitions of the k elements into nTest=k÷nCV (integer division) elements for the test set and k-nTest elements for the training set, in such a way that each element appears in only one test set.

Said differently, given a number of elements k and the number of desired cross-validations nCV, this function generates indices from the sequence of natural numbers 1,..,k to obtain all nCV-fold cross-validation sets. Specifically, it generates nCV vectors of indices for generating the test sets and nCV vectors of indices for generating the training sets.

If optional keyword argument shuffle is true, the sequence of natural numbers 1,..,k is shuffled before running the function, thus in this case two successive runs of this function will give different cross-validation sets, hence different accuracy scores. By default shuffle is false, so as to allow exactly the same result in successive runs. Note that no random initialization for the shuffling is provided, so the same random sequences can be replicated by restarting the random generation from scratch.
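
Assuming the shuffling relies on Julia's global random number generator, a seed can be set beforehand to replicate a given shuffled set across runs (a sketch using the standard library Random):

using Random

Random.seed!(1)   # replicate the same shuffled sets in successive runs
cvSetup(10, 2, shuffle=true)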

This function is used in cvAcc. It constitutes the fundamental basis for implementing customized cross-validation procedures (a sketch is given at the end of the Examples below).

Return the 2-tuple with:

  • A vector of nCV vectors holding the indices for the training sets,
  • A vector of nCV vectors holding the indices for the corresponding test sets.

Examples

using PosDefManifoldML, PosDefManifold

cvSetup(10, 2)
# return:
# (Array{Int64,1}[[6, 7, 8, 9, 10], [1, 2, 3, 4, 5]],
#  Array{Int64,1}[[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

cvSetup(10, 2, shuffle=true)
# return:
# (Array{Int64,1}[[5, 4, 6, 1, 9], [3, 7, 8, 2, 10]],
#  Array{Int64,1}[[3, 7, 8, 2, 10], [5, 4, 6, 1, 9]])

cvSetup(10, 3)
# return:
# (Array{Int64,1}[[4, 5, 6, 7, 8, 9, 10], [1, 2, 3, 7, 8, 9, 10], [1, 2, 3, 4, 5, 6]],
#  Array{Int64,1}[[1, 2, 3], [4, 5, 6], [7, 8, 9, 10]])
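
# As mentioned above, cvSetup can serve as the basis for customized
# cross-validation procedures. A minimal sketch of such a procedure,
# using the fit, predict and predictErr functions documented in this unit:

using PosDefManifoldML, PosDefManifold

# generate some data and pool it
PTr, PTe, yTr, yTe=gen2ClassData(10, 30, 40, 60, 80)
P, y=[PTr; PTe], [yTr; yTe]

# generate the training/test indices and loop over the folds
nCV=5
trainIdx, testIdx=cvSetup(length(y), nCV)

errors=Float64[]
for f=1:nCV
    m=fit(MDM(Fisher), P[trainIdx[f]], y[trainIdx[f]]; verbose=false)
    yPred=predict(m, P[testIdx[f]], :l; verbose=false)
    push!(errors, predictErr(y[testIdx[f]], yPred))
end
errors   # prediction error at each fold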
source