cv.jl
This unit implements cross-validation procedures for estimating the accuracy and balanced accuracy of machine learning models. It also reports the documentation of the fit and predict functions, as they are common to all models.
Content
struct | description |
---|---|
CVacc | encapsulate the result of cross-validation procedures for estimating accuracy |
function | description |
---|---|
fit | fit a model with training data, or create and fit it |
predict | predict labels, probabilities or scoring functions on test data |
cvAcc | estimate accuracy of a model by cross-validation |
cvSetup | generate indices for performing cross-validations |
PosDefManifoldML.CVacc
— Type
struct CVacc
cvType :: String
scoring :: Union{String, Nothing}
modelType :: Union{String, Nothing}
cnfs :: Union{Vector{Matrix{T}}, Nothing} where T<:Real
avgCnf :: Union{Matrix{T}, Nothing} where T<:Real
accs :: Union{Vector{T}, Nothing} where T<:Real
avgAcc :: Union{Real, Nothing}
stdAcc :: Union{Real, Nothing}
end
A call to cvAcc
results in an instance of this structure. Fields:
.cvType
is the type of cross-validation technique, given as a string (e.g., "10-kfold")
.scoring
is the type of accuracy that is computed, given as a string. This has been passed as argument to cvAcc
. Currently accuracy and balanced accuracy are supported.
.modelType
is the type of the machine learning model used for performing the cross-validation, given as a string.
.cnfs
is a vector of matrices holding the confusion matrices obtained at each fold of the cross-validation.
.avgCnf
is the average confusion matrix across the folds of the cross-validation.
.accs
is a vector of real numbers holding the accuracies obtained at each fold of the cross-validation.
.avgAcc
is the average accuracy across the folds of the cross-validation.
.stdAcc
is the standard deviation of the accuracy across the folds of the cross-validation.
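As a rough illustration of how the summary fields relate to the per-fold fields, the sketch below derives an average confusion matrix, a mean accuracy and a standard deviation from made-up per-fold results. It does not use the package. The confusion matrices are hypothetical counts, and the use of the corrected sample standard deviation (`Statistics.std`) is an assumption, not something stated above.

```julia
using LinearAlgebra, Statistics

# hypothetical per-fold confusion matrices (counts), one matrix per fold
cnfs = [[9 1; 2 8], [8 2; 1 9], [10 0; 2 8]]

# accuracy of each fold: correctly classified (diagonal) over all observations
accs = [tr(c) / sum(c) for c in cnfs]

avgCnf = sum(cnfs) / length(cnfs)   # element-wise average across folds
avgAcc = mean(accs)                 # average accuracy across folds
stdAcc = std(accs)                  # corrected (n-1) estimator: an assumption
```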
StatsBase.fit
— Function
function fit(model :: MDMmodel,
𝐏Tr :: ℍVector,
yTr :: IntVector;
w :: Vector = [],
✓w :: Bool = true,
meanInit :: Union{ℍVector, Nothing} = nothing,
tol :: Real = 1e-5,
verbose :: Bool = true,
⏩ :: Bool = true)
Fit an MDM
machine learning model, with training data 𝐏Tr
, of type ℍVector, and corresponding labels yTr
, of type IntVector. Return the fitted model.
Labels must be provided using the natural numbers, i.e., 1
for the first class, 2
for the second class, etc.
Fitting an MDM model involves only computing a mean of all the matrices in each class. Those class means are computed according to the metric specified by the MDM
constructor.
Optional keyword argument w
is a vector of non-negative weights associated with the matrices in 𝐏Tr
. These weights are used to compute the mean for each class. See method (3) of the mean function for the meaning of the arguments w
, ✓w
and ⏩
, to which they are passed. Keep in mind that here the weights should sum up to 1 separately for each class, which is what is ensured by this function if ✓w
is true.
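The requirement that weights sum up to 1 within each class can be sketched as follows. `normalizeByClass` is a hypothetical helper written for this illustration; it is not part of the package and only approximates what the check described above is said to ensure.

```julia
# Hypothetical helper (not the library's code): rescale non-negative weights
# so that the weights of each class sum up to 1.
function normalizeByClass(w::Vector{<:Real}, y::Vector{Int})
    v = float.(w)                     # work on a floating-point copy
    for c in unique(y)
        idx = findall(==(c), y)       # positions of the observations of class c
        s = sum(v[idx])
        v[idx] = v[idx] ./ s
    end
    return v
end

y = [1, 1, 1, 2, 2]
w = [1.0, 1.0, 2.0, 3.0, 1.0]
v = normalizeByClass(w, y)   # [0.25, 0.25, 0.5, 0.75, 0.25]
```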
Optional keyword argument tol
is the tolerance required for those algorithms that compute the mean iteratively (those adopting the Fisher, logdet0 or Wasserstein metric). It defaults to 1e-5. For details on this argument, see the documentation of the functions that are called for computing the means.
For those algorithms, an initialization can be provided with optional keyword argument meanInit
. If provided, this must be a vector of Hermitian
matrices of the ℍVector type and must contain as many initializations as classes, in the natural order corresponding to the class labels (see above).
If verbose
is true (default), information is printed in the REPL. This option is included to allow repeated calls to this function without crowding the REPL.
See: notation & nomenclature, the ℍVector type.
Examples
using PosDefManifoldML, PosDefManifold
# generate some data
PTr, PTe, yTr, yTe=gen2ClassData(10, 30, 40, 60, 80, 0.25)
# create and fit a model:
m=fit(MDM(Fisher), PTr, yTr)
function fit(model :: ENLRmodel,
𝐏Tr :: Union{ℍVector, Matrix{Float64}},
yTr :: IntVector;
# parameters for projection onto the tangent space
w :: Union{Symbol, Tuple, Vector} = [],
meanISR :: Union{ℍ, Nothing} = nothing,
meanInit:: Union{ℍ, Nothing} = nothing,
vecRange:: UnitRange = 𝐏Tr isa ℍVector ? (1:size(𝐏Tr[1], 2)) : (1:size(𝐏Tr, 2)),
fitType :: Symbol = :best,
verbose :: Bool = true,
⏩ :: Bool = true,
# arguments for `GLMNet.glmnet` function
alpha :: Real = model.alpha,
weights :: Vector{Float64} = ones(Float64, length(yTr)),
intercept :: Bool = true,
standardize :: Bool = true,
penalty_factor :: Vector{Float64} = ones(Float64, _getDim(𝐏Tr, vecRange)),
constraints :: Matrix{Float64} = [x for x in (-Inf, Inf), y in 1:_getDim(𝐏Tr, vecRange)],
offsets :: Union{Vector{Float64}, Nothing} = nothing,
dfmax :: Int = _getDim(𝐏Tr, vecRange),
pmax :: Int = min(dfmax*2+20, _getDim(𝐏Tr, vecRange)),
nlambda :: Int = 100,
lambda_min_ratio:: Real = (length(yTr)*2 < _getDim(𝐏Tr, vecRange) ? 1e-2 : 1e-4),
lambda :: Vector{Float64} = Float64[],
tol :: Real = 1e-5,
maxit :: Int = 1000000,
algorithm :: Symbol = :newtonraphson,
# selection method
λSelMeth :: Symbol = :sd1,
# arguments for `GLMNet.glmnetcv` function
nfolds :: Int = min(10, div(size(yTr, 1), 3)),
folds :: Vector{Int} =
begin
n, r = divrem(size(yTr, 1), nfolds)
shuffle!([repeat(1:nfolds, outer=n); 1:r])
end,
parallel :: Bool=true)
Create and fit an ENLR
machine learning model, with training data 𝐏Tr
, of type ℍVector, and corresponding labels yTr
, of type IntVector. Return the fitted model(s) as an instance of the ENLR
structure.
As for all ML models acting in the tangent space, fitting an ENLR model involves computing a mean of all the matrices in 𝐏Tr
, mapping all matrices onto the tangent space after parallel transporting them at the identity matrix and vectorizing them using the vecP operation. Once this is done, the elastic net logistic regression is fitted.
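The projection described above can be sketched with the standard library only. This is an illustration under stated assumptions, not the package's `tsMap` code: `invSqrt` and `tangentVector` are hypothetical helpers, the arithmetic mean is used merely as a stand-in for the mean of the chosen metric, and the vectorization (upper triangle, off-diagonal elements weighted by √2) is one common norm-preserving convention assumed here.

```julia
using LinearAlgebra

# Inverse square root (ISR) of a positive definite matrix via its eigendecomposition.
function invSqrt(M::Hermitian)
    λ, U = eigen(M)
    return Hermitian(U * Diagonal(1 ./ sqrt.(λ)) * U')
end

# Whiten P with the ISR of the mean (transport to the identity), take the matrix
# logarithm, then vectorize the upper triangle, weighting off-diagonal elements
# by sqrt(2) so that the Euclidean norm of the vector equals the Frobenius norm.
function tangentVector(P::Hermitian, isr::Hermitian)
    S = log(Hermitian(isr * P * isr))
    n = size(S, 1)
    return [i == j ? real(S[i, j]) : sqrt(2) * real(S[i, j]) for j in 1:n for i in 1:j]
end

A = [2.0 0.5; 0.5 1.0]
B = [1.0 0.2; 0.2 3.0]
M = Hermitian((A + B) / 2)              # arithmetic mean: a stand-in only
isr = invSqrt(M)
v = tangentVector(Hermitian(A), isr)    # n(n+1)/2 = 3 elements for n = 2
```

After whitening, the mean itself is mapped to the identity matrix, which is why the tangent space at the identity can be used for all matrices.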
The mean is computed according to the .metric
field of the model
, with optional weights w
. The .metric
field of the model
is passed to the tsMap
function. By default the metric is the Fisher metric. See the examples here below to see how to change metric. See mdm.jl or check out directly the documentation of PosDefManifold.jl for the available metrics.
Optional keyword arguments
By default, uniform weights will be given to all observations for computing the mean to pass in the tangent space. This is equivalent to passing as argument w=:uniform
(or w=:u
). You can also pass as argument:
- w=:balanced (or simply w=:b). If the two classes are unbalanced, the weights should be inversely proportional to the number of examples for each class, in such a way that each class contributes equally to the computation of the mean. This is equivalent to passing w=tsWeights(yTr). See the tsWeights function for details.
- w=v, where v is a user-defined vector of non-negative weights for the observations; thus, v must contain the same number of elements as yTr. For example, w=[1.0, 1.0, 2.0, 2.0, ...., 1.0]
- w=t, where t is a 2-tuple of real weights, one weight for each class, for example w=(0.5, 1.5). This is equivalent to passing w=tsWeights(yTr; classWeights=collect(t)); see the tsWeights function for details.
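The idea behind balanced weights can be sketched in a few lines. `balancedWeights` is a hypothetical helper, not the library's `tsWeights`; in particular, normalizing the weights so that they sum up to 1 is an assumption made for this example.

```julia
# Hypothetical helper: weights inversely proportional to the class size,
# normalized so that they sum up to 1.
function balancedWeights(y::Vector{Int})
    counts = Dict(c => count(==(c), y) for c in unique(y))
    w = [1 / counts[c] for c in y]
    return w ./ sum(w)
end

y = [1, 1, 1, 1, 2, 2]      # unbalanced classes: 4 vs. 2 observations
w = balancedWeights(y)

# each class now contributes equally to a weighted mean
sameContribution = sum(w[y .== 1]) ≈ sum(w[y .== 2])
```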
If meanISR
is passed as argument, the mean is not computed, instead this matrix is the inverse square root (ISR) of the mean used for projecting the matrices in the tangent space (see tsMap
). Passed or computed, the inverse square root (ISR) of the mean will be written in the .meanISR
field of the created ENLR
structure. If meanISR
is not provided and the .metric
field of the model
is Fisher, logdet0 or Wasserstein, the tolerance of the iterative algorithm used to compute the mean is set to the argument passed as tol
(default 1e-5). Also, in this case a particular initialization for those iterative algorithms can be provided as an Hermitian
matrix with argument meanInit
.
This function also allows to fit a model passing as training data 𝐏Tr
directly a matrix of feature vectors, where each feature vector is a row of the matrix. In this case the metric
of the ENLR model and argument meanISR
are not used. Therefore, the .meanISR
field of the created ENLR
structure will be set to nothing
.
If a UnitRange
is passed with optional keyword argument vecRange
, then if 𝐏Tr
is a vector of Hermitian
matrices, the vectorization of those matrices once they are projected onto the tangent space concerns only the rows (or columns) given in the specified range, else if 𝐏Tr
is a matrix with feature vectors arranged in its rows, then only the columns of 𝐏Tr
given in the specified range will be used.
If fitType
= :best
(default), a cross-validation procedure is run to find the best lambda hyperparameter for the given training data. This finds a single model that is written into the .best
field of the ENLR
structure that will be created.
If fitType
= :path
, the regularization path for several values of the lambda hyperparameter is found for the given training data. This creates several models, which are written into the .path
field of the ENLR
structure that will be created, none of which is optimal, in the cross-validation sense, for the given training data.
If fitType
= :all
, both the above fits are performed and all fields of the ENLR
structure that will be created will be filled in.
If verbose
is true (default), information is printed in the REPL. This option is included to allow repeated calls to this function without crowding the REPL.
The ⏩
argument (true by default) is passed to the tsMap
function for projecting the matrices in 𝐏Tr
onto the tangent space and to the GLMNet.glmnetcv
function to run inner cross-validation to find the best
model using multi-threading.
The remaining optional keyword arguments are:
- the arguments passed to the GLMNet.glmnet function for fitting the models. Those are always used.
- the λSelMeth argument and the arguments passed to the GLMNet.glmnetcv function for finding the best lambda hyperparameter by cross-validation. Those are used only if fitType = :best or = :all.
Optional keyword arguments for fitting the model(s) using GLMNet
alpha
: the hyperparameter in [0, 1] that sets the trade-off of the elastic-net model. α=0 requests a pure ridge model and α=1 a pure lasso model. This defaults to 1.0, which specifies a lasso model, unless the input ENLR
model
has another value in the alpha
field, in which case this value is used. If argument alpha
is passed here, it will overwrite the alpha
field of the input model
.
weights
: a vector of weights for each matrix (or feature vectors) of the same size as yTr
. It defaults to 1 for all matrices.
intercept
: whether to fit an intercept term. The intercept is always unpenalized. Defaults to true.
standardize
: if true (default), GLMNet standardizes the predictors (presumably this amounts to transforming them to unit variance) so that they are in the same units. This is a common choice for regularized regression models.
penalty_factor
: a vector of length n(n+1)/2, where n is the dimension of the original PD matrices on which the model is applied, of penalties for each predictor in the tangent vectors. This defaults to all ones, which weights each predictor equally. To specify that a predictor should be unpenalized, set the corresponding entry to zero.
constraints
: an [n(n+1)/2] x 2 matrix specifying lower bounds (first column) and upper bounds (second column) on each predictor. By default, this is [-Inf Inf] for each predictor (each element of tangent vectors).
offsets
: see documentation of original GLMNet package 🎓.
dfmax
: The maximum number of predictors in the largest model.
pmax
: The maximum number of predictors in any model.
nlambda
: The number of values of λ along the path to consider.
lambda_min_ratio
: The smallest λ value to consider, as a ratio of the value of λ that gives the null model (i.e., the model with only an intercept). If the number of observations exceeds the number of variables, this defaults to 0.0001, otherwise 0.01.
lambda
: The λ values to consider for fitting. By default, this is determined from nlambda
and lambda_min_ratio
.
tol
: the convergence criterion for both the computation of a mean for projecting onto the tangent space (if the metric requires an iterative algorithm) and for the GLMNet fitting algorithm. Defaults to 1e-5. In order to speed up computations, you may try to set a larger (less strict) tol
; the convergence will be faster but coarser, with a possible drop of classification accuracy, depending on the signal-to-noise ratio of the input features.
maxit
: The maximum number of iterations of the cyclic coordinate descent algorithm. If convergence is not achieved, a warning is returned.
algorithm
: the algorithm used to find the regularization path. Possible values are :newtonraphson
(default) and :modifiednewtonraphson
.
For further information on those arguments, refer to the resources on the GLMNet package 🎓.
Optional Keyword arguments for finding the best model by cv
If λSelMeth
= :sd1
(default), the best model is defined as the one allowing the highest cvλ.meanloss
within one standard deviation of the minimum, otherwise it is defined as the one allowing the minimum cvλ.meanloss
. Note that in selecting a model, the model with only the intercept term, if it exists, is ignored. See ENLRmodel
for a description of the .cvλ
field of the model structure.
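The selection rule described above can be sketched as follows. `selectModel` is a hypothetical helper and the loss values are made up. Using the standard deviation of the loss at the minimum as the width of the band is an assumption, and the sketch does not handle the exclusion of the intercept-only model.

```julia
# Hypothetical helper: pick a model index from cross-validated losses.
# meanloss[i] and stdloss[i] are the mean and standard deviation of the
# loss of model i across the folds.
function selectModel(meanloss::Vector{Float64}, stdloss::Vector{Float64};
                     method::Symbol = :sd1)
    iMin = argmin(meanloss)
    method == :sd1 || return iMin               # plain minimum-loss selection
    threshold = meanloss[iMin] + stdloss[iMin]  # one standard deviation above the minimum
    candidates = findall(<=(threshold), meanloss)
    # among the candidates, take the one with the highest mean loss
    return candidates[argmax(meanloss[candidates])]
end

meanloss = [0.90, 0.62, 0.50, 0.47, 0.49, 0.55]
stdloss  = [0.05, 0.04, 0.04, 0.04, 0.05, 0.06]
iSd1 = selectModel(meanloss, stdloss)                  # 3
iMin = selectModel(meanloss, stdloss; method = :min)   # 4
```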
Arguments nfolds
and folds
are passed to the GLMNet.glmnetcv
function along with the ⏩
argument. Please refer to the resources on GLMNet for details 🎓.
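To see what the default values of nfolds and folds in the signature above produce, the expressions can be run on their own. The number of observations below is hypothetical; the two expressions are copied from the default arguments, with `length(yTr)` replaced by `nObs`.

```julia
using Random

nObs   = 23                       # hypothetical number of training observations
nfolds = min(10, div(nObs, 3))    # 7
n, r   = divrem(nObs, nfolds)     # n = 3, r = 2
# each fold index in 1:nfolds appears n times; the first r folds get one more
folds  = shuffle!([repeat(1:nfolds, outer = n); 1:r])
```

The result assigns every observation to exactly one fold, with fold sizes differing by at most one.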
See: notation & nomenclature, the ℍVector type.
Tutorial: Example using the ENLR model.
Examples
using PosDefManifoldML, PosDefManifold
# generate some data
PTr, PTe, yTr, yTe=gen2ClassData(10, 30, 40, 60, 80, 0.1)
# Fit an ENLR lasso model and find the best model by cross-validation:
m=fit(ENLR(), PTr, yTr)
# ... balancing the weights for tangent space mapping
m=fit(ENLR(), PTr, yTr; w=tsWeights(yTr))
# ... using the log-Euclidean metric for tangent space projection
m=fit(ENLR(logEuclidean), PTr, yTr)
# Fit an ENLR ridge model and find the best model by cv:
m=fit(ENLR(Fisher), PTr, yTr; alpha=0)
# Fit an ENLR elastic-net model (α=0.9) and find the best model by cv:
m=fit(ENLR(Fisher), PTr, yTr; alpha=0.9)
# Fit an ENLR lasso model and its regularization path:
m=fit(ENLR(), PTr, yTr; fitType=:path)
# Fit an ENLR lasso model, its regularization path
# and the best model found by cv:
m=fit(ENLR(), PTr, yTr; fitType=:all)
function fit(model :: SVMmodel,
𝐏Tr :: Union{ℍVector, Matrix{Float64}},
yTr :: IntVector=[];
# parameters for projection onto the tangent space
w :: Union{Symbol, Tuple, Vector} = [],
meanISR :: Union{ℍ, Nothing} = nothing,
meanInit :: Union{ℍ, Nothing} = nothing,
vecRange :: UnitRange = 𝐏Tr isa ℍVector ? (1:size(𝐏Tr[1], 2)) : (1:size(𝐏Tr, 2)),
# SVM parameters
svmType :: Type = SVC,
kernel :: Kernel.KERNEL = RadialBasis,
epsilon :: Float64 = 0.1,
cost :: Float64 = 1.0,
gamma :: Float64 = 1/_getDim(𝐏Tr, vecRange),
degree :: Int64 = 3,
coef0 :: Float64 = 0.,
nu :: Float64 = 0.5,
shrinking :: Bool = true,
probability :: Bool = false,
weights :: Union{Dict{Int, Float64}, Nothing} = nothing,
cachesize :: Float64 = 200.0,
# Generic and common parameters
tol :: Real = 1e-5,
rescale :: Tuple = (-1, 1),
verbose :: Bool = true,
⏩ :: Bool = true)
Create and fit an SVM
machine learning model, with training data 𝐏Tr
, of type ℍVector, and corresponding labels yTr
, of type IntVector. The label vector can be omitted if the svmType
is OneClassSVM
(see SVM
). Return the fitted model as an instance of the SVM
structure.
As for all ML models acting in the tangent space, fitting an SVM model involves computing a mean of all the matrices in 𝐏Tr
, mapping all matrices onto the tangent space after parallel transporting them at the identity matrix and vectorizing them using the vecP operation. Once this is done, the support-vector machine is fitted.
Arguments w
, meanISR
, meanInit
and vecRange
allow to tune the projection onto the tangent space. See the documentation of the fit
function for the ENLR model here above for their meaning.
svmType
and kernel
allow choosing among several available SVM models. See the documentation of the SVM
structure.
epsilon
, with default 0.1, is the epsilon in the loss function of the epsilonSVR
SVM model.
cost
, with default 1.0, is the cost parameter C of SVC
, epsilonSVR
, and nuSVR
SVM models.
gamma
, defaulting to 1 divided by the length of the feature vectors, is the γ parameter for RadialBasis
, Polynomial
and Sigmoid
kernels.
degree
, with default 3, is the degree for Polynomial
kernels.
coef0
, zero by default, is a parameter for the Sigmoid
and Polynomial
kernel.
nu
, with default 0.5, is the parameter ν of nuSVC
, OneClassSVM
, and nuSVR
SVM models. It should be in the interval (0, 1].
shrinking
, true by default, sets whether to use the shrinking heuristics.
probability
, false by default, sets whether to train an SVC
or SVR
model allowing probability estimates.
If a Dict{Int, Float64}
is passed as weights
argument, it will be used to give weights to the classes. By default it is equal to nothing
, implying equal weights to all classes.
cachesize
for the kernel, 200.0 by default (in MB), can be increased for very large problems.
tol
is the convergence criterion for both the computation of a mean for projecting onto the tangent space (if the metric requires an iterative algorithm) and for the LIBSVM fitting algorithm. Defaults to 1e-5.
rescale
is a 2-tuple of the lower and upper limit to rescale the feature vectors within these limits. The default is (-1, 1), since tangent vectors of PD matrices have positive and negative elements. If 𝐏Tr
is a feature matrix and the features are only positive, use (0, 1) instead. In order not to rescale the feature vectors, use ().
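A minimal sketch of what rescaling within given limits may look like is shown below. `rescaleFeatures` is a hypothetical helper; that the rescaling is a min-max transformation applied feature by feature (column by column) is an assumption, and the empty tuple option is not handled.

```julia
# Hypothetical helper: min-max rescaling of each feature (column) of X
# to the interval given by `limits`.
function rescaleFeatures(X::Matrix{Float64}, limits::Tuple{Real, Real} = (-1, 1))
    lo, hi = limits
    Y = similar(X)
    for j in axes(X, 2)
        mn, mx = extrema(view(X, :, j))
        if mx == mn
            Y[:, j] .= (lo + hi) / 2      # constant feature: map to the midpoint
        else
            Y[:, j] .= lo .+ (hi - lo) .* (X[:, j] .- mn) ./ (mx - mn)
        end
    end
    return Y
end

X = [1.0 10.0; 2.0 30.0; 3.0 20.0]   # 3 feature vectors (rows), 2 features
Y = rescaleFeatures(X)               # each column now spans [-1, 1]
Z = rescaleFeatures(X, (0, 1))       # each column now spans [0, 1]
```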
If verbose
is true (default), information is printed in the REPL. This option is included to allow repeated calls to this function without crowding the REPL. It may not work properly in a multithreaded context (see ⏩
argument here below).
The ⏩
argument (true by default) is passed to the tsMap
function for projecting the matrices in 𝐏Tr
onto the tangent space and to the LIBSVM function that performs the fit in order to run them in multi-threaded mode.
For further information on the LIBSVM arguments, refer to the resources on the LIBSVM package 🎓.
See: notation & nomenclature, the ℍVector type.
Tutorial: Example using SVM models.
Examples
using PosDefManifoldML, PosDefManifold
# generate some data
PTr, PTe, yTr, yTe=gen2ClassData(10, 30, 40, 60, 80, 0.1);
# Fit an SVC SVM model:
m=fit(SVM(), PTr, yTr)
# ... balancing the weights for tangent space mapping
m=fit(SVM(), PTr, yTr; w=:b)
# ... using the log-Euclidean metric for tangent space projection
m=fit(SVM(logEuclidean), PTr, yTr)
# ... using the linear kernel
m=fit(SVM(logEuclidean), PTr, yTr, kernel=Linear)
# or
m=fit(SVM(logEuclidean; kernel=Linear), PTr, yTr)
# ... using the Nu-Support Vector Classification
m=fit(SVM(logEuclidean), PTr, yTr, kernel=Linear, svmType=NuSVC)
# or
m=fit(SVM(logEuclidean; kernel=Linear, svmType=NuSVC), PTr, yTr)
# N.B. all other keyword arguments must be passed to the fit function
# and not to the SVM constructor.
StatsBase.predict
— Function
function predict(model :: MDMmodel,
𝐏Te :: ℍVector,
what :: Symbol = :labels;
verbose :: Bool = true,
⏩ :: Bool = true)
Given an MDM
model
trained (fitted) on z classes and a testing set of k positive definite matrices 𝐏Te
of type ℍVector,
if what
is :labels
or :l
(default), return the predicted class labels for each matrix in 𝐏Te
, as an IntVector. For MDM models, the predicted class 'label' of an unlabeled matrix is the serial number of the class whose mean is the closest to the matrix (minimum distance to mean). The labels are '1' for class 1, '2' for class 2, etc;
if what
is :probabilities
or :p
, return the predicted probabilities for each matrix in 𝐏Te
to belong to each of the z classes, as a k-vector of z vectors holding reals in [0, 1] (probabilities). The 'probabilities' are obtained by passing to a softmax function minus the squared distances of each unlabeled matrix to all class means;
if what
is :f
or :functions
, return the output function of the model. The ratio of the squared distance to all classes to their geometric mean gives the 'functions'.
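The three kinds of output described above can be illustrated on made-up squared distances, without calling the package. The numbers and the helper `softmax` are assumptions introduced only for this sketch.

```julia
using Statistics

# made-up squared distances of k = 2 test matrices to the z = 3 class means
d2 = [[0.2, 1.5, 0.9],
      [2.0, 0.4, 1.1]]

# numerically stable softmax
softmax(x) = exp.(x .- maximum(x)) ./ sum(exp.(x .- maximum(x)))

labels = [argmin(d) for d in d2]                  # class with the closest mean
probs  = [softmax(-d) for d in d2]                # softmax of minus the squared distances
funcs  = [d ./ exp(mean(log.(d))) for d in d2]    # ratio to the geometric mean
```

The smallest distance yields the highest 'probability', so the label obtained with `argmin` on the distances agrees with `argmax` on the probabilities.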
If verbose
is true (default), information is printed in the REPL. This option is included to allow repeated calls to this function without crowding the REPL.
If ⏩
is true (default), the computation of distances is multi-threaded.
See: notation & nomenclature, the ℍVector type.
See also: fit
, cvAcc
, predictErr
.
Examples
using PosDefManifoldML, PosDefManifold
# generate some data
PTr, PTe, yTr, yTe=gen2ClassData(10, 30, 40, 60, 80)
# create and fit an MDM model
m=fit(MDM(Fisher), PTr, yTr)
# predict labels
yPred=predict(m, PTe, :l)
# prediction error
predErr=predictErr(yTe, yPred)
# predict probabilities
predict(m, PTe, :p)
# output functions
predict(m, PTe, :f)
function predict(model :: ENLRmodel,
𝐏Te :: Union{ℍVector, Matrix{Float64}},
what :: Symbol = :labels,
fitType :: Symbol = :best,
onWhich :: Int = Int(fitType==:best);
transfer :: Union{ℍ, Nothing} = nothing,
verbose :: Bool = true,
⏩ :: Bool = true)
Given an ENLR
model
trained (fitted) on 2 classes and a testing set of k positive definite matrices 𝐏Te
of type ℍVector,
if what
is :labels
or :l
(default), return the predicted class labels for each matrix in 𝐏Te
, as an IntVector. Those labels are '1' for class 1 and '2' for class 2;
if what
is :probabilities
or :p
, return the predicted probabilities for each matrix in 𝐏Te
to belong to each class, as a k-vector of z vectors holding reals in [0, 1] (probabilities), where z=2 is the number of classes. The 'probabilities' are obtained by passing to a softmax function the output of the ENLR model and zero;
if what
is :f
or :functions
, return the output function of the model, which is the raw output of the ENLR model.
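Passing the model output and zero to a softmax function amounts to applying the logistic function to the output, as the following sketch shows. The value of `f` is made up, and which of the two elements corresponds to which class is an assumption not settled by the text above.

```julia
# numerically stable softmax
softmax(x) = exp.(x .- maximum(x)) ./ sum(exp.(x .- maximum(x)))

f = 1.2                       # made-up raw output of a fitted model
p = softmax([0.0, f])         # two 'probabilities' summing up to 1

# the second element coincides with the logistic function of f
logistic = 1 / (1 + exp(-f))
```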
If fitType
= :best
(default), the best model that has been found by cross-validation is used for prediction.
If fitType
= :path
,
- if onWhich is a valid serial number for a model in the model.path, then this model is used for prediction,
- if onWhich is zero, all models in the model.path will be used for predictions, thus the output will be multiplied by the number of models in model.path.
Argument onWhich
has no effect if fitType
= :best
.
Optional keyword argument transfer
can be used to specify the principal inverse square root (ISR) of a new mean to be used as base point for projecting the matrices in 𝐏Te
onto the tangent space. By default transfer
is equal to nothing, implying that the base point will be the mean used to fit the model. Passing a new mean ISR allows the adaptation first described in Barachant et al.(2013). Typically transfer
is the ISR of the mean of the matrices in 𝐏Te
or of a subset of them. Notice that this actually performs transfer learning by parallel transporting both the training and test data to the identity matrix as defined in Zanini et al.(2018) and later taken up in Rodrigues et al.(2019)🎓.
If verbose
is true (default), information is printed in the REPL. This option is included to allow repeated calls to this function without crowding the REPL.
If ⏩ = true (default) and 𝐏Te
is an ℍVector type, the projection onto the tangent space is multi-threaded.
See: notation & nomenclature, the ℍVector type.
See also: fit
, cvAcc
, predictErr
.
Examples
using PosDefManifoldML, PosDefManifold
# generate some data
PTr, PTe, yTr, yTe=gen2ClassData(10, 30, 40, 60, 80)
# fit an ENLR lasso model and find the best model by cv
m=fit(ENLR(Fisher), PTr, yTr)
# predict labels from the best model
yPred=predict(m, PTe, :l)
# prediction error
predErr=predictErr(yTe, yPred)
# predict probabilities from the best model
predict(m, PTe, :p)
# output functions from the best model
predict(m, PTe, :f)
# fit a regularization path for an ENLR lasso model
m=fit(ENLR(Fisher), PTr, yTr; fitType=:path)
# predict labels using a specific model
yPred=predict(m, PTe, :l, :path, 10)
# predict labels for all models
yPred=predict(m, PTe, :l, :path, 0)
# prediction error for all models
predErr=[predictErr(yTe, yPred[:, i]) for i=1:size(yPred, 2)]
# predict probabilities from a specific model
predict(m, PTe, :p, :path, 12)
# predict probabilities from all models
predict(m, PTe, :p, :path, 0)
# output functions from specific model
predict(m, PTe, :f, :path, 3)
# output functions for all models
predict(m, PTe, :f, :path, 0)
function predict(model :: SVMmodel,
𝐏Te :: Union{ℍVector, Matrix{Float64}},
what :: Symbol = :labels;
transfer :: Union{ℍ, Nothing} = nothing,
verbose :: Bool = true,
⏩ :: Bool = true)
Given an SVM
model
trained (fitted) on 2 classes and a testing set of k positive definite matrices 𝐏Te
of type ℍVector, return the predictions specified by argument what.
For the meaning of arguments what
, transfer
and verbose
, see the documentation of the predict
function for the ENLR model.
If ⏩ = true (default) and 𝐏Te
is an ℍVector type, the projection onto the tangent space will be multi-threaded. Also, the prediction of the LIBSVM function will be multi-threaded.
See: notation & nomenclature, the ℍVector type.
See also: fit
, cvAcc
, predictErr
.
Examples
using PosDefManifoldML, PosDefManifold
# generate some data
PTr, PTe, yTr, yTe=gen2ClassData(10, 30, 40, 60, 80)
# fit an SVM model
m=fit(SVM(Fisher), PTr, yTr)
# predict labels
yPred=predict(m, PTe, :l)
# prediction error
predErr=predictErr(yTe, yPred)
# predict probabilities
predict(m, PTe, :p)
# output functions
predict(m, PTe, :f)
PosDefManifoldML.cvAcc
— Function
function cvAcc(model :: MLmodel,
𝐏Tr :: ℍVector,
yTr :: IntVector;
nFolds :: Int = min(10, length(yTr)÷3),
scoring :: Symbol = :b,
shuffle :: Bool = false,
verbose :: Bool = true,
outModels :: Bool = false,
⏩ :: Bool = true,
fitArgs...)
Cross-validation accuracy for a machine learning model
: given an ℍVector 𝐏Tr
holding k Hermitian matrices, an IntVector yTr
holding the k labels for these matrices and the number of folds nFolds
, return a CVacc
structure.
optional keyword arguments
nFolds
by default is set to the minimum between 10 and the number of observations ÷ 3 (integer division).
If scoring
=:b (default) the balanced accuracy is computed. Any other value will make the function return the regular accuracy. Balanced accuracy is to be preferred for unbalanced classes. For balanced classes the balanced accuracy reduces to the regular accuracy; therefore, the only reason to use the regular accuracy is to avoid a few unnecessary computations when the classes are balanced.
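The difference between the two scores can be seen on a made-up confusion matrix. The sketch below does not use the package; it assumes that rows hold the true classes and columns the predicted classes, and it uses the common definition of balanced accuracy as the mean of the per-class recalls.

```julia
using LinearAlgebra

# made-up confusion matrix of counts for unbalanced classes (50 vs. 10 observations)
C = [45 5;
      6 4]

accuracy(C) = tr(C) / sum(C)
# mean of the per-class recalls (diagonal elements over the row sums)
balancedAccuracy(C) = sum(diag(C) ./ vec(sum(C, dims = 2))) / size(C, 1)

acc  = accuracy(C)           # 49/60 ≈ 0.817
bAcc = balancedAccuracy(C)   # (45/50 + 4/10)/2 = 0.65

# with balanced classes the two scores coincide
Cb = [8 2; 3 7]
same = accuracy(Cb) ≈ balancedAccuracy(Cb)
```

Here the regular accuracy looks good only because the larger class dominates, while the balanced accuracy exposes the poor recall on the smaller class.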
For the meaning of the shuffle
argument (false by default), see function cvSetup
, to which this argument is passed.
If verbose
is true (default), information is printed in the REPL. This option is included to allow repeated calls to this function without crowding the REPL.
If outModels
is true return a 2-tuple holding a CVacc
structure and a nFolds
-vector of the model fitted for each fold, otherwise (default), return only a CVacc
structure.
If ⏩
is true (default), the folds and some other computations are multi-threaded. Set it to false if there are problems in running this function.
fitArgs
are optional keyword arguments that are passed to the fit
function called for each fold of the cross-validation. For each machine learning model, all optional keyword arguments of their fit method are eligible to be passed here; however, the arguments listed in the following table for each model should not be passed. Note that if they are passed, they will be disabled:
| MDM/MDMF | ENLR | SVM |
|---|---|---|
| verbose | verbose | verbose |
| ⏩ | ⏩ | ⏩ |
| | meanISR | meanISR |
| | meanInit | meanInit |
| | fitType | |
| | offsets | |
| | lambda | |
| | folds | |
Also, if you pass a w
(weights for tangent space projection) argument, do not pass a vector of weights, just pass a symbol, e.g., w=:b
for balancing weights.
See: notation & nomenclature, the ℍVector type.
Examples
using PosDefManifoldML, PosDefManifold
# generate some data
PTr, PTe, yTr, yTe=gen2ClassData(10, 30, 40, 60, 80)
# perform 10-fold cross-validation using the minimum distance to mean classifier
cv=cvAcc(MDM(Fisher), PTr, yTr)
# ...using the lasso logistic regression classifier
cv=cvAcc(ENLR(Fisher), PTr, yTr)
# ...using the support-vector machine classifier
cv=cvAcc(SVM(Fisher), PTr, yTr)
# ...With a Polynomial kernel of order 3 (default)
cv=cvAcc(SVM(Fisher), PTr, yTr; kernel=Polynomial)
# perform 8-fold cross-validation instead
# (and see that you can go pretty fast if your PC has 8 threads)
cv=cvAcc(SVM(Fisher), PTr, yTr; nFolds=8)
# ...balance the weights for tangent space projection
cv=cvAcc(ENLR(Fisher), PTr, yTr; nFolds=8, w=:b)
# perform another cross-validation shuffling the folds
cv=cvAcc(ENLR(Fisher), PTr, yTr; shuffle=true, nFolds=8, w=:b)
PosDefManifoldML.cvSetup
— Function
function cvSetup(k :: Int,
nCV :: Int;
shuffle :: Bool = false)
Given k
elements and a parameter nCV
, a nCV-fold cross-validation is obtained defining nCV
permutations of k elements in nTest=k÷nCV (integer division) elements for the test and k-nTest elements for the training, in such a way that each element is represented in only one permutation.
Said differently, given a length k
and the number of desired cross-validations nCV
, this function generates indices from the sequence of natural numbers 1,..,k to obtain all nCV-fold cross-validation sets. Specifically, it generates nCV
vectors of indices for generating test sets and nCV
vectors of indices for generating training sets.
If optional keyword argument shuffle
is true, the sequence of natural numbers 1,..,k is shuffled before running the function, thus in this case two successive runs of this function will give different cross-validation sets, hence different accuracy scores. By default shuffle
is false, so as to allow exactly the same result in successive runs. Note that no random initialization for the shuffling is provided, so as to allow the replication of the same random sequences starting again the random generation from scratch.
This function is used in cvAcc
. It constitutes the fundamental basis to implement customized cross-validation procedures.
Return the 2-tuple with:
- a vector of nCV vectors holding the indices for the training sets,
- a vector of nCV vectors holding the indices for the corresponding test sets.
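The partitioning described above can be sketched as follows. `cvSetupSketch` is a hypothetical re-implementation written for illustration, not the package's code; assigning the leftover elements to the last fold when k is not a multiple of nCV is inferred from the unshuffled examples in this section.

```julia
using Random

# Hypothetical sketch: return (training indices, test indices) for nCV folds.
function cvSetupSketch(k::Int, nCV::Int; shuffle::Bool = false)
    nCV <= k || throw(ArgumentError("nCV must not exceed k"))
    nTest = k ÷ nCV
    idx = shuffle ? randperm(k) : collect(1:k)
    test = Vector{Vector{Int}}(undef, nCV)
    for f in 1:nCV
        from = (f - 1) * nTest + 1
        to   = f == nCV ? k : f * nTest   # the last fold takes the remainder
        test[f] = idx[from:to]
    end
    # the training set of each fold is everything not in its test set
    train = [setdiff(idx, t) for t in test]
    return train, test
end

train, test = cvSetupSketch(10, 3)
# test  == [[1, 2, 3], [4, 5, 6], [7, 8, 9, 10]]
# train == [[4, 5, 6, 7, 8, 9, 10], [1, 2, 3, 7, 8, 9, 10], [1, 2, 3, 4, 5, 6]]
```

A customized cross-validation then amounts to looping over the folds, fitting on `train[f]` and predicting on `test[f]`.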
Examples
using PosDefManifoldML, PosDefManifold
cvSetup(10, 2)
# return:
# (Array{Int64,1}[[6, 7, 8, 9, 10], [1, 2, 3, 4, 5]],
# Array{Int64,1}[[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
cvSetup(10, 2, shuffle=true)
# return:
# (Array{Int64,1}[[5, 4, 6, 1, 9], [3, 7, 8, 2, 10]],
# Array{Int64,1}[[3, 7, 8, 2, 10], [5, 4, 6, 1, 9]])
cvSetup(10, 3)
# return:
# (Array{Int64,1}[[4, 5, 6, 7, 8, 9, 10], [1, 2, 3, 7, 8, 9, 10], [1, 2, 3, 4, 5, 6]],
# Array{Int64,1}[[1, 2, 3], [4, 5, 6], [7, 8, 9, 10]])