mdm.jl

This unit implements the Riemannian MDM (Minimum Distance to Mean) classifier for the manifold of positive definite (PD) matrices, either real (symmetric PD) or complex (Hermitian PD). The MDM is a simple yet efficient, deterministic and parameter-free classifier acting directly on the manifold of positive definite matrices (Barachant et al., 2012; Congedo et al., 2017a 🎓): given a number of PD matrices representing class means, the MDM classifies an unknown datum (also a PD matrix) as belonging to the class whose mean is the closest to the datum. The process is illustrated in the upper part of this figure.

The MDM classifier involves only two concepts: a distance function for two PD matrices and a mean (center of mass or barycenter) for a number of them. Both are defined for any given metric, specified as an instance of the Metric enumerated type declared in PosDefManifold.
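To make these two ingredients concrete, here is a minimal sketch of nearest-mean classification under the log-Euclidean metric, chosen only because its mean and distance have closed forms. It uses Julia's LinearAlgebra standard library alone; the helper names (`spfun`, `logeuclid_mean`, `logeuclid_dist2`, `nearest_mean`) and the toy data are hypothetical and are not part of PosDefManifoldML.

```julia
using LinearAlgebra

# Apply a scalar function to the eigenvalues of a Hermitian matrix (hypothetical helper).
function spfun(f, P)
    F = eigen(Hermitian(P))
    return Hermitian(F.vectors * Diagonal(f.(F.values)) * F.vectors')
end

# Log-Euclidean mean: exp of the arithmetic mean of the matrix logarithms.
logeuclid_mean(Ps) = spfun(exp, sum(spfun(log, P) for P in Ps) / length(Ps))

# Squared log-Euclidean distance: squared Frobenius norm of log(P) - log(Q).
logeuclid_dist2(P, Q) = norm(spfun(log, P) - spfun(log, Q))^2

# Index of the class mean closest to the datum P.
nearest_mean(means, P) = argmin([logeuclid_dist2(M, P) for M in means])

# Two toy classes of 2x2 PD matrices (made-up data for illustration).
class1 = [Hermitian([1.0 0.1; 0.1 1.2]), Hermitian([1.1 0.0; 0.0 0.9])]
class2 = [Hermitian([5.0 0.5; 0.5 4.0]), Hermitian([4.5 0.2; 0.2 5.5])]
means = [logeuclid_mean(class1), logeuclid_mean(class2)]

# Classify an unseen PD matrix as the class of its nearest mean.
label = nearest_mean(means, Hermitian([4.8 0.3; 0.3 4.9]))
```

Swapping in another metric amounts to replacing the mean and distance functions; the nearest-mean rule itself stays the same.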

Currently supported metrics are:

| metric (distance) | mean estimation | known also as |
|:--|:--|:--|
| Euclidean | arithmetic | Arithmetic |
| invEuclidean | harmonic | |
| ChoEuclidean | Cholesky Euclidean | |
| logEuclidean | log-Euclidean | |
| logCholesky | log-Cholesky | |
| Fisher | Fisher | Cartan, Karcher, Pusz-Woronowicz, Affine-Invariant, ... |
| logdet0 | logDet | S, α, Bhattacharyya, Jensen, ... |
| Jeffrey | Jeffrey | symmetrized Kullback-Leibler |
| Wasserstein | Wasserstein | Bures, Hellinger, optimal transport, ... |

Do not use the Von Neumann metric, which is also supported in PosDefManifold, since it does not allow a definition of mean. See here for details on the metrics. In order to use these metrics you need to install the PosDefManifold package.

The fit, predict and crval functions for the MDM model are documented in the cv.jl unit, since they are homogeneous across all machine learning models. This unit documents the MDMmodel abstract type, the MDM structure and the following functions, which you will typically not need to call directly but which are provided to facilitate low-level operations with MDM classifiers:

| function | description |
|:--|:--|
| barycenter | compute the barycenter of positive definite matrices for fitting the MDM model |
| distances | compute the distances of a matrix set to a set of means |

PosDefManifoldML.MDM - Type
mutable struct MDM <: MDMmodel
    metric  :: Metric = Fisher;
    pipeline :: Pipeline
    featDim :: Int
    means   :: ℍVector
    imeans  :: ℍVector
end

MDM machine learning models are encapsulated in this mutable structure. MDM models have five fields: .metric, .pipeline, .featDim, .means and .imeans.

The field metric, of type Metric, is to be specified by the user. It is the metric that will be adopted to compute the class means and the distances to the mean.

None of the other fields corresponds to an argument passed to the default constructor. Instead, they are filled in when a model is created and fitted by the fit function:

The field pipeline, of type Pipeline, holds an optional sequence of data pre-conditioners to be applied to the data. The pipeline is learnt when an ML model is fitted (see fit) and stored in the model. If the pipeline is fitted, it is automatically applied to the data at each call of the predict function.

The field featDim is the dimension of the manifold in which the model acts. This is given by n(n+1)/2, where n is the dimension of the PD matrices. This field is not to be specified by the user; instead, it is computed when the MDM model is fitted using the fit function and is accessible only thereafter.

The field means is an ℍVector holding the class means, i.e., one mean for each class. This field is not to be specified by the user; instead, the means are computed when the MDM model is fitted using the fit function and are accessible only thereafter.

The field imeans is an ℍVector holding the inverses of the matrices in means. This field is also not to be specified by the user; it is computed when the model is fitted and is accessible only thereafter. It is used to optimize the computation of distances if the model is fitted using the Fisher metric (default).
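As a rough illustration of the featDim and imeans fields, the sketch below computes n(n+1)/2 for a given matrix size and shows one way a precomputed inverse can be reused for the Fisher distance, using the fact that the squared Fisher distance between M and P is the sum of the squared logarithms of the eigenvalues of inv(M)·P. It relies only on LinearAlgebra; the function names and data are hypothetical, and whether the package evaluates the distance this way internally is an assumption.

```julia
using LinearAlgebra

# Dimension of the manifold of n×n symmetric PD matrices.
featdim(n) = n * (n + 1) ÷ 2

# Squared Fisher distance from a mean M to P, given the precomputed inverse of M.
# The eigenvalues of inv(M) * P are real and positive when M and P are PD;
# real.() only discards a possible zero imaginary part.
fisher_dist2(invM, P) = sum(abs2, log.(real.(eigvals(invM * P))))

M = [2.0 0.5; 0.5 1.0]      # a class mean (made-up values)
invM = inv(M)               # computed once, e.g. when the model is fitted
P = [1.5 0.2; 0.2 0.8]      # a datum to be compared with the mean

d2 = fisher_dist2(invM, P)

# Reference value from the generalized eigenvalue problem P v = λ M v.
d2_ref = sum(abs2, log.(eigvals(Symmetric(P), Symmetric(M))))
```

The point of storing the inverses is that the inversion is paid once per class mean rather than once per distance.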

Examples:

using PosDefManifoldML, PosDefManifold

# Create an empty model
m = MDM(Fisher)

# Since the Fisher metric is the default metric,
# this is equivalent to
m = MDM()

Note that in general you need to invoke these constructors only when an empty MDM model is needed as an argument to a function; otherwise you can more simply create and fit an MDM model using the fit function.

PosDefManifoldML.barycenter - Function
function barycenter(metric :: Metric, 𝐏:: ℍVector;
              w        :: Vector = [],
              ✓w       :: Bool    = true,
              meanInit :: Union{ℍ, Nothing} = nothing,
              tol      :: Real   = 0.,
              ⏩      :: Bool    = true)

Typically, you will not need this function as it is called by the fit function.

Given a metric of type Metric, an ℍVector of Hermitian matrices $𝐏$ and an optional non-negative real weights vector w, return the (weighted) mean of the matrices in $𝐏$. This is used to fit MDM models.

This function calls the appropriate mean function of package PosDefManifold, depending on the chosen metric, and, if the mean is found by an iterative algorithm, checks that the algorithm has converged.

See method (3) of the mean function, to which they are passed, for the meaning of the optional keyword arguments w, ✓w, meanInit, tol and ⏩.

The returned mean is flagged by Julia as an Hermitian matrix (see LinearAlgebra).
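For metrics such as Fisher, the mean has no closed form and is found iteratively. The following is a minimal sketch of a standard fixed-point (gradient descent) iteration for the weighted Fisher mean, with a tolerance and a convergence flag. It uses LinearAlgebra only; its names (`spfun`, `fisher_mean`), defaults and data are hypothetical, and it is not the algorithm implemented in PosDefManifold, which may differ in detail.

```julia
using LinearAlgebra

# Apply a scalar function to the eigenvalues of a Hermitian matrix (hypothetical helper).
function spfun(f, P)
    F = eigen(Hermitian(P))
    return Hermitian(F.vectors * Diagonal(f.(F.values)) * F.vectors')
end

# Weighted Fisher mean by fixed-point iteration:
#   G ← G^(1/2) exp(Σ wᵢ log(G^(-1/2) Pᵢ G^(-1/2))) G^(1/2)
# Returns the estimate and a flag telling whether the tolerance was reached.
function fisher_mean(Ps; w = fill(1 / length(Ps), length(Ps)), tol = 1e-9, maxiter = 200)
    G = Hermitian(sum(w[i] * Ps[i] for i in eachindex(Ps)))  # start from the arithmetic mean
    for _ in 1:maxiter
        Gh = spfun(sqrt, G)                 # G^(1/2)
        Gih = spfun(x -> 1 / sqrt(x), G)    # G^(-1/2)
        # Weighted average of the data mapped to the tangent space at G.
        S = sum(w[i] * spfun(log, Gih * Ps[i] * Gih) for i in eachindex(Ps))
        G = Hermitian(Gh * spfun(exp, S) * Gh)
        if norm(S) < tol
            return G, true
        end
    end
    return G, false
end

# Three made-up 2x2 PD matrices with equal weights.
Ps = [Hermitian([2.0 0.3; 0.3 1.0]), Hermitian([1.0 -0.2; -0.2 1.5]), Hermitian([1.5 0.0; 0.0 0.8])]
G, converged = fisher_mean(Ps)
```

Returning the convergence flag alongside the estimate lets the caller decide what to do when the iteration stops at `maxiter` without reaching the tolerance.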

PosDefManifoldML.distances - Function
function distances(metric :: Metric,
                      means  :: ℍVector,
                      𝐏      :: ℍVector;
                imeans  :: Union{ℍVector, Nothing} = nothing,
                scale   :: Bool = false,
                ⏩      :: Bool = true)

Typically, you will not need this function as it is called by the predict function.

Given an ℍVector $𝐏$ holding k Hermitian matrices and an ℍVector means holding z matrix means, return the squared distance between each matrix in $𝐏$ and each mean in means.

The squared distance is computed according to the chosen metric, of type Metric. See metrics for details on the supported distance functions.

The computation of distances is optimized for the Fisher metric if an ℍVector holding the inverse of the means in means is passed as optional keyword argument imeans. For other metrics this argument is ignored.

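If scale is true, the distances are divided by the size of the matrices in $𝐏$; note that the signature above declares false as the default. This can be useful to compare distances computed on manifolds with different dimensions. Scaling does not change which mean is closest to a given matrix, since all distances are divided by the same constant, but it is offered as good practice.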

If ⏩ is true, the distances are computed using multi-threading, unless the number of threads Julia is instructed to use is <2 or <3k.

The result is a z×k matrix of squared distances.
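To show how such a matrix of squared distances, with one row per mean and one column per datum, is typically consumed, the sketch below builds one with the Euclidean metric (squared Frobenius norm of the difference), optionally scales it, and picks the closest mean for each column. It is a LinearAlgebra-only illustration: `dist2_matrix` is a hypothetical name, the Euclidean metric is used only for brevity, and "size of the matrices" is interpreted here as n, which is an assumption.

```julia
using LinearAlgebra

# Matrix D with D[i, j] = squared Euclidean distance between means[i] and Ps[j].
# If scale is true, every entry is divided by n (assumed meaning of "size").
function dist2_matrix(means, Ps; scale = false)
    D = [norm(M - P)^2 for M in means, P in Ps]
    return scale ? D ./ size(Ps[1], 1) : D
end

# z = 2 class means and k = 3 data matrices (made-up values).
means = [Hermitian([1.0 0.0; 0.0 1.0]), Hermitian([4.0 0.0; 0.0 4.0])]
Ps = [Hermitian([1.2 0.1; 0.1 0.9]),
      Hermitian([3.5 0.2; 0.2 4.2]),
      Hermitian([0.8 0.0; 0.0 1.1])]

D = dist2_matrix(means, Ps)
Ds = dist2_matrix(means, Ps; scale = true)

# Predicted class of each matrix: row index of the smallest entry in its column.
labels = [argmin(D[:, j]) for j in axes(D, 2)]
labels_scaled = [argmin(Ds[:, j]) for j in axes(Ds, 2)]
```

Because scaling divides every entry by the same constant, the predicted labels are identical with and without it.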