# riemannianGeometry.jl

This is the fundamental unit of **PosDefManifold**. It contains functions for manipulating points in the Riemannian manifold of *Symmetric Positive Definite (SPD)* or *Hermitian Positive Definite (HPD)* matrices. In Julia those are `Hermitian`

matrices, see typecasting matrices.

The functions are divided in six categories:

Category | Output |
---|---|

1. Geodesic equations | interpolation, extrapolation, weighted mean of two matrices, ... |

2. Distances | length of geodesics |

3. Graphs and Laplacians | inter-distance matrices, spectral embedding, eigenmaps, ... |

4. Means | mid-points of geodesics, Fréchet means of several points, midrange,... |

5. Tangent Space operations | maps from the manifold to the tangent space and viceversa, parallel transport,... |

6. Procrustes problems | data matching, transfer learning (domain adaptation), ... |

⋅

## Geodesic equations

Function | Description |
---|---|

`geodesic` | Geodesic equations (weighted mean of two positive definite matrices) for any metric |

⋅

`PosDefManifold.geodesic`

— Function```
(1) geodesic(metric::Metric, P::ℍ{T}, Q::ℍ{T}, a::Real) where T<:RealOrComplex
(2) geodesic(metric::Metric, D::𝔻{S}, E::𝔻{S}, a::Real) where S<:Real
```

(1) Move along the geodesic from point $P$ to point $Q$ (two positive definite matrices) with *arclegth* $0<=a<=1$, using the specified metric, of type Metric::Enumerated type.

For all metrics,

- with $a=0$ we stay at $P$,
- with $a=1$ we move up to $Q$,
- with $a=1/2$ we move to the mid-point of $P$ and $Q$ (mean).

Using the Fisher metric, argument $a$ can be *any* real number, for instance:

- with $0<a<1$ we move toward $Q$ (
*attraction*), - with $a>1$ we move over and beyond $Q$ (
*extrapolation*), - with $a<0$ we move back away from Q (
*repulsion*).

$P$ and $Q$ must be flagged by julia as `Hermitian`

. See typecasting matrices.

The Fisher geodesic move is computed by the Cholesky-Schur algorithm given in Eq. 4.2 by Iannazzo(2016)🎓. If $Q=I$, the Fisher geodesic move is simply $P^a$ (no need to call this funtion).

For the logdet zero and Jeffrey metric no closed form expression for the geodesic is available to the best of authors' knowledge, so in this case the geodesic is found as the weighted mean using the `mean`

function. For the Von Neumann not even an expression for the mean is available, so in this case the geodesic is not provided and a *warning* is printed.

(2) Like in (1), but for two real positive definite diagonal matrices $D$ and $E$.

**Maths**

For points $P$, $Q$ and arclength $a$, letting $b=1-a$, the geodesic equations for the supported metrics are:

Metric | geodesic equation |
---|---|

Euclidean | $bP + aQ$ |

invEuclidean | $\big(bP^{-1} + aQ^{-1}\big)^{-1}$ |

ChoEuclidean | $TT^*$, where $T=bL_P + aL_Q$ |

logEuclidean | $\text{exp}\big(b\hspace{2pt}\text{log}(P) + a\hspace{2pt}\text{log}(Q)\big)$ |

logCholesky | $TT^*$, where $T=S_P+a(S_Q-S_P)+D_P\hspace{2pt}\text{exp}\big(a(\text{log}D_Q-\text{log}D_P)\big)$ |

Fisher | $P^{1/2} \big(P^{-1/2} Q P^{-1/2}\big)^a P^{1/2}$ |

logdet0 | uses weighted mean algorithm `logdet0Mean` |

Jeffrey | uses weighted mean `mean` |

VonNeumann | N.A. |

Wasserstein | $b^2P+a^2Q +ab\big[(PQ)^{1/2} +(QP)^{1/2}\big]$ |

**legend:** $L_X$, $S_X$ and $D_X$ are the Cholesky lower triangle of $X$, its strictly lower triangular part and diagonal part, respectively (hence, $S_X+D_X=L_X$, $L_XL_X^*=X$).

**See also**: `mean`

.

**Examples**

```
using PosDefManifold
P=randP(10)
Q=randP(10)
# Wasserstein mean
M=geodesic(Wasserstein, P, Q, 0.5)
# extrapolate suing the Fisher metric
E=geodesic(Fisher, P, Q, 2)
```

## Distances

Function | Description |
---|---|

`distanceSqr` , `distance²` | Squared distance between positive definite matrices |

`distance` | Distance between positive definite matrices |

⋅

`PosDefManifold.distanceSqr`

— Function```
(1) distanceSqr(metric::Metric, P::ℍ{T}) where T<:RealOrComplex
(2) distanceSqr(metric::Metric, P::ℍ{T}, Q::ℍ{T}) where T<:RealOrComplex
(3) distanceSqr(metric::Metric, D::𝔻{S}) where S<:Real
(4) distanceSqr(metric::Metric, D::𝔻{S}, E::𝔻{S}) where S<:Real
```

**alias**: `distance²`

(1) Return $δ^2(P, I)$, the *square of the distance* (or *divergence*) of positive definite matrix $P$ from the the identity matrix. See distance from the origin.

(2) Return $δ^2(P, Q)$, the *square of the distance* (or *divergence*) between two positive definite matrices $P$ and $Q$. See distance.

In both cases the distance function $δ$ is induced by the argument `metric`

of type Metric::Enumerated type.

$P$ in (1) and $P$, $Q$ in (2) must be flagged by julia as `Hermitian`

. See typecasting matrices.

(3) and (4) are specialized methods of (1) and (2), respectively, for real positive definite `Diagonal`

matrices. See ℍVector type and 𝔻Vector type.

**Maths**

For point $P$ the *squared distances from the identity* for the supported metrics are:

Metric | Squared Distance from the identity |
---|---|

Euclidean | $∥P-I∥^2$ |

invEuclidean | $∥P^{-1}-I∥^2$ |

ChoEuclidean | $∥L_P-I∥^2$ |

logEuclidean | $∥\textrm{log}P∥^2$ |

logCholesky | $∥S_P∥^2+∥\textrm{log}D_P∥^2$ |

Fisher | $∥\textrm{log}P∥^2$ |

logdet0 | $\textrm{logdet}\frac{1}{2}(P+I) - \frac{1}{2}\textrm{logdet}(P)$ |

Jeffrey | $\frac{1}{2}\textrm{tr}(P+P^{-1})-n$ |

VonNeumann | $\frac{1}{2}\textrm{tr}(P\textrm{log}P-\textrm{log}P)$ |

Wasserstein | $\textrm{tr}(P+I) -2\textrm{tr}(P^{1/2})$ |

For points $P$ and $Q$ their *squared distances* for the supported metrics are:

Metric | Squared Distance |
---|---|

Euclidean | $∥P-Q∥^2$ |

invEuclidean | $∥P^{-1}-Q^{-1}∥^2$ |

ChoEuclidean | $∥ L_P - L_Q ∥^2$ |

logEuclidean | $∥\textrm{log}P-\textrm{log}Q∥^2$ |

logCholesky | $∥S_P-S_Q∥^2+∥\textrm{log}D_P-\textrm{log}D_Q∥^2$ |

Fisher | $∥\textrm{log}(P^{-1/2}QP^{-1/2})∥^2$ |

logdet0 | $\textrm{logdet}\frac{1}{2}(P+Q) - \frac{1}{2}\textrm{logdet}(PQ)$ |

Jeffrey | $\frac{1}{2}\textrm{tr}(Q^{-1}P+P^{-1}Q)-n$ |

VonNeumann | $\frac{1}{2}\textrm{tr}(P\textrm{log}P-P\textrm{log}Q+Q\textrm{log}Q-Q\textrm{log}P)$ |

Wasserstein | $\textrm{tr}(P+Q) -2\textrm{tr}(P^{1/2}QP^{1/2})^{1/2}$ |

**legend:** $L_X$, $S_X$ and $D_X$ are the Cholesky lower triangle of $X$, its strictly lower triangular part and diagonal part, respectively (hence, $S_X+D_X=L_X$, $L_XL_X^*=X$).

**See also**: `distanceSqrMat`

.

**Examples (1)**

```
using PosDefManifold
P=randP(10)
d=distanceSqr(Wasserstein, P)
e=distanceSqr(Fisher, P)
metric=Metric(Int(logdet0)) # or metric=logdet0
s=string(metric) # check what is the current metric
f=distance²(metric, P) #using the alias distance²
```

**Examples (2)**

```
using PosDefManifold
P=randP(10)
Q=randP(10)
d=distanceSqr(logEuclidean, P, Q)
e=distance²(Jeffrey, P, Q)
```

`PosDefManifold.distance`

— Function```
(1) distance(metric::Metric, P::ℍ{T}) where T<:RealOrComplex
(2) distance(metric::Metric, P::ℍ{T}, Q::ℍ{T}) where T<:RealOrComplex
(3) distance(metric::Metric, D::𝔻{S}) where S<:Real
(4) distance(metric::Metric, D::𝔻{S}, E::𝔻{S}) where S<:Real
```

(1) Return $δ(P, I)$, the *distance* between positive definite matrix $P$ and the identity matrix.

(2) Return $δ(P, Q)$, the *distance* between positive definite matrices $P$ and $Q$.

(3) and (4) are specialized methods of (1) and (2), respectively, for real positive definite `Diagonal`

matrices.

This is the square root of `distanceSqr`

and is invoked with the same syntax therein.

**See also**: `distanceMat`

.

## Graphs and Laplacians

Function | Description |
---|---|

`distanceSqrMat` , `distance²Mat` | Lower triangular matrix of all squared inter-distances |

`distanceMat` | Lower triangular matrix of all inter-distances |

`laplacian` | Laplacian of a squared inter-distances matrix |

`laplacianEigenMaps` , `laplacianEM` | Eigen maps (eigenvectors) of a Laplacian |

`spectralEmbedding` , `spEmb` | Spectral Embedding (the above functions run in series) |

⋅

`PosDefManifold.distanceSqrMat`

— Function```
(1) distanceSqrMat(metric::Metric, 𝐏::ℍVector;
<⏩=true>)
(2) distanceSqrMat(type::Type{T}, metric::Metric, 𝐏::ℍVector;
<⏩=true>) where T<:AbstractFloat
```

**alias**: `distance²Mat`

Given a 1d array $𝐏$ of $k$ positive definite matrices ${P_1,...,P_k}$ of ℍVector type, create the $k⋅k$ real `LowerTriangular`

matrix comprising elements $δ^2(P_i, P_j)\textrm{, for all }i>=j$.

This is the lower triangular matrix holding all *squared inter-distances* (zero on diagonal), using the specified `metric`

, of type Metric::Enumerated type, giving rise to distance function $δ$. See `distanceSqr`

.

Only the lower triangular part is computed in order to optimize memory use.

By default, the result matrix is of type `Float32`

. The type can be changed to another real `type`

using method (2).

<optional keyword arguments>:

- if ⏩=true (default) the computation of inter-distances is multi-threaded.

Multi-threading is automatically disabled if Julia is instructed to use only one thread. See Threads.

**See**: distance.

**See also**: `laplacian`

, `laplacianEigenMaps`

, `spectralEmbedding`

.

**Examples**

```
using PosDefManifold
# Generate a set of 8 random 10x10 SPD matrices
Pset=randP(10, 8) # or, using unicode: 𝐏=randP(10, 8)
# Compute the squared inter-distance matrix according to the log Euclidean metric.
# This is much faster as compared to the Fisher metric and in general
# it is a good approximation.
Δ²=distanceSqrMat(logEuclidean, Pset)
# return a matrix of type Float64
Δ²64=distanceSqrMat(Float64, logEuclidean, Pset)
# Get the full matrix of inter-distances
fullΔ²=Hermitian(Δ², :L)
```

`PosDefManifold.distanceMat`

— Function```
(1) distanceMat(metric::Metric, 𝐏::ℍVector;
<⏩=true>)
(2) distanceMat(type::Type{T}, metric::Metric, 𝐏::ℍVector;
<⏩=true>) where T<:AbstractFloat
```

Given a 1d array $𝐏$ of $k$ positive definite matrices ${P_1,...,P_k}$ of ℍVector type, create the $k⋅k$ real `LowerTriangular`

matrix comprising elements $δ(P_i, P_j)\textrm{, for all }i>=j$.

This is the lower triangular matrix holding all *inter-distances* (zero on diagonal), using the specified `metric`

, of type Metric::Enumerated type, giving rise to distance $δ$. See `distance`

.

Only the lower triangular part is computed in order to optimize memory use.

By default, the result matrix is of type `Float32`

. The type can be changed to another real `type`

using method (2).

The elements of this matrix are the square root of `distanceSqrMat`

.

<optional keyword arguments>:

- if ⏩=true the computation of inter-distances is multi-threaded.

Multi-threading is automatically disabled if Julia is instructed to use only one thread. See Threads.

**See**: distance.

**Examples**

```
using PosDefManifold
# Generate a set of 4 random 10x10 SPD matrices
Pset=randP(10, 4) # or, using unicode: 𝐏=randP(10, 4)
Δ=distanceMat(Fisher, Pset)
# return a matrix of type Float64
Δ64=distanceMat(Float64, Fisher, Pset)
# Get the full matrix of inter-distances
fullΔ=Hermitian(Δ, :L)
```

`PosDefManifold.laplacian`

— Function```
laplacian(Δ²::𝕃{S}, epsilon::Real=0;
<densityInvariant=false>) where S<:Real
```

Given a `LowerTriangular`

matrix of squared inter-distances $Δ^2$, return the lower triangular part of the so-called *normalized Laplacian* or *density-invariant normalized Laplacian*, which in both cases is a symmetric Laplacian. The elements of the Laplacian are of the same type as the elements of $Δ^2$. The result is a `LowerTriangular`

matrix.

The definition of Laplacian given by Lafon (2004)🎓 is implemented:

First, a Gaussian radial basis functions, known as *Gaussian kernel* or *heat kernel*, is applied to all elements of $Δ^2$, such as

$W_{ij} = exp\bigg(\frac{\displaystyle{-Δ^2_{ij}}}{\displaystyle{2ε}}\bigg)$,

where $ε$ is the *bandwidth* of the kernel.

If <optional keyword argument> `densityInvariant=true`

is used, then the density-invariant transformation is applied

$W \leftarrow E^{-1}WE^{-1}$

where $E$ is the diagonal matrix holding on the main diagonal the sum of the rows (or columns) of $W$.

Finally, the normalized Laplacian (density-invariant or not) is defined as

$Ω = D^{-1/2}WD^{-1/2}$,

where $D$ is the diagonal matrix holding on the main diagonal the sum of the rows (or columns) of $W$.

If you do not provide argument `epsilon`

, the bandwidth $ε$ is set to the median of the elements of squared distance matrix $Δ^2_{ij}$. Another educated guess is the dimension of the original data, that is, the data that has been used to compute the squared distance matrix. For positive definite matrices this is $n(n-1)/2$, where $n$ is the dimension of the matrices. Still another is the dimension of the ensuing `spectralEmbedding`

space. Keep in mind that by tuning the `epsilon`

parameter (which must be positive) you can control both the rate of compression of the embedding space and the spread of points in the embedding space. See Coifman et *al.* (2008)🎓 for a discussion on $ε$.

The Laplacian as here defined can be requested for any input matrix of squared inter-distances, for example, those obtained on scalars or on vectors using appropriate metrics. In any case, only the lower triangular part of the Laplacian is taken as input. See typecasting matrices.

**See also**: `distanceSqrMat`

, `laplacianEigenMaps`

, `spectralEmbedding`

.

**Examples**

```
using PosDefManifold
# Generate a set of 4 random 10x10 SPD matrices
Pset=randP(10, 4) # or, using unicode: 𝐏=randP(10, 4)
Δ²=distanceSqrMat(Fisher, Pset)
Ω=laplacian(Δ²)
# density-invariant Laplacian
Ω=laplacian(Δ²; densityInvariant=true)
# increase the bandwidth
r=size(Δ², 1)
myεFactor=0.1
med=Statistics.median([Δ²[i, j] for j=1:r-1 for i=j+1:r])
ε=2*myεFactor*med
Ω=laplacian(Δ², ε; densityInvariant=true)
```

`PosDefManifold.laplacianEigenMaps`

— Function```
laplacianEigenMaps(Ω::𝕃{S}, q::Int;
<
tol::Real=0.,
maxiter::Int=300,
verbose=false >) where S<:Real
```

**alias**: `laplacianEM`

Given the lower triangular part of a Laplacian $Ω$ (see `laplacian`

) return the *eigen maps* in $q$ dimensions, i.e., the $q$ eigenvectors of the Laplacian associated with the largest $q$ eigenvalues, excluding the first (which is always equal to 1.0). The eigenvectors are of the same type as $Ω$. They are all divided element-wise by the first eigenvector (see Lafon, 2004🎓).

The eigenvectors of the Laplacian are computed by the power iterations+modified Gram-Schmidt method (see `powerIterations`

), allowing the execution of this function for Laplacian matrices of very large size.

Return the 4-tuple $(Λ, U, iterations, convergence)$, where:

- $Λ$ is a $q⋅q$ diagonal matrix holding on diagonal the eigenvalues corresponding to the $q$ dimensions of the Laplacian eigen maps,
- $U$ holds in columns the $q$ eigenvectors holding the $q$ coordinates of the points in the embedding space,
- $iterations$ is the number of iterations executed by the power method,
- $convergence$ is the convergence attained by the power method.

Using the notion of Laplacian, spectral embedding seek a low-dimension representation of the data emphasizing local neighbothood information while neglecting long-distance information. The embedding is non-linear, however the embedding space is Euclidean. The eigenvectors of $U$ holds the coordinates of the points in the embedding space (typically two- or three-dimensional for plotting or more for clustering). Spectral embedding is done for plotting data in low-dimension, clustering, imaging, classification, following their trajectories over time or other dimensions, and much more. For examples of applications see Ridrigues et *al.* (2018) 🎓 and references therein.

**Arguments**:

- $Ω$ is a real
`LowerTriangular`

normalized Laplacian obtained by the`laplacian`

function, - $q$ is the dimension of the Laplacian eigen maps;
- The following are
*<optional keyword arguments>*for the power iterations:`tol`

is the tolerance for convergence (see below),`maxiter`

is the maximum number of iterations allowed,- if
`verbose`

is true, the convergence at all iterations will be printed.

The maximum value of $q$ that can be requested is $n-1$, where $n$ is the size of the Laplacian. In general, $q=2$ or $q=3$ is requested.

$tol$ defaults to the square root of `Base.eps`

of the (real) type of $Ω$. This corresponds to requiring equality for the convergence criterion over two successive power iterations of about half of the significant digits.

**See also**: `distanceSqrMat`

, `laplacian`

, `spectralEmbedding`

.

**Examples**

```
using PosDefManifold
# Generate a set of 4 random 10x10 SPD matrices
Pset=randP(10, 4)
Δ²=distanceSqrMat(Fisher, Pset)
Ω=laplacian(Δ²)
evalues, maps, iterations, convergence=laplacianEM(Ω, 2)
evalues, maps, iterations, convergence=laplacianEM(Ω, 2; verbose=true)
evalues, maps, iterations, convergence=laplacianEM(Ω, 2; verbose=true, maxiter=500)
```

`PosDefManifold.spectralEmbedding`

— Function```
(1) spectralEmbedding(metric::Metric, 𝐏::ℍVector, q::Int, epsilon::Real=0;
<
tol::Real=0.,
maxiter::Int=300,
densityInvariant=false,
verbose=false,
⏩=true >)
(2) spectralEmbedding(type::Type{T}, metric::Metric, 𝐏::ℍVector, q::Int, epsilon::Real=0;
< same optional keyword arguments as in (1) >) where T<:Real
```

**alias**: `spEmb`

Given a 1d array $𝐏$ of $k$ positive definite matrices ${P_1,...,P_k}$ (real or complex), compute its *eigen maps* in $q$ dimensions.

This function runs one after the other the functions:

`distanceSqrMat`

(compute the squared inter-distance matrix),`laplacian`

(compute the normalized Laplacian),`laplacianEigenMaps`

(get the eigen maps).

By default all computations above are done with `Float32`

precision. Another real type can be requested using method (2), where the `type`

argument is defined.

Return the 4-tuple `(Λ, U, iterations, convergence)`

, where:

- $Λ$ is a $q⋅q$ diagonal matrix holding on diagonal the eigenvalues corresponding to the $q$ dimensions of the Laplacian eigen maps,
- $U$ holds in columns the $q$ eigenvectors holding the $q$ coordinates of the points in the embedding space,
- $iterations$ is the number of iterations executed by the power method,
- $convergence$ is the convergence attained by the power method.

**Arguments**:

`metric`

is the metric of type Metric::Enumerated type used for computing the inter-distances,- $𝐏$ is a 1d array of $k$ positive matrices of ℍVector type,
- $q$ is the dimension of the Laplacian eigen maps,
- $epsilon$ is the bandwidth of the Laplacian (see
`laplacian`

); - The following
*<optional keyword argument>*applyies for computing the inter-distances:- if
`⏩=true`

(default) the computation of inter-distances is multi-threaded.

- if
- The following
*<optional keyword argument>*applyies to the computation of the Laplacian by the`laplacian`

function:- if
`densityInvariant=true`

the density-invariant Laplacian is computed (see`laplacian`

).

- if
- The following are
*<optional keyword arguments>*for the power method iterative algorithm invoked by`laplacianEigenMaps`

:`tol`

is the tolerance for convergence of the power method (see below),`maxiter`

is the maximum number of iterations allowed for the power method,- if
`verbose=true`

the convergence at all iterations will be printed;

$tol$ defaults to the square root of `Base.eps`

of the `Float32`

type (1) or of the `type`

passed as argumant (2). This corresponds to requiring equality for the convergence criterion over two successive power iterations of about half of the significant digits.

Multi-threading is automatically disabled if Julia is instructed to use only one thread. See Threads.

**See also**: `distanceSqrMat`

, `laplacian`

, `laplacianEigenMaps`

.

**Examples**

```
using PosDefManifold
# Generate a set of k random 10x10 SPD matrices
k=10
Pset=randP(10, k)
evalues, maps, iter, conv=spectralEmbedding(Fisher, Pset, 2)
# show convergence information
evalues, maps, iter, conv=spectralEmbedding(Fisher, Pset, 2; verbose=true)
# use Float64 precision.
evalues, maps, iter, conv=spectralEmbedding(Float64, Fisher, Pset, 2)
using Plots
# check eigevalues and eigenvectors
plot(diag(evalues))
plot(maps[:, 1])
plot!(maps[:, 2])
plot!(maps[:, 3])
# plot the data in the embedded space
plot(maps[:, 1], maps[:, 2], seriestype=:scatter, title="Spectral Embedding", label="Pset")
# try a different value of epsilon
evalues, maps, iter, conv=spEmb(Fisher, Pset, k-1, 0.01; maxiter=1000)
plot(maps[:, 1], maps[:, 2], seriestype=:scatter, title="Spectral Embedding", label="Pset")
# see the example in `Laplacian` function for more on this
```

## Means

Function | Description |
---|---|

`mean` | Weighted Fréchet mean (wFm) of a scalar or matrix set using any metric |

`means` | As above for several sets at once |

`generalizedMean` | Generalized wFm of a matrix set |

`geometricMean` , `gMean` | wFm of a matrix set minimizing the dispersion according to the Fisher metric (iterative) |

`geometricpMean` , `gpMean` | robust wFm of a matrix set minimizing the p-dispersion according to the Fisher metric (iterative) |

`logdet0Mean` , `ld0Mean` | wFm of a matrix set according to the logdet0 metric (iterative) |

`wasMean` | wFm of a matrix set according to the Wasserstein metric (iterative) |

`powerMean` | Power wFm of a matrix set (iterative) |

`inductiveMean` , `indMean` | Recursive Fréchet mean of a matrix set (constructive) |

`midrange` | Geometric midrange of two matrices |

⋅

`Statistics.mean`

— Function```
(1) mean(metric::Metric, P::ℍ{T}, Q::ℍ{T}) where T<:RealOrComplex
(2) mean(metric::Metric, D::𝔻{T}, E::𝔻{T}) where T<:Real
(3) mean(metric::Metric, 𝐏::ℍVector;
<
w::Vector=[],
✓w=true,
init::Union{ℍ, Nothing}=nothing,
tol::Real=0.,
verbose=false,
⏩=true >)
(4) mean(metric::Metric, 𝐃::𝔻Vector;
< same optional keyword arguments as in (3) >)
```

(1) Mean of two positive definite matrices, passed in arbitrary order as arguments $P$ and $Q$, using the specified `metric`

of type Metric::Enumerated type. The order is arbitrary as all metrics implemented in **PosDefManifold** are symmetric. This is the midpoint of the geodesic. For the weighted mean of two positive definite matrices use instead the `geodesic`

function. $P$ and $Q$ must be flagged as `Hermitian`

. See typecasting matrices.

(2) Like in (1), but for two real diagonal positive definite matrices $D$ and $E$.

(3) Fréchet mean of an 1d array $𝐏$ of $k$ positive definite matrices $𝐏={P_1,...,P_k}$ of ℍVector type, with optional non-negative real weights $w={w_1,...,w_k}$ and using the specified `metric`

as in (1).

(4) Fréchet mean of an 1d array $𝐃$ of $k$ positive definite matrices $𝐃={D_1,...,D_k}$ of 𝔻Vector type, with optional non-negative real weights $w={w_1,...,w_k}$ and using the specified `metric`

as in (1).

If you don't pass a weight vector with *<optional keyword argument>* $w$, return the *unweighted mean*.

If *<optional keyword argument>* `✓w=true`

(default), the weights are normalized so as to sum up to 1, otherwise they are used as they are passed and should be already normalized. This option is provided to allow calling this function repeatedly without normalizing the same weights vector each time.

Adopting the `Fisher`

, `logdet0`

and `Wasserstein`

metric in (3) and the `logdet0`

metric in (4), the mean is computed by means of an iterative algorithm. A particular initialization for these algorithms can be provided passing an Hermitian matrix as *<optional keyword argument>* `init`

. The convergence for these algorithm is required with a tolerance given by *<optional keyword argument>* `tol`

. if `verbose=true`

the covergence attained at each iteration is printed. Other information such as if the algorithm has diverged is also printed. For more options in computing these means call directly functions `geometricMean`

, `logdet0Mean`

and `wasMean`

, which are called hereby. For the meaning of the `tol`

default value see the documentation of these functions. See also the robust mean function `geometricpMean`

, which cannot be called from here. Notice that arguments `init`

and `tol`

have an effect only for the aferomentioned metrics in methods (3) and (4).

For (3) and (4), if `⏩=true`

(default), the computation of the mean is multi-threaded for all metrics.

**Math**

The Fréchet mean of a set of $k$ matrices ${P_1, P_2,..., P_k}$ weighted by ${w_1, w_2,..., w_k}:\sum_{i=1}^{k}w_i=1$ for the supported metrics are, for those with closed form expression:

Metric | weighted Fréchet mean |
---|---|

Euclidean | $\sum_{i=1}^{k}w_i P_i$ |

invEuclidean | $\big(\sum_{i=1}^{k}w_i P_i^{-1}\big)^{-1}$ |

ChoEuclidean | $TT^*$, where $T=bL_P + aL_Q$ |

logEuclidean | $\textrm{exp}\big(\sum_{i=1}^{k}w_i\hspace{1pt} \textrm{log}P_i \big)$ |

logCholesky | $TT^*$, where $T=\sum_{i=1}^{k}(w_kS_k)+\sum_{i=1}^{k}(w_k\textrm{log}D_k)$ |

Jeffrey | $A^{1/2}\big(A^{-1/2}HA^{-1/2}\big)^{1/2}A^{1/2}$ |

and for those that are found by an iterative algorithm and that verify an equation:

Metric | equation verified by the weighted Fréchet mean |
---|---|

Fisher | $\sum_{i=1}^{k}w_i\textrm{log}\big(G^{-1/2} P_k G^{-1/2}\big)=0.$ |

logdet0 | $\sum_{i=1}^{k}w_i\big(\frac{1}{2}P_i+\frac{1}{2}G\big)^{-1}=G^{-1}$ |

VonNeumann | N.A. |

Wasserstein | $G=\sum_{i=1}^{k}w_i\big( G^{1/2} P_i G^{1/2}\big)^{1/2}$ |

**legend:** $L_X$, $S_X$ and $D_X$ are the Cholesky lower triangle of $X$, its strictly lower triangular part and diagonal part, respectively (hence, $S_X+D_X=L_X$, $L_XL_X^*=X$). $A$ and $H$ are the weighted arithmetic and weighted harmonic mean, respectively.

**See**: geodesic, mean, Fréchet mean.

**Examples**

```
using LinearAlgebra, Statistics, PosDefManifold
# Generate 2 random 3x3 SPD matrices
P=randP(3)
Q=randP(3)
M=mean(logdet0, P, Q) # (1)
M=mean(Euclidean, P, Q) # (1)
# passing several matrices and associated weights listing them
# weights vector, does not need to be normalized
R=randP(3)
mean(Fisher, ℍVector([P, Q, R]); w=[1, 2, 3])
# Generate a set of 4 random 3x3 SPD matrices
Pset=randP(3, 4)
weights=[1, 2, 3, 1]
# passing a vector of Hermitian matrices (ℍVector type)
M=mean(Euclidean, Pset; w=weights) # (2) weighted Euclidean mean
M=mean(Wasserstein, Pset) # (2) unweighted Wassertein mean
# display convergence information when using an iterative algorithm
M=mean(Fisher, Pset; verbose=true)
# run multi-threaded when the number of matrices is high
using BenchmarkTools
Pset=randP(20, 160)
@benchmark(mean(logEuclidean, Pset; ⏩=false)) # single-threaded
@benchmark(mean(logEuclidean, Pset)) # multi-threaded
```

`mean(metric::Metric, ν::Vector{T}) where T<:RealOrComplex`

Mean of $k$ real or complex scalars, using the specified `metric`

of type Metric::Enumerated type. Note that using the Fisher, logEuclidean and Jeffrey metric, the resulting mean is the scalar geometric mean. Note also that the code of this method is in unit *statistics.jl*, while the code for all the others is in unit *riemannianGeometry.jl*.

**Examples**

```
using PosDefManifold
# Generate 10 random numbers distributed as a chi-square with 2 df.
ν=[randχ²(2) for i=1:10]
arithmetic_mean=mean(Euclidean, ν)
geometric_mean=mean(Fisher, ν)
harmonic_mean=mean(invEuclidean, ν)
harmonic_mean<=geometric_mean<=arithmetic_mean # AGH inequality
```

`PosDefManifold.means`

— Function```
(1) means(metric::Metric, 𝒫::ℍVector₂;
<⏩=true>)
(2) means(metric::Metric, 𝒟::𝔻Vector₂;
<⏩=true>)
```

(1) Given a 2d array $𝒫$ of positive definite matrices as an ℍVector₂ type compute the Fréchet mean for as many ℍVector type objects as hold in $𝒫$, using the specified `metric`

of type Metric::Enumerated type. Return the means in a vector of `Hermitian`

matrices, that is, as an `ℍVector`

type.

(2) Given a 2d array $𝒟$ of real positive definite matrices as an 𝔻Vector₂ type compute the Fréchet mean for as many 𝔻Vector type objects as hold in $𝒟$, using the specified `metric`

of type Metric::Enumerated type. Return the means in a vector of `Diagonal`

matrices, that is, as a `𝔻Vector`

type.

The weigted Fréchet mean is not supported in this function.

If `⏩=true`

(default) the computation of the means is multi-threaded.

**See also**: `mean`

.

**Examples**

```
using PosDefManifold
# Generate a set of 4 random 3x3 SPD matrices
Pset=randP(3, 4) # or, using unicode: 𝐏=randP(3, 4)
# Generate a set of 40 random 4x4 SPD matrices
Qset=randP(3, 40) # or, using unicode: 𝐐=randP(3, 40)
# listing directly ℍVector objects
means(logEuclidean, ℍVector₂([Pset, Qset])) # or: means(logEuclidean, ℍVector₂([𝐏, 𝐐]))
# note that [𝐏, 𝐐] is actually a ℍVector₂ type object
# creating and passing an object of ℍVector₂ type
sets=ℍVector₂(undef, 2) # or: 𝒫=ℍVector₂(undef, 2)
sets[1]=Pset # or: 𝒫[1]=𝐏
sets[2]=Qset # or: 𝒫[2]=𝐐
means(logEuclidean, sets) # or: means(logEuclidean, 𝒫)
# going multi-threated
# first, create 20 sets of 200 50x50 SPD matrices
sets=ℍVector₂([randP(50, 200) for i=1:20])
# How much computing time we save ?
# (example min time obtained with 4 threads & 4 BLAS threads)
using BenchmarkTools
# non multi-threaded, mean with closed-form solution
@benchmark(means(logEuclidean, sets; ⏩=false)) # (6.196 s)
# multi-threaded, mean with closed-form solution
@benchmark(means(logEuclidean, sets)) # (1.897 s)
sets=ℍVector₂([randP(10, 200) for i=1:10])
# non multi-threaded, mean with iterative solution
# wait a bit
@benchmark(means(Fisher, sets; ⏩=false)) # (4.672 s )
# multi-threaded, mean with iterative solution
@benchmark(means(Fisher, sets)) # (1.510 s)
```

`PosDefManifold.generalizedMean`

— Function```
generalizedMean(𝐏::Union{ℍVector, 𝔻Vector}, p::Real;
<
w::Vector=[],
✓w=true,
⏩=true >)
```

Given a 1d array $𝐏={P_1,...,P_k}$ of $k$ positive definite matrices of ℍVector type or real positive definite diagonal matrices of 𝔻Vector type and optional non-negative real weights vector $w={w_1,...,w_k}$, return the *weighted generalized means* $G$ with real parameter $p$, that is,

$G=\big(\sum_{i=1}^{k}w_iP_i^p\big)^{1/p}$.

If you don't pass a weight vector with *<optional keyword argument>* $w$, return the *unweighted generalized mean*

$G=\big(\sum_{i=1}^{k}P_i^p\big)^{1/p}$.

If *<optional keyword argument>* `✓w=true`

(default), the weights are normalized so as to sum up to 1, otherwise they are used as they are passed and should be already normalized. This option is provided to allow calling this function repeatedly without normalizing the weights each time.

If *<optional key argmuent>* ⏩=true the computation of the generalized mean is multi-threaded.

The following special cases for parameter $p$ are noteworthy:

- For $p=\frac{1}{2}$ the generalized mean is the modified Bhattacharyya mean.
- For $p=1$ the generalized mean is the Euclidean mean.
- For $p=-1$ the generalized mean is the inverse Euclidean mean.
- For (the limit of) $p=0$ the generalized mean is the log Euclidean mean, which is the Fisher mean when matrices in 𝐏 all pair-wise commute.

Notice that when matrices in 𝐏 all pair-wise commute, for instance if the matrices are diagonal, the generalized means coincide with the power means for any $p∈[-1, 1]$ and for $p=0.5$ it coincides also with the Wasserstein mean. For this reason the generalized means are used as default initialization of both the `powerMean`

and `wasMean`

algorithm.

**See**: generalized means.

**See also**: `powerMean`

, `wasMean`

, `mean`

.

**Examples**

```
using LinearAlgebra, Statistics, PosDefManifold
# Generate a set of 4 random 3x3 SPD matrices
Pset=randP(3, 4) # or, using unicode: 𝐏=randP(3, 4)
# weights vector, does not need to be normalized
weights=[1, 2, 3, 1]
# unweighted mean
G = generalizedMean(Pset, 0.25) # or: G = generalizedMean(𝐏, 0.25)
# weighted mean
G = generalizedMean(Pset, 0.5; w=weights)
# with weights previously normalized we can set ✓w=false
weights=weights./sum(weights)
G = generalizedMean(Pset, 0.5; w=weights, ✓w=false)
# run multi-threaded when the number of matrices is high
using BenchmarkTools
Pset=randP(20, 160)
@benchmark(generalizedMean(Pset; ⏩=false)) # single-threaded
@benchmark(generalizedMean(Pset)) # multi-threaded
```

`PosDefManifold.geometricMean`

— Function```
geometricMean(𝐏::Union{ℍVector, 𝔻Vector};
<
w::Vector=[],
✓w=true,
init=nothing,
tol::Real=0.,
maxiter::Int=500,
adaptStepSize::Bool=true,
verbose=false,
⏩=true >)
```

**alias**: `gmean`

Given a 1d array $𝐏={P_1,...,P_k}$ of $k$ positive definite matrices of ℍVector type or diagonal matrices of 𝔻Vector type and optional non-negative real weights vector $w={w_1,...,w_k}$, return the 3-tuple $(G, iter, conv)$, where $G$ is the mean according to the Fisher metric and $iter$, $conv$ are the number of iterations and convergence attained by the algorithm. Mean $G$ is the unique positive definite matrix satisfying

$\sum_{i=1}^{k}w_i\textrm{log}\big(G^{-1/2} P_i G^{-1/2}\big)=0.$

For estimating it, this function implements the well-known gradient descent algorithm, but with an exponential decaying step size $ς$, yielding iterations

$G ←G^{1/2}\textrm{exp}\big(ς\sum_{i=1}^{k}w_i\textrm{log}(G^{-1/2} P_i G^{-1/2})\big)G^{1/2}.$

If you don't pass a weight vector with *<optional keyword argument>* $w$, return the *unweighted geometric mean*.

If *<optional keyword argument>* `✓w=true`

(default), the weights are normalized so as to sum up to 1, otherwise they are used as they are passed and should be already normalized. This option is provided to allow calling this function repeatedly without normalizing the same weights vector each time.

The following are more *<optional keyword arguments*>:

`init`

is a matrix to be used as initialization for the mean. If no matrix is provided, the log Euclidean mean will be used,`tol`

is the tolerance for the convergence (see below).`maxiter`

is the maximum number of iterations allowed- if
`verbose`

=true, the convergence attained at each iteration and the step size $ς$ is printed. Also, a*warning*is printed if convergence is not attained. - if ⏩=true the iterations are multi-threaded (see below).
- if
`adaptStepSize`

=false the step size`ς`

is fixed to 1 at all iterations.

If the input is a 1d array of $k$ real positive definite diagonal matrices the solution is available in closed-form as the log Euclidean mean, hence the *<optional keyword arguments*> `init`

, `tol`

and `verbose`

have no effect and return the 3-tuple $(G, 1, 0)$. See the log Euclidean metric.

In normal circumstances this algorithm converges monothonically. If the algorithm diverges and `verbose`

is true a **warning** is printed indicating the iteration when this happened.

The exponential decaying step size features a faster convergence rate as compared to the fixed step size $ς=1$ that is usually adopted. The decaying rate is inversely proportional to `maxiter`

, thus, increase/decrease `maxiter`

in order to set a slower/faster decaying rate. `maxiter`

should not be set too low though.

$tol$ defaults to the square root of `Base.eps`

of the nearest real type of data input $𝐏$. This corresponds to requiring the norm of the satisfying matrix equation divided by the number of elements to vanish for about half the significant digits.

**See**: Fisher metric.

**See also**: `geometricpMean`

, `powerMean`

, `wasMean`

, `logdet0Mean`

, `mean`

.

**Examples**

```
using LinearAlgebra, PosDefManifold
# Generate a set of 4 random 3x3 SPD matrices
Pset=randP(3, 4) # or, using unicode: 𝐏=randP(3, 4)
# unweighted mean
G, iter, conv = geometricMean(Pset) # or G, iter, conv = geometricMean(𝐏)
# weights vector, does not need to be normalized
weights=[1, 2, 3, 1]
# weighted mean
G, iter, conv = geometricMean(Pset, w=weights)
# print the convergence at all iterations
G, iter, conv = geometricMean(Pset; verbose=true)
# now suppose Pset has changed a bit, initialize with G to hasten convergence
Pset[1]=ℍ(Pset[1]+(randP(3)/100))
G, iter, conv = geometricMean(Pset; w=weights, ✓w=true, verbose=true, init=G)
# run multi-threaded when the number of matrices is high
using BenchmarkTools
Pset=randP(20, 120)
@benchmark(geometricMean(Pset; ⏩=false)) # single-threaded
@benchmark(geometricMean(Pset)) # multi-threaded
# show the mean and the input points using spectral embedding
using Plots
k=80
Pset=randP(20, k)
G, iter, conv = geometricMean(Pset)
push!(Pset, G)
Λ, U, iter, conv=spectralEmbedding(Fisher, Pset, 2; verbose=true)
plot(U[1:k, 1], U[1:k, 2], seriestype=:scatter, title="Spectral Embedding", label="Pset")
plot!(U[k+1:k+1, 1], U[k+1:k+1, 2], seriestype=:scatter, label="mean")
```

`PosDefManifold.geometricpMean`

— Function```
geometricpMean(𝐏::ℍVector, p::Real=0.5;
<
w::Vector=[],
✓w=true,
init=nothing,
tol::Real=0.,
maxiter::Int=500,
adaptStepSize=true,
verbose=false,
⏩=true >)
```

**alias**: `gpmean`

Given a 1d array $𝐏={P_1,...,P_k}$ of $k$ positive definite matrices of ℍVector type, a real parameter $0<p<=1$ and optional non-negative real weights vector $w={w_1,...,w_k}$, return the 3-tuple $(G, iter, conv)$, where $G$ is the *p-mean*, i.e., the mean according to the Fisher metric minimizing the *p-dispersion* (see below) and $iter$, $conv$ are the number of iterations and convergence attained by the algorithm.

This function implements the p-dispersion gradient descent algorithm with step-size $ς$ (to be published), yielding iterations

$G ←G^{1/2}\textrm{exp}\big(ς\sum_{i=1}^{k}pδ^2(G, P_i)^{p-1}w_i\textrm{log}(G^{-1/2} P_i G^{-1/2})\big)G^{1/2}.$

- if $p=1$ this yields the geometric mean (implemented specifically in
`geometricMean`

). - if $p=0.5$ this yields the geometric median (default).

If you don't pass a weight vector with *<optional keyword argument>* $w$, return the *unweighted geometric-p mean*.

If *<optional keyword argument>* `✓w=true`

(default), the weights are normalized so as to sum up to 1, otherwise they are used as they are passed and should be already normalized. This option is provided to allow calling this function repeatedly without normalizing the same weights vector each time.

The following are more *<optional keyword arguments*>:

`init`

is a matrix to be used as initialization for the mean. If no matrix is provided, the log Euclidean mean will be used,`tol`

is the tolerance for the convergence (see below).`maxiter`

is the maximum number of iterations allowed.- if
`adaptStepSize`

=true (default) the step size $ς$ for the gradient descent is adapted at each iteration (see below). - if
`verbose`

=true, the step-size and convergence attained at each iteration is printed. Also, a*warning*is printed if convergence is not attained. - if ⏩=true the iterations are multi-threaded (see below).

If the algorithm diverges and `verbose`

is true a **warning** is printed indicating the iteration when this happened. This algorithm may temporary diverge, still reach convergence. Overall, while all other iterative algorithms implemented in **PosDefMaifold** are very stable, this is not.

The smaller the parameter $p$ is, the slower and less likely the convergence is. If the algorithm does not converge, try increasing $p$, initializing the algorithm with the output of `geometricMean`

and/or eliminating the otliers from the input set $𝐏$.

If `adaptStepSize`

is true (default) the step-size $ς$ is adapted at each iteration, otherwise a fixed step size $ς=1$ is used. Adapting the step size in general hastens convergence and improves the convergence behavior.

$tol$ defaults to the square root of `Base.eps`

of the nearest real type of data input $𝐏$. This corresponds to requiring the norm of the satisfying matrix equation divided by the number of elements to vanish for about half the significant digits.

**See**: Fisher metric.

**See also**: `geometricMean`

, `powerMean`

, `wasMean`

, `logdet0Mean`

, `mean`

.

**Examples**

```
using LinearAlgebra, PosDefManifold, Plots
# This examples show that this algorithm is more robust to outliers
# as compared to the standard geometric mean algorithm
# Generate a set of 100 random 10x10 SPD matrices
Pset=randP(10, 100)
# Get the usual geometric mean for comparison
G, iter1, conv1 = geometricMean(Pset, verbose=true)
# change p to observe how the convergence behavior changes accordingly
# Get the median (default)
H, iter2, conv2 = geometricpMean(Pset, verbose=true)
# Get the p-mean for p=0.25
H, iter2, conv2 = geometricpMean(Pset, 0.25, verbose=true)
println(iter1, " ", iter2); println(conv1, " ", conv2)
# move the first matrix in Pset to possibly create an otlier
Pset[1]=geodesic(Fisher, G, Pset[1], 3)
G1, iter1, conv1 = geometricMean(Pset, verbose=true)
H1, iter2, conv2 = geometricpMean(Pset, 0.25, verbose=true)
println(iter1, " ", iter2); println(conv1, " ", conv2)
# collect the geometric and p-means, before and after the
# introduction of the outier in vector of Hermitian matrices `S`.
S=HermitianVector([G, G1, H, H1])
# check the interdistance matrix Δ² to verify that the geometric mean
# after the introduction of the outlier (``G1``) is farther away from
# the geometric mean as compared to how much ``H1`` is further away
# from ``H``, i.e., that element (4,3) is much smaller than element (2,1).
Δ²=distance²Mat(Float64, Fisher, S)
# how far are all these matrices from all the others?
fullΔ²=Hermitian(Δ², :L)
dist=[sum(fullΔ²[:, i]) for i=1:size(fullΔ², 1)]
# plot the matrices in `S` using spectral embedding.
using Plots
Λ, U, iter, conv = laplacianEM(laplacian(Δ²), 3; verbose=true)
plot([U[1, 1]], [U[1, 2]], seriestype=:scatter, label="g-mean")
plot!([U[2, 1]], [U[2, 2]], seriestype=:scatter, label="g-mean outlier")
plot!([U[3, 1]], [U[3, 2]], seriestype=:scatter, label="p-mean")
plot!([U[4, 1]], [U[4, 2]], seriestype=:scatter, label="p-mean outlier")
# estimate how much you gain running the algorithm in multi-threaded mode
using BenchmarkTools
Pset=randP(20, 120)
@benchmark(geometricpMean(Pset; ⏩=true)) # single-threaded
@benchmark(geometricpMean(Pset)) # multi-threaded
```

`PosDefManifold.logdet0Mean`

— Function```
logdet0Mean(𝐏::Union{ℍVector, 𝔻Vector};
<
w::Vector=[],
✓w=true,
init=nothing,
tol::Real=0.,
maxiter::Int=500,
verbose=false,
⏩=true >)
```

**alias**: `ld0Mean`

Given a 1d array $𝐏={P_1,...,P_k}$ of $k$ positive definite matrices of ℍVector type or real positive definite diagonal matrices of 𝔻Vector type and optional non-negative real weights vector $w={w_1,...,w_k}$, return the 3-tuple $(G, iter, conv)$, where $G$ is the mean according to the logdet zero metric and $iter$, $conv$ are the number of iterations and convergence attained by the algorithm. Mean $G$ is the unique positive definite matrix satisfying

$\sum_{i=1}^{k}w_i\big(\frac{1}{2}P_i+\frac{1}{2}G\big)^{-1}-G^{-1}=0$.

For estimating it, this function implements the fixed-point iteration algorithm suggested by (Moakher, 2012, p315)🎓, yielding iterations

$G ← \frac{1}{2}\big(\sum_{i=1}^{k}w_i(P_i+G)^{-1}\big)^{-1}$.

If you don't pass a weight vector with *<optional keyword argument>* $w$, return the *unweighted logdet zero mean*.

*<optional keyword argument>* `✓w=true`

(default), the weights are normalized so as to sum up to 1, otherwise they are used as they are passed and should be already normalized. This option is provided to allow calling this function repeatedly without normalizing the same weights vector each time.

The following are more *<optional keyword arguments*>:

`init`

is a matrix to be used as initialization for the mean. If no matrix is provided, the log Euclidean mean will be used,`tol`

is the tolerance for the convergence (see below).`maxiter`

is the maximum number of iterations allowed.- if
`verbose`

=true, the convergence attained at each iteration is printed and a*warning*is printed if convergence is not attained. - if ⏩=true the iterations are multi-threaded (see below).

In normal circumstances this algorithm converges monothonically. If the algorithm diverges and `verbose`

is true a **warning** is printed indicating the iteration when this happened.

$tol$ defaults to 100 times the square root of `Base.eps`

of the nearest real type of data input $𝐏$. This corresponds to requiring the square root of the relative convergence criterion over two successive iterations to vanish for about half the significant digits minus 2.

**See**: logdet zero metric, modified Bhattacharyya mean.

**See also**: `powerMean`

, `wasMean`

, `logdet0Mean`

, `mean`

.

**Examples**

```
using LinearAlgebra, PosDefManifold
# Generate a set of 4 random 3x3 SPD matrices
Pset=randP(3, 4) # or, using unicode: 𝐏=randP(3, 4)
# unweighted mean
G, iter, conv = logdet0Mean(Pset) # or G, iter, conv = logdet0Mean(𝐏)
# weights vector, does not need to be normalized
weights=[1, 2, 3, 1]
# weighted mean
G, iter, conv = logdet0Mean(Pset, w=weights)
# print the convergence at all iterations
G, iter, conv = logdet0Mean(Pset; w=weights, verbose=true)
# suppose Pset has changed a bit; initialize with G to hasten convergence
Pset[1]=ℍ(Pset[1]+(randP(3)/100))
G, iter, conv = logdet0Mean(Pset; w=weights, ✓w=false, verbose=true, init=G)
# estimate how much you gain running the algorithm in multi-threaded mode
using BenchmarkTools
Pset=randP(20, 120)
@benchmark(logdet0Mean(Pset; ⏩=false)) # single-threaded
@benchmark(logdet0Mean(Pset)) # multi-threaded
```

`PosDefManifold.wasMean`

— Function```
wasMean(𝐏::Union{ℍVector, 𝔻Vector};
<
w::Vector=[],
✓w=true,
init=nothing,
tol::Real=0.,
maxiter::Int=500,
verbose=false,
⏩=true >)
```

Given a 1d array $𝐏={P_1,...,P_k}$ of $k$ positive definite matrices of ℍVector type or real positive definite diagonal matrices of 𝔻Vector type and optional non-negative real weights vector $w={w_1,...,w_k}$, return the 3-tuple $(G, iter, conv)$, where $G$ is the mean according to the Wasserstein metric and $iter$, $conv$ are the number of iterations and convergence attained by the algorithm. Mean $G$ is the unique positive definite matrix satisfying

$G=\sum_{i=1}^{k}w_i\big( G^{1/2} P_i G^{1/2}\big)^{1/2}$.

For estimating it, this function implements the fixed-point iterative algorithm proposed by (Álvarez-Esteban et *al.*, 2016)🎓:

$G ← G^{-1/2}\big(\sum_{i=1}^{k} w_i(G^{1/2}P_i G^{1/2})^{1/2}\big)^2 G^{-1/2}$.

If you don't pass a weight vector with *<optional keyword argument>* $w$, return the *unweighted Wassertein mean*.

If *<optional keyword argument>* `✓w=true`

(default), the weights are normalized so as to sum up to 1, otherwise they are used as they are passed and they should be already normalized. This option is provided to allow calling this function repeatedly without normalizing the same weights vector each time.

The following are more *<optional keyword arguments*>:

`init`

is a matrix to be used as initialization for the mean. If no matrix is provided, the instance of generalized means with $p=0.5$ will be used,`tol`

is the tolerance for the convergence (see below).`maxiter`

is the maximum number of iterations allowed.- if
`verbose`

=true, the convergence attained at each iteration is printed and a*warning*is printed if convergence is not attained. - if ⏩=true the iterations are multi-threaded (see below).

If the input is a 1d array of $k$ real positive definite diagonal matrices the solution is available in closed-form as the modified Bhattacharyya mean, hence the *<optional keyword arguments*> `init`

, `tol`

and `verbose`

have no effect and return the 3-tuple $(G, 1, 0)$. See modified Bhattacharyya mean.

In normal circumstances this algorithm converges monothonically. If the algorithm diverges and `verbose`

is true a **warning** is printed indicating the iteration when this happened.

$tol$ defaults to the square root of `Base.eps`

of the nearest real type of data input $𝐏$. This corresponds to requiring the norm of the satisfying matrix equation divided by the number of elements to vanish for about half the significant digits.

**See**: Wasserstein metric.

**See also**: `powerMean`

, `wasMean`

, `logdet0Mean`

, `mean`

.

**Examples**

```
using LinearAlgebra, PosDefManifold
# Generate a set of 4 random 3x3 SPD matrices
Pset=randP(3, 4) # or, using unicode: 𝐏=randP(3, 4)
# unweighted mean
G, iter, conv = wasMean(Pset) # or: G, iter, conv = wasMean(𝐏)
# weights vector, does not need to be normalized
weights=[1, 2, 3, 1]
# weighted mean
G, iter, conv = wasMean(Pset; w=weights)
# print the convergence at all iterations
G, iter, conv = wasMean(Pset; w=weights, verbose=true)
# suppose 𝐏 has changed a bit; initialize with G to hasten convergence
Pset[1]=ℍ(Pset[1]+(randP(3)/100))
G, iter, conv = wasMean(Pset; w=weights, verbose=true, init=G)
# estimate how much you gain running the algorithm in multi-threaded mode
using BenchmarkTools
Pset=randP(20, 120)
@benchmark(wasMean(Pset; ⏩=false)) # single-threaded
@benchmark(wasMean(Pset)) # multi-threaded
```

`PosDefManifold.powerMean`

— Function```
powerMean(𝐏::Union{ℍVector, 𝔻Vector}, p::Real;
<
w::Vector=[],
✓w=true,
init=nothing,
tol::Real=0.,
maxiter::Int=500,
verbose=false,
⏩=true >)
```

Given a 1d array $𝐏={P_1,...,P_k}$ of $k$ positive definite matrices of ℍVector type or real positive definite diagonal matrices of 𝔻Vector type, an optional non-negative real weights vector $w={w_1,...,w_k}$ and a real parameter `p`

$\in[-1, 1]$, return the 3-tuple $(G, iter, conv)$, where $G$ is Lim and Palfia (2012)'s power means of order $p$ and $iter$, $conv$ are the number of iterations and convergence attained by the algorithm, respectively. Mean $G$ is the unique positive definite matrix satisfying

$G=\sum_{i=1}^{k}(w_iG\textrm{#}_pP_i)$,

where $G\textrm{#}_pP_i$ is the Fisher geodesic equation. In particular:

- with $p=-1$ this is the
*harmonic mean*(see the inverse Euclidean metric), - with $p=+1$ this is the
*arithmetic mean*(see the Euclidean metric), - at the limit of $p$ evaluated at zero from both side this is the
*geometric mean*(see Fisher metric).

For estimating power means for $p\in(-1, 1)$, this function implements the fixed-point iterative algorithm of (Congedo et *al.*, 2017b)🎓. For $p=0$ (geometric mean) this algorithm is run two times with a small positive and negative value of $p$ and the geometric mean of the two resulting means is returned, as suggested in (Congedo et *al.*, 2017b)🎓. This way of estimating the geometric mean of a set of matrices is faster as compared to the usual gradient descent algorithm.

If you don't pass a weight vector with *<optional keyword argument>* $w$, return the *unweighted power mean*.

*<optional keyword argument>* `✓w=true`

(default), the weights are normalized so as to sum up to 1, otherwise they are used as they are passed and should be already normalized. This option is provided to allow calling this function repeatedly without normalizing the same weights vector each time.

The following are more *<optional keyword arguments*>:

`init`

is a matrix to be used as initialization for the mean. If no matrix is provided, the instance of generalized means with parameter $p$ will be used.`tol`

is the tolerance for the convergence (see below).`maxiter`

is the maximum number of iterations allowed.- if
`verbose`

=true, the convergence attained at each iteration is printed and a*warning*is printed if convergence is not attained. - if ⏩=true the iterations are multi-threaded.

If the input is a 1d array of $k$ real positive definite diagonal matrices the solution is available in closed-form as the generalized mean of order `p`

, hence the *<optional keyword arguments*> `init`

, `tol`

and `verbose`

have no effect and return the 3-tuple $(G, 1, 0)$. See generalized means.

`verbose`

is true a **warning** is printed indicating the iteration when this happened.

$tol$ defaults to the square root of `Base.eps`

of the nearest real type of data input $𝐏$. This corresponds to requiring the norm of the difference of the matrix solution over two successive iterations divided by the number of elements in the matrix to vanish for about half the significant digits.

(2) Like in (1), but for a 1d array $𝐃={D_1,...,D_k}$ of $k$ real positive definite diagonal matrices of 𝔻Vector type. In this case the solution is available in closed-form, hence the *<optional keyword arguments*> `init`

, `tol`

and `verbose`

have no effect and return the 3-tuple $(G, 1, 0)$. See generalized means.

**See**: power means, generalized means, modified Bhattacharyya mean.

**See also**: `generalizedMean`

, `wasMean`

, `logdet0Mean`

, `mean`

.

**Examples**

```
using LinearAlgebra, PosDefManifold
# Generate a set of 4 random 3x3 SPD matrices
Pset=randP(3, 4) # or, using unicode: 𝐏=randP(3, 4)
# unweighted mean
G, iter, conv = powerMean(Pset, 0.5) # or G, iter, conv = powerMean(𝐏, 0.5)
# weights vector, does not need to be normalized
weights=[1, 2, 3, 1]
# weighted mean
G, iter, conv = powerMean(Pset, 0.5; w=weights)
# print the convergence at all iterations
G, iter, conv = powerMean(Pset, 0.5; w=weights, verbose=true)
# suppose 𝐏 has changed a bit; initialize with G to hasten convergence
Pset[1]=ℍ(Pset[1]+(randP(3)/100))
G, iter, conv = powerMean(Pset, 0.5; w=weights, verbose=true, init=G)
# estimate how much you gain running the algorithm in multi-threaded mode
using BenchmarkTools
Pset=randP(20, 120)
@benchmark(powerMean(Pset, 0.5; ⏩=false)) # single-threaded
@benchmark(powerMean(Pset, 0.5)) # multi-threaded
```

`PosDefManifold.inductiveMean`

— Function```
(1) inductiveMean(metric::Metric, 𝐏::ℍVector)
(2) inductiveMean(metric::Metric, 𝐏::ℍVector, q::Int, Q::ℍ)
```

**alias**: `indMean`

(1) Compute the Fréchet mean of 1d array $𝐏={P_1,...,P_k}$ of $k$ positive definite matrices of ℍVector type with a law of large number inductive procedure (Ho et *al.,* 2013; Lim and Palfia, 2019; Massart et *al.*, 2018)🎓, such as

$G_1=P_1,$

$G_i=γ(i^{-1}, G_{(i-1)}, P_i), i=2,...,k,$

where $γ(i^{-1}, G_{(i-1)}, P_i)$ is a step on the geodesic relying $G_{(i-1)}$ to $P_i$ with arclength $i^{-1}$ using the specified `metric`

, of type Metric::Enumerated type.

(2) Like (1), but for the set of matrices $𝐐 ∪ 𝐏$, where it is assumed knowledge only of the set $𝐏$, the mean of $𝐐$ (Hermitian matrix argument `Q`

) and the number of matrices in $𝐐$ (integer argument `q`

). This method can be used, for example, for updating a block on-line algorithm, where $𝐏$ is the incoming block, `Q`

the previous mean estimation and `q`

the cumulative number of matrices on which the mean has been computed on-line.

For Fréchet means that do not have a closed form expression, this procedure features a computational complexity amounting to less than two iterations of gradient descent or fixed-point algorithms. This comes at the price of an approximation. In fact, the solution is not invariant to permutations of the matrices in array 𝐏 and convergence to the Fréchet mean for finite samples is not ensured (see Lim and Palfia, 2019; Massart et *al.*, 2018)🎓.

Since the inductive mean uses the `geodesic`

function, it is not available for the Von Neumann metric.

**Examples**

```
# A set of 100 matrices for which we want to compute the mean
𝐏=randP(10, 100)
𝐏1=ℍVector(collect(𝐏[i] for i=1:50)) # first 50
𝐏2=ℍVector(collect(𝐏[i] for i=51:100)) # last 50
# inductive mean of the whole set 𝐏
G=inductiveMean(Fisher, 𝐏)
# mean using the usual gradient descent algorithm
H, iter, conv=geometricMean(𝐏)
# inductive mean of 𝐏 given only 𝐏2,
# the number of matrices in 𝐏1 and the mean of 𝐏1
G2=inductiveMean(Fisher, 𝐏2, length(𝐏1), mean(Fisher, 𝐏1))
# average error
norm(G-H)/(dim(G, 1)^2)
norm(G2-H)/(dim(G, 1)^2)
```

`PosDefManifold.midrange`

— Function`midrange(metric::Metric, P::ℍ{T}, Q::ℍ{T}) where T<:RealOrComplex`

Midrange (average of extremal values) of positive definite matrices $P$ and $Q$. Only the Fisher metric is supported, allowing the so-called *geometric midrange*. This has been defined in Mostajeran et *al.* (2019) 🎓 as

$P * Q = \frac{1}{\sqrt{\lambda_(min)}+\sqrt{\lambda_(max)}}\Big(Q+\sqrt{\lambda_(min)*\lambda_(max)}P\Big)$,

where $\lambda_(min)$ and $\lambda_(max)$ are the extremal generalized eigenvalues of $P$ and $Q$.

**Examples**

```
P=randP(3)
Q=randP(3)
M=midrange(Fisher, P, Q)
```

## Tangent Space operations

Function | Description |
---|---|

`logMap` | Logarithmic map (from manifold to tangent space) |

`expMap` | Exponential map (from tangent space to manifold) |

`vecP` | vectorization of matrices in the tangent space |

`matP` | matrization of matrices in the tangent space (inverse of `vecp` ) |

`parallelTransport` , pt | Parallel transport of tangent vectors and matrices |

⋅

`PosDefManifold.logMap`

— Function```
(1) logMap(metric::Metric, P::ℍ{T}, G::ℍ{T})
(2) logMap(metric::Metric, 𝐏::ℍVector, G::ℍ{T})
for all the above: where T<:RealOrComplex
```

(1) *Logaritmic Map:* map a positive definite matrix $P$ from the SPD or Hermitian manifold into the tangent space at base-point $G$ using the Fisher metric.

$P$ and $G$ must be flagged as `Hermitian`

. See typecasting matrices.

The map is defined as

$Log_G(P)=S=G^{1/2}\textrm{log}\big(G^{-1/2}PG^{-1/2}\big)G^{1/2}$.

`metric`

is a metric of type Metric::Enumerated type.

The result is an `Hermitian`

matrix.

(2) *Logarithmic map* (1) at base-point $G$ at once for $k$ positive definite matrices in 1d array $𝐏={P_1,...,P_k}$ of ℍVector type.

The result is an `ℍVector`

.

Currently only the Fisher metric is supported for tangent space operations.

The inverse operation is `expMap`

.

**See also**: `vecP`

, `parallelTransport`

.

**Examples**

```
using PosDefManifold
(1)
P=randP(3)
Q=randP(3)
metric=Fisher
G=mean(metric, P, Q)
# projecting P at the base point given by the geometric mean of P and Q
S=logMap(metric, P, G)
(2)
Pset=randP(3, 4)
# projecting all matrices in Pset at the base point given by their geometric mean.
Sset=logMap(Fisher, Pset, mean(Fisher, Pset))
```

`PosDefManifold.expMap`

— Function```
(1) expMap(metric::Metric, S::ℍ{T}, G::ℍ{T})
(2) expMap(metric::Metric, 𝐒::ℍVector, G::ℍ{T})
for all the above: where T<:RealOrComplex
```

(1) *Exponential Map:* map a tangent vector (a matrix) $S$ from the tangent space at base-point $G$ into the SPD or Hermitian manifold (using the Fisher metric).

$S$ and $G$ must be flagged as `Hermitian`

. See typecasting matrices.

The map is defined as

$Exp_G(S)=P=G^{1/2}\textrm{exp}\big(G^{-1/2}SG^{-1/2}\big)G^{1/2}$.

`metric`

is a metric of type Metric::Enumerated type.

The result is an `Hermitian`

matrix.

(2) *Exponential map* (1) at base-point $G$ at once for $k$ tangent vectors (matrices) in 1d array $𝐒={S_1,...,S_k}$ of ℍVector type.

The result is an `ℍVector`

.

Currently only the Fisher metric is supported for tangent space operations.

The inverse operation is `logMap`

.

**Examples**

```
(1)
using PosDefManifold, LinearAlgebra
P=randP(3)
Q=randP(3)
G=mean(Fisher, P, Q)
# projecting P on the tangent space at the Fisher mean base point G
S=logMap(Fisher, P, G)
# projecting back onto the manifold
P2=expMap(Fisher, S, G)
(2)
Pset=randP(3, 4)
# projecting all matrices in Pset at the base point given by their geometric mean.
G=mean(Fisher, Pset)
Sset=logMap(Fisher, Pset, G)
# projecting back onto the manifold
Pset2=expMap(Fisher, Sset, G)
```

`PosDefManifold.vecP`

— Function```
vecP(S::Union{ℍ{T}, Symmetric{R}};
range::UnitRange=1:size(S, 2)) where T<:RealOrComplex where R<:Real =
```

*Vectorize* a tangent vector (which is an `Hermitian`

or `Symmetric`

matrix) $S$: mat ↦ vec.

It gives weight $1$ to diagonal elements and $√2$ to off-diagonal elements so as to preserve the norm (Barachant et *al.*, 201E)🎓, such as

$∥S∥_F=∥vecP(S)∥_F$.

The result is a vector holding $n(n+1)/2$ elements, where $n$ is the size of $S$.

$S$ must be flagged as `Hermitian`

or `Symmetric`

. See typecasting matrices.

The reverse operation is provided by `matP`

, which always return an `Hermitian`

matrix.

If an optional keyword argument `range`

is provided, the vectorization concerns only the rows (or columns, since the input matrix is symmetric or Hermitian) in the range. Note that in this case the operation cannot be reverted by the `matP`

, that is, in this case the matrix is 'stuck' in the tangent space.

**Examples**

```
using PosDefManifold
P=randP(3)
Q=randP(3)
G=mean(Fisher, P, Q)
# projecting P at the base point given by the geometric mean of P and Q
S=logMap(Fisher, P, G)
# vectorize S
v=vecP(S)
# vectorize onlt the first two columns of S
v=vecP(S; range=1:2)
```

`PosDefManifold.matP`

— Function`matP(ς::Vector{T}) where T<:RealOrComplex`

*Matrizize* a tangent vector (vector) ς : vec -> mat.

This is the function reversing the `vecP`

function, thus the weighting applied therein is reversed as well.

If $ς=vecP(S)$ and $S$ is a $n⋅n$ Hermitian or Symmetric matrix, $ς$ is a tangent vector of size $n(n+1)/2$. The result of calling `matP(ς)`

is then $n⋅n$ matrix $S$. $S$ is always returned flagged as `Hermitian`

.

**To Do**: This function may be rewritten more efficiently.

**Examples**

```
using PosDefManifold
P=randP(3)
Q=randP(3)
G=mean(Fishr, P, Q)
# projecting P at onto the tangent space at the Fisher mean base point
S=logMap(Fisher, P, G)
# vectorize S
v=vecP(S)
# Rotate the vector by an orthogonal matrix
n=Int(size(S, 1)*(size(S, 1)+1)/2)
U=randP(n)
z=U*v
# Get the point in the tangent space
S=matP(z)
```

`PosDefManifold.parallelTransport`

— Function```
(1) parallelTransport(S::ℍ{T}, P::ℍ{T}, Q::ℍ{T})
(2) parallelTransport(S::ℍ{T}, P::ℍ{T})
(3) parallelTransport(𝐒::ℍVector, P::ℍ{T}, Q::ℍ{T})
(4) parallelTransport(𝐒::ℍVector, P::ℍ{T})
for all the above: where T<:RealOrComplex
```

**alias**: `pt`

(1) *Parallel transport* of tangent vector $S$ (a matrix) lying on the tangent space at base-point $P$ to the tangent space at base-point $Q$.

$S$, $P$ and $Q$ must all be `Hermitian`

matrices. Return an `Hermitian`

matrix. The transport is defined as:

$∥_{(P→Q)}(S)=\big(QP^{-1}\big)^{1/2}S\big(QP^{-1}\big)^{H/2}$.

If $S$ is a positive definite matrix in the manifold (and not a tangent vector) it will be 'trasported' from $P$ to $Q$, amounting to (Yair et *al.*, 2019🎓)

- project $S$ onto the tangent space at base-point $P$,
- parallel transport it to the tangent space at base-point $Q$,
- project it back onto the manifold at base-point $Q$.

(2) *Parallel transport* as in (1), but to the tangent space at base-point the *identity matrix*.

The transport reduces in this case to:

$∥_{(P→I)}(S)=P^{-1/2}SP^{-1/2}$.

(3) *Parallel transport* as in (1) at once for $k$ tangent vectors (matrices) in 1d array $𝐒={S_1,...,S_k}$ of ℍVector type.

(4) *Parallel transport* as in (2) at once for $k$ tangent vectors (matrices) in 1d array $𝐒={S_1,...,S_k}$ of ℍVector type.

Currently only the Fisher metric is supported for parallel transport.

**See also**: `logMap`

, `expMap`

, `vecP`

, `matP`

.

**Examples**

```
using PosDefManifold
(1)
P=randP(3)
Q=randP(3)
G=mean(Fisher, P, Q)
# i. projecting P onto the tangent space at base-point G
S=logMap(Fisher, P, G)
# ii. parallel transport S to the tangent space at base-point Q
S_=parallelTransport(S, G, Q)
# iii. projecting back into the manifold at base-point Q
P_=expMap(Fisher, S_, Q)
# i., ii. and iii. can be done simply by
PP_=parallelTransport(P, G, Q)
# check
P_≈PP_ ? println(" ⭐ ") : println(" ⛔ ")
(2)
P=randP(3)
Q=randP(3)
G=mean(Fisher, P, Q)
# transport to the tangent space at base-point the identity
PP_=parallelTransport(P, G)
(3)
Pset=randP(3, 4)
Q=randP(3)
G=mean(Fisher, Pset)
# trasport at once all matrices in Pset
Pset2=parallelTransport(Pset, G, Q)
(4)
Pset=randP(3, 4)
G=mean(Fisher, Pset)
# recenter all matrices so to have mean=I
Pset2=parallelTransport(Pset, G)
# check
mean(Fisher, Pset2) ≈ I ? println(" ⭐ ") : println(" ⛔ ")
```

## Procrustes problems

Function | Description |
---|---|

`procrustes` | Solution to the Procrustes problem in the manifold of positive definite matrices |

⋅

`PosDefManifold.procrustes`

— Function`procrustes(P::ℍ{T}, Q::ℍ{T}, extremum="min") where T<:RealOrComplex`

Given two positive definite matrices $P$ and $Q$, return by default the solution of problem

$\textrm{argmin}_Uδ(P,U^HQU)$,

where $U$ varies over the set of unitary matrices and $δ(.,.)$ is a distance or divergence function.

$U^HQU$ is named in physics the *unitary orbit* of $Q$.

If the argument `extremum`

is passed as "max", it returns instead the solution of

$\textrm{argmax}_Uδ(P,U^HQU)$.

$P$ and $Q$ must be flagged as `Hermitian`

. See typecasting matrices.

As it has been shown in Bhatia and Congedo (2019)🎓, using each of the Fisher, logdet zero, Wasserstein and the Kullback-Leibler divergence (see logdet α), the best approximant to $P$ from the unitary orbit of $Q$ commutes with $P$ and, surprisingly, has the same closed-form expression, namely

$U_Q^↓U_P^{↓H}$ for the argmin and $U_Q^↑U_P^{↓H}$ for the argmax,

where $U^↓$ denotes the eigenvector matrix of the subscript argument with eigenvectors in columns sorted by *decreasing* order of corresponding eigenvalues and $U^↑$ denotes the eigenvector matrix of the subscript argument with eigenvectors in columns sorted by *increasing* order of corresponding eigenvalues.

The same solutions are known since a long time also by solving the extremal problem here above using the Euclidean metric (Umeyama, 1988).

The generalized Procrustes problem

$\textrm{argmin}_Usum_{i=1}^{k}δ(P_i,U^HQ_iU)$

can be solved using Julia package Manopt.

**Examples**

```
using PosDefManifold
P=randP(3)
Q=randP(3)
# argmin problem
U=procrustes(P, Q)
# argmax problem
V=procrustes(P, Q, "max")
```