conditioners.jl
This unit implements conditioners, also called pre-conditioners, and pipelines, which are specified sequences of conditioners.
Pipelines are applied to the data (symmetric positive-definite matrices) in order to increase the classification performance and/or to reduce the computational complexity of classifiers.
Conditioners
The available conditioners are
Conditioner type | description |
---|---|
Tikhonov | Tikhonov regularization |
Recenter | Recentering around the identity matrix with or without dimensionality reduction |
Compress | Global scaling reducing the norm |
Equalize | Individual scaling (normalization) to reduce the norm |
Shrink | Move along the geodesic to reduce the norm |
In the table above, as elsewhere in this documentation, the norm of a matrix means the distance of the matrix to the identity.
Pipelines
Pipelines are stored as a dedicated tuple type; see `Pipeline`.
Content
function | description |
---|---|
@pipeline | Macro to create a pipeline |
fit! | Fit a pipeline and transform the data |
transform! | Transform the data using a fitted pipeline |
pickfirst | Return a copy of a specific conditioner in a pipeline |
includes | Check whether a conditioner type is in a pipeline |
dim | Dimension determined by a recentering pre-conditioner in a pipeline |
PosDefManifoldML.Tikhonov — Type
mutable struct Tikhonov <: Conditioner
α
threaded
Mutable structure for the Tikhonov regularization conditioner.
Given a set of points $𝐏$ in the manifold of positive-definite matrices, transform the set as
$P_j+αI, \ j=1,...,k$,
where $I$ is the identity matrix and $α$ is a non-negative number.
This conditioner structure has two fields:
- `.α`, which is written in the structure when it is fitted to some data.
- `.threaded`, which determines whether the transformation is done in multi-threading mode (true by default).
For constructing an instance, `α` is an argument, while `threaded` is an optional keyword argument.
The `α` parameter must be given explicitly upon construction (it is zero by default).
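The effect of the regularization can be sketched with Julia's standard library alone. This is a minimal illustration of the formula $P_j+αI$, not the package's implementation; the helper name `tikhonov_shift` and the two test matrices are made up for the purpose.

```julia
using LinearAlgebra

# Hypothetical helper (not part of PosDefManifoldML): compute P_j + αI
# for every matrix of the set.
tikhonov_shift(Ps, α) = [Hermitian(Matrix(P) + α * I) for P in Ps]

# Two small positive-definite matrices made up for the illustration
Ps = [Hermitian([2.0 0.5; 0.5 1.0]), Hermitian([1.0 0.2; 0.2 3.0])]
Qs = tikhonov_shift(Ps, 0.001)

# Every eigenvalue is shifted upward by α, which is why the
# regularization improves the conditioning of the matrices.
eigvals(Qs[1]) - eigvals(Ps[1])   # both entries ≈ 0.001
```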
Examples:
using PosDefManifoldML, PosDefManifold
# Create a conditioner
T = Tikhonov(0.001)
T = Tikhonov(0.001; threaded=false)
See also: `fit!`, `transform!`, `crval`
PosDefManifoldML.Recenter — Type
mutable struct Recenter <: Conditioner
metric
eVar
w
✓w
init
tol
verbose
forcediag
threaded
## Fitted parameters
Z
iZ
Mutable structure for the recentering conditioner.
Given a set of $n·n$ points $𝐏$ in the manifold of positive-definite matrices, transform the set as
$ZP_jZ^T, \ j=1,...,k$,
where $Z$ is the whitening matrix of the barycenter of $𝐏$ as specified by the conditioner, i.e., if $G$ is the barycenter of $𝐏$, then $ZGZ^T=I$.
After recentering the barycenter becomes the identity matrix and the mean of the eigenvalues of the whitened matrices is 1. In the manifold of positive-definite matrices, recentering is equivalent to parallel transport of all points to the identity barycenter, according to a given metric.
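As a minimal sketch of recentering, assume the Euclidean metric, for which the barycenter $G$ is the arithmetic mean, and no dimensionality reduction, so that $Z=G^{-1/2}$. The code uses only the standard library and made-up matrices; it illustrates the definition above and is not the package's implementation.

```julia
using LinearAlgebra

# Made-up positive-definite matrices
Ps = [[2.0 0.5; 0.5 1.0], [1.0 0.2; 0.2 3.0], [4.0 -1.0; -1.0 2.0]]

# Euclidean barycenter (arithmetic mean); an assumption of this sketch
G = sum(Ps) / length(Ps)

# Whitening matrix Z = G^(-1/2), so that Z*G*Z' = I
Z = inv(sqrt(Symmetric(G)))

# Recentered set Z*P_j*Z'
Qs = [Z * P * Z' for P in Ps]

# The barycenter of the recentered set is the identity,
# hence the mean of all eigenvalues is 1.
sum(Qs) / length(Qs) ≈ Matrix{Float64}(I, 2, 2)   # true
```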
Depending on the `eVar` value used to define the `Recenter` conditioner, matrix $Z$ may also determine a dimensionality reduction of the input points. In this case $Z$ is not square, but a wide matrix of dimension $p·n$, with $p<n$.
This conditioner may behave in a supervised way: when the class labels are provided at fitting time (see `fit!`), the classes are equally weighted to compute the barycenter $G$, like `tsWeights` does for computing the barycenter used for tangent space mapping. If the classes are balanced, the weighting has no effect.
This conditioner structure has the following fields:
- `.metric`, of type `Metric`, is to be specified by the user. It is the metric that will be adopted to compute the class means and the distances to the mean. Default: `PosDefManifold.Euclidean`.
- `.eVar`, the desired explained variance for the whitening. It can be a Real, an Int or `nothing`. See the documentation of method whitening in Diagonalizations.jl. It is 0.9999 by default.
- Fields `.w`, `.✓w`, `.init` and `.tol` are passed to the mean method of PosDefManifold.jl for computing the barycenter $G$. Refer to the documentation therein for details.
- `.verbose` is a boolean determining if information is to be printed to the REPL when using `fit!` and `transform!` with this conditioner. It is false by default.
- `.forcediag` is a boolean for forcing diagonalization. It is true by default. If false, whitening is carried out only if a dimensionality reduction is needed, as determined by `eVar`.
- If `.threaded` is true (default), all operations are multi-threaded.
For constructing an instance, `metric` is an argument, while `eVar`, `w`, `✓w`, `init`, `tol`, `verbose`, `forcediag` and `threaded` are optional keyword arguments.
Fitted parameters
When the conditioner is fitted, the following fields are written:
- `.Z`, the whitening matrix of the fitted set $P_j, \ j=1,...,k$, such that $ZP_jZ^T$ is whitened;
- `.iZ`, the left inverse $Z^*$ of $Z$, such that $Z^*Z=I$ (identity matrix) if no dimensionality reduction is operated. If dimensionality reduction is operated, $Z^*Z≠I$ has rank $p$.
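The rank statement can be checked numerically. The sketch below builds a whitening matrix with dimensionality reduction from an eigendecomposition of a made-up barycenter, keeping the $p$ largest eigenvalues. This is one standard construction, assumed here for illustration only; how the package and Diagonalizations.jl actually build `Z` and `iZ` is not shown in this page.

```julia
using LinearAlgebra

# A made-up 3×3 positive-definite barycenter
G = [4.0 1.0 0.0; 1.0 3.0 0.5; 0.0 0.5 1.0]

p = 2                                   # retained dimension, p < n
E = eigen(Symmetric(G))                 # eigenvalues in ascending order
keep = (length(E.values) - p + 1):length(E.values)
λ = E.values[keep]                      # the p largest eigenvalues
U = E.vectors[:, keep]                  # corresponding eigenvectors (n×p)

Z  = Diagonal(1 ./ sqrt.(λ)) * U'       # wide p×n whitening matrix
iZ = U * Diagonal(sqrt.(λ))             # n×p inverse map

Z * G * Z' ≈ Matrix{Float64}(I, p, p)   # true: the reduced barycenter is whitened
rank(iZ * Z) == p                       # true: iZ*Z is n×n but has rank p only
```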
Examples:
using PosDefManifoldML, PosDefManifold
# Create a default conditioner
R = Recenter(PosDefManifold.Euclidean)
# Since the Euclidean metric is the default metric,
# this is equivalent to
R = Recenter()
# Do not perform dimensionality reduction
R = Recenter(PosDefManifold.Fisher; eVar=nothing)
# Reduce the dimension to 10
R = Recenter(PosDefManifold.Fisher; eVar=10)
# Determine the dimension so as to explain at least 90% of the variance
R = Recenter(PosDefManifold.Fisher; eVar=0.9)
# Use class labels to balance the weights across classes.
# The labels are passed when fitting, not to the constructor
# (let `y` be a vector of int holding the class labels of the set `P`)
R = Recenter(PosDefManifold.Fisher)
p = fit!(P, R; labels=y)
See also: `fit!`, `transform!`, `crval`
PosDefManifoldML.Compress — Type
mutable struct Compress <: Conditioner
threaded
β
end
Mutable structure for the compressing conditioner.
Given a set of points $𝐏$ in the manifold of positive-definite matrices, transform the set as
$βP_j, \ j=1,...,k$,
where $β$ is chosen to minimize the average Euclidean norm of the transformed set, i.e., the average Euclidean distance to the identity matrix.
Since the Euclidean norm is the Euclidean distance to the identity, compressing a recentered set of points minimizes the average dispersion of the set around the identity; it should therefore be performed after the `Recenter` conditioner.
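The criterion can be sketched numerically with the standard library. The average Frobenius distance to the identity is a convex function of $β$, so a simple ternary search finds its minimizer. The helper names `avgnorm` and `argmin_convex` are hypothetical, and the search is a stand-in: the algorithm actually used by the package to find $β$ is not described here.

```julia
using LinearAlgebra

# Average Euclidean (Frobenius) distance to the identity of the set β*P_j
avgnorm(β, Ps) = sum(norm(β * P - I) for P in Ps) / length(Ps)

# Hypothetical helper: ternary search for the minimizer of a convex f on [lo, hi]
function argmin_convex(f, lo, hi; iters = 200)
    for _ in 1:iters
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if f(a) < f(b)
            hi = b
        else
            lo = a
        end
    end
    return (lo + hi) / 2
end

# Made-up positive-definite matrices
Ps = [[2.0 0.5; 0.5 1.0], [1.0 0.2; 0.2 3.0], [4.0 -1.0; -1.0 2.0]]

# Each single term is minimized at tr(P)/tr(P^2); the minimizer of the
# average lies between the smallest and the largest of these values.
βs = [tr(P) / tr(P * P) for P in Ps]
β = argmin_convex(b -> avgnorm(b, Ps), minimum(βs), maximum(βs))

avgnorm(β, Ps) < avgnorm(1.0, Ps)   # true: the compressed set is closer to I
```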
The structure has one field only:
- `.threaded`, determining whether the computations are multi-threaded (true by default).
For constructing an instance, only the `threaded` optional keyword argument can be used.
Fitted parameters
When the conditioner is fitted, the following field is written:
- `.β`, a positive scalar minimizing the average Euclidean norm of the fitted set.
Examples:
using PosDefManifoldML, PosDefManifold
# Create the conditioner
C = Compress()
See also: `fit!`, `transform!`, `crval`
PosDefManifoldML.Equalize — Type
mutable struct Equalize <: Conditioner
threaded
β
end
Mutable structure for the equalizing conditioner.
Given a set of points $𝐏$ in the manifold of positive-definite matrices, transform the set as
$β_jP_j, \ j=1,...,k$,
where the elements $β_j$ are chosen so as to minimize the Euclidean norm of the transformed matrices individually.
Since the Euclidean norm is the Euclidean distance to the identity, equalizing a recentered set of points minimizes the average dispersion of the set around the identity; it should therefore be performed after the `Recenter` conditioner.
As compared to compression, equalization is more effective for reducing the distance to the identity; however, it is not an isometry.
Also, in contrast to compression, the transformation of the matrices in set $𝐏$ is individual, so fitting equalization does not imply a learning process (see `fit!`).
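For a single matrix the criterion has a closed form. Expanding $‖βP-I‖^2 = β^2\,\mathrm{tr}(P^2) - 2β\,\mathrm{tr}(P) + n$ and setting the derivative to zero gives $β = \mathrm{tr}(P)/\mathrm{tr}(P^2)$. The sketch below applies this to made-up matrices using the standard library; it follows from the stated criterion and is not taken from the package's source.

```julia
using LinearAlgebra

# Made-up positive-definite matrices
Ps = [[2.0 0.5; 0.5 1.0], [1.0 0.2; 0.2 3.0], [4.0 -1.0; -1.0 2.0]]

# Individual scaling factors minimizing ‖β_j*P_j - I‖ (Frobenius norm)
βs = [tr(P) / tr(P * P) for P in Ps]

# Equalized set β_j*P_j
Qs = [β * P for (β, P) in zip(βs, Ps)]

# Each matrix is now at least as close to the identity as before
all(norm(Q - I) <= norm(P - I) for (Q, P) in zip(Qs, Ps))   # true
```

Because every $β_j$ depends only on its own matrix, nothing computed on one set needs to be stored in order to transform another.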
The structure has one field only:
- `.threaded`, determining whether the computations are multi-threaded (true by default).
For constructing an instance, only the `threaded` optional keyword argument can be used.
Fitted parameters
When the conditioner is fitted, the following field is written:
- `.β`, a vector of positive scalars minimizing the Euclidean norm individually for each matrix in the fitted set.
Examples:
using PosDefManifoldML, PosDefManifold
# Create the conditioner
E = Equalize()
See also: `fit!`, `transform!`, `crval`
PosDefManifoldML.Shrink — Type
mutable struct Shrink <: Conditioner
metric
radius
refpoint
reshape
epsilon
verbose
threaded
## Fitted parameters
γ
m
sd
Mutable structure of the geodesic shrinking conditioner.
Given a set of points $𝐏$ in the manifold of positive-definite matrices, this conditioner moves all points towards the identity matrix $I$ along geodesics on the manifold defined in accordance with the specified `metric`. This effectively defines a ball centered at $I$.
The step-size $γ$ of the geodesics from $I$ to each point $P$ in $𝐏$ is given by
$γ=\frac{r\sqrt{n}}{δ(P, I) + ϵ}$,
where $r$ is the `radius` argument, $n$ is the dimension of $P$, $δ(P, I)$ is the norm of $P$ according to the specified `metric` and $ϵ$ is an optional small positive number given as argument `epsilon`.
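For a single matrix, the step-size formula can be sketched as follows, assuming the Euclidean metric, for which the geodesic from $I$ to $P$ is the straight line $I+γ(P-I)$ and $δ(P, I)$ is the Frobenius norm of $P-I$. The matrix is made up, and how the package combines the distances of several points (see the `refpoint` field) is not reproduced here.

```julia
using LinearAlgebra

P = [2.0 0.5; 0.5 1.0]          # a made-up positive-definite matrix
n = size(P, 1)
radius, epsilon = 0.02, 0.0     # default values quoted in this page

δ = norm(P - I)                           # Euclidean distance of P to the identity
γ = radius * sqrt(n) / (δ + epsilon)      # step-size formula

# Point at step γ on the Euclidean geodesic (straight line) from I to P
S = I + γ * (P - I)

# With epsilon = 0 the shrunk point lies at distance radius*sqrt(n) from I
norm(S - I) ≈ radius * sqrt(n)            # true
```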
The conditioner has the following fields, which are also keyword arguments that can be passed upon construction:
- `.metric`, of type `Metric`, with default `PosDefManifold.Euclidean`.
After shrinking, the set of points $𝐏$ acquires a sought `.radius`, which is given as optional keyword argument to the constructor (default: 0.02). This is a measure of their acquired distances from the identity (norms), specifically, the maximum distance if `.refpoint=:max` or the mean eccentricity if `.refpoint=:mean` (default). In the first case the argument `radius` defines a ball confining all points, with radius equal to the maximum distance from the identity of the transformed points + $ϵ$. In the second case, the actual radius of the ball is equal to
$\sqrt{\frac{1}{n}\sum_{j=1}^{k}δ(P_j, I) + ϵ}$.
- `.reshape`, a boolean for reshaping the eigenvalues of the set $𝐏$ after shrinking. It applies only to the Fisher (affine-invariant) metric. Default: false. See below.
- `.epsilon`, a non-negative real number, the $ϵ$ above. Default: 0.0.
- `.verbose`, a boolean. If true, information is printed in the REPL. Default: false.
- `.threaded`, a boolean for using multi-threading. Default: true.
For constructing an instance, `metric` is an argument, while `radius`, `refpoint`, `reshape`, `epsilon`, `verbose` and `threaded` are optional keyword arguments.
Fitted parameters
When the conditioner is fitted, the following fields are written:
- `.γ`, the step-size for geodesics (according to `metric`) from $I$ to each matrix in $𝐏$.
- `.m` and `.sd`, the mean and standard deviation of the eigenvalues of the set after shrinking. These are used for reshaping, which applies only if the Fisher metric is adopted. Reshaping is meaningful only if the input set has been recentered (see `Recenter`). It recenters again the eigenvalues of the set after shrinking (mean = 1), and normalizes them so as to have standard deviation equal to `.radius`.
Examples:
using PosDefManifoldML, PosDefManifold
# Create a conditioner adopting the Fisher Metric and use reshaping
S = Shrink(PosDefManifold.Fisher; reshape = true)
See also: `fit!`, `transform!`, `crval`
PosDefManifoldML.Pipeline — Type
`Pipeline` is a type for tuples holding conditioners.
A pipeline holds a sequence of conditioners learned and (optionally) applied using `fit!`. It can subsequently be applied to other data, as it has been learnt, using the `transform!` function. All `fit!` methods return a pipeline.
Pipelines comprising a single conditioner are allowed.
Pipelines can be saved to a file using the `saveas` function and loaded from a file using the `load` function.
Note that in Julia tuples are immutable, thus it is not possible to modify a pipeline. However it is possible to change the fields of the conditioners it holds.
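The distinction between the immutable tuple and its mutable elements is plain Julia behavior and can be shown without the package. In the sketch below `DemoConditioner` is a hypothetical stand-in for a conditioner type, not a type of PosDefManifoldML.

```julia
# Hypothetical stand-in for a mutable conditioner type
mutable struct DemoConditioner
    threaded::Bool
end

# A tuple of mutable structs, playing the role of a pipeline
pipe = (DemoConditioner(true), DemoConditioner(true))

# The fields of the elements can be changed in place
pipe[1].threaded = false

# The tuple itself cannot be modified; the following line
# would throw a MethodError, as tuples do not support setindex!
# pipe[1] = DemoConditioner(false)
```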
In order to create a pipeline use the `@pipeline` macro.
See also: `fit!`, `transform!`
PosDefManifoldML.@pipeline — Macro
macro pipeline(args...)
Create a `Pipeline` chaining the provided expressions.
As an example, the syntax is:
p = @pipeline Recenter() → Compress → Shrink(Fisher; threaded=false)
Note that:
- The symbol → (escape "\to") separating the conditioners is optional.
- This macro has alias `@→`.
- As in the example above, expressions may be instantiated conditioners, like `Recenter()`, or their type, like `Compress`, in which case the default conditioner of that type is created.
The example above is thus equivalent to
p = @→ Recenter() Compress() Shrink(Fisher; threaded=false)
Conditioners are not callable by the macro. Thus if you want to pass a variable, do not write
R = Recenter()
p = @→ R
but
R = Recenter()
p = @→ eval(R)
Available conditioners to form pipelines
`Tikhonov`, `Recenter`, `Compress`, `Equalize`, `Shrink`
See also: `fit!`, `transform!`
Examples
using PosDefManifoldML, PosDefManifold
P=randP(3, 5)
pipeline = fit!(P, @→ Recenter → Compress)
StatsAPI.fit! — Function
function fit!(𝐏 :: ℍVector, pipeline :: Union{Pipeline, Conditioner};
transform = true,
labels = nothing)
Fit the given `Pipeline` (or a single `Conditioner`) to $𝐏$ and return a fitted `Pipeline` object. $𝐏$ must be of the ℍVector type.
A single `Conditioner` can be given as argument instead of a pipeline; a fitted pipeline with a single element will be returned. The type of the conditioner can be given as well, in which case the default conditioner will be used; see the examples below.
If `pipeline` is an empty tuple, return an empty pipeline without doing anything.
If `transform` is true (default), $𝐏$ is transformed (in-place), otherwise the pipeline is fitted but $𝐏$ is not transformed.
If `labels` is a vector of integers holding the class labels of the points in $𝐏$, the conditioners are supervised (i.e., labels-aware), otherwise, if it is `nothing` (default), they are unsupervised. Currently the only conditioner that can behave in a supervised manner is `Recenter`. When supervised, the barycenter for recentering is computed giving balanced weights to each class, like `tsWeights` does for computing the barycenter used for tangent space mapping. If the classes are balanced, the weighting has no effect.
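The idea of balanced class weights can be sketched as follows. Each of the $K$ classes receives a total weight of $1/K$, shared equally among its points. This is one natural reading of "balanced weights", assumed here for illustration; the exact weights computed by `tsWeights` may differ, and the labels are made up.

```julia
# Made-up labels: class 1 has three points, class 2 has one point
y = [1, 1, 1, 2]

classes = unique(y)
K = length(classes)

# Number of points in each class
counts = Dict(c => count(isequal(c), y) for c in classes)

# Weight of each point: 1/K shared among the points of its class
w = [1 / (K * counts[c]) for c in y]

sum(w) ≈ 1.0                                   # true: the weights sum to one
sum(w[y .== 1]) ≈ sum(w[y .== 2])              # true: both classes weigh the same
```

With balanced classes, for instance `y = [1, 1, 2, 2]`, the same rule gives the uniform weights `1/4`, which is why the weighting then has no effect.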
The returned pipeline can be used as argument for the `transform!` function, ensuring that the fitted parameters are properly applied. It can also be saved to a file using the `saveas` function and loaded from a file using the `load` function.
Note that the pipeline given as argument is not modified.
Learning parameters during fit
For some of the conditioners there is no parameter to be learnt during training. For those, a call to the `fit!` function is equivalent to a call to the `transform!` function, with the exception that when the `fit!` function is called the parameters used for the transformation are stored in the returned pipeline.
See also: `transform!`
Examples
using PosDefManifoldML, PosDefManifold
## Example 1 (single conditioner):
# Generate some data
P=randP(3, 5) # 5 random 3x3 Hermitian matrices
Q=copy(P)
# Fit the default recentering conditioner (whitening)
pipeline = fit!(P, Recenter)
# This is equivalent to
pipeline = fit!(Q, Recenter())
pipeline[1].Z # a learnt parameter (whitening matrix)
## Example 2 (pipeline):
# Fit a pipeline comprising Tikhonov regularization,
# recentering, compressing and shrinking according to the Fisher metric.
# The matrices in P will be first regularized, then recentered,
# then compressed and finally shrunk.
P=randP(3, 5)
Q=copy(P)
pipeline = fit!(P,
@→ Tikhonov(0.0001) → Recenter → Compress → Shrink(Fisher; radius=0.01))
# or, skipping the Tikhonov regularization,
pipeline = fit!(Q, @→ Recenter Compress Shrink(Fisher; radius=0.01))
# For this last pipeline, the whitening matrix of the recentering conditioner,
pipeline[1].Z
# the scaling factor of the compressing conditioner,
pipeline[2].β
# and the step-size of the shrinking conditioner
pipeline[3].γ
## Example 3 (pipeline with a single conditioner):
P=randP(3, 5)
pipeline = fit!(P, @→ Recenter)
DataFrames.transform! — Function
function transform!(𝐐 :: Union{ℍVector, ℍ}, pipeline :: Union{Pipeline, Conditioner})
Given a fitted `Pipeline` (or a single `Conditioner`), transform all matrices in $𝐐$ using the parameters learnt during the fitting process. Return $𝐐$.
In a training-test setting, a fitted conditioner or pipeline is given as argument to this function to make sure that the testing data is transformed according to the parameters learnt during the fitting of the training data. More generally, this function can be used to apply any fitted transformation to the data in $𝐐$.
If `pipeline` is an empty tuple, return $𝐐$ without doing anything.
$𝐐$ can be a single Hermitian matrix or a vector of the ℍVector type. It is transformed in-place.
The dimension of the matrix(ces) in $𝐐$ must be the same as the dimension of the matrices used to fit the conditioner or pipeline.
In contrast to the `fit!` function, only instantiated conditioners can be used. For general use, this is transparent to the user as the `fit!` function always returns pipelines with instantiated conditioners.
See: `fit!`
Examples
using PosDefManifoldML, PosDefManifold
## Example 1 (single conditioner)
# Generate some 'training' and 'testing' data
PTr=randP(3, 20) # 20 random 3x3 Hermitian matrices
PTe=randP(3, 5) # 5 random 3x3 Hermitian matrices
# Fit the default recentering conditioner (whitening)
# Matrices in PTr will be transformed (recentered)
R = fit!(PTr, Recenter())
# Transform PTe using recentering as above
# using the parameters for recentering learnt
# on PTr during the fitting process.
transform!(PTe, R)
mean(PTr)-I # Should be close to the zero matrix.
mean(PTe)-I # Should not be close to the zero matrix
# as the recentering parameter is learnt on PTr, not on PTe.
## Example 2 (pipeline)
# Generate some 'training' and 'testing' data
PTr=randP(3, 20) # 20 random 3x3 Hermitian matrices
PTe=randP(3, 5) # 5 random 3x3 Hermitian matrices
QTr=copy(PTr)
QTe=copy(PTe)
p = @→ Tikhonov(0.0002) Recenter(; eVar=0.99) Compress Shrink(Fisher; radius=0.01)
pipeline = fit!(QTr, p)
transform!(QTe, pipeline)
## Example 3 (pipeline with a single conditioner):
P=randP(3, 5)
# For the Equalize conditioner there is no need to fit some data
transform!(P, @→ Equalize)
# This gives an error as Recenter needs to learn parameters (use fit! instead):
transform!(P, @→ Recenter)
PosDefManifoldML.pickfirst — Function
function pickfirst(pipeline, conditioner)
Return a copy of the first conditioner of the `pipeline` which is of the same type as `conditioner`. If no such conditioner is found, return `nothing`. Note that a copy is returned, not the conditioner in the pipeline itself.
The provided `conditioner` can be a type or an instance of a conditioner. The returned element will always be an instance, as pipelines hold instances only.
See: `includes`
Examples
using PosDefManifoldML
pipeline = @→ Recenter() Shrink()
S = pickfirst(pipeline, Shrink)
S isa Conditioner # true
S isa Shrink # true
# retrieve a parameter of the conditioner
pickfirst(pipeline, Shrink).radius
PosDefManifoldML.includes — Function
function includes(pipeline, conditioner)
Return true if the given `Pipeline` includes a conditioner of the same type as `conditioner`.
The provided `conditioner` can be a type or an instance of a conditioner.
Examples
using PosDefManifoldML
pipeline = @→ Recenter() → Shrink()
includes(pipeline, Shrink) # true
includes(pipeline, Shrink()) # true
# same type, although a different instance
includes(pipeline, Shrink(Fisher; radius=0.1)) # true
includes(pipeline, Compress) # false
Learn the package: check out saveas
PosDefManifold.dim — Function
function dim(pipeline::Pipeline)
Return the dimension determined by a fitted `Recenter` pre-conditioner if the `pipeline` comprises such a pre-conditioner, `nothing` otherwise. This is used to adapt pipelines; see the documentation of the `fit!` function for ENLR machine learning models for an example.
Examples
using PosDefManifoldML, PosDefManifold
pipeline = @→ Recenter(; eVar=0.9) → Shrink()
dim(pipeline) # no dimension is available yet, as the pipeline is not fitted
P = randP(10, 5)
p = fit!(P, pipeline)
dim(p) # return an integer ≤ 10
Learn the package: check out @pipeline