stats_descriptive.jl
This unit implements descriptive statistics for cross-validation analysis.
Content
| function | description |
|---|---|
| confusionMat | Confusion matrix given true and predicted labels |
| predictAcc | Prediction accuracy given true and predicted labels or a confusion matrix |
| predictErr | Prediction error given true and predicted labels or a confusion matrix |
| binaryloss | Binary error loss |
See also stats_inferential.jl
PosDefManifoldML.confusionMat — Function

function confusionMat(yTrue::IntVector, yPred::IntVector)
Return the confusion matrix expressing frequencies (counts), given the integer vectors of true labels yTrue and predicted labels yPred.
The length of yTrue and yPred must be equal. Furthermore, the yTrue vector must comprise all natural numbers between 1 and z, where z is the number of classes.
The confusion matrix will have size z×z. It is computed starting from a matrix filled everywhere with zeros and adding, for each label, 1 at entry [i, j] of the matrix, where i is the true label and j the predicted label. Thus, the first row holds the counts for the examples whose true label is class 1, the second row for those whose true label is class 2, and so on.
The returned matrix is a matrix of integers.
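The counting scheme described above can be sketched as follows (a minimal illustration, not the package implementation; it assumes the labels follow the 1, ..., z convention stated above):

# minimal sketch of the counting scheme (illustration only, not the package code)
function confusion_sketch(yTrue::Vector{Int}, yPred::Vector{Int})
    z = maximum(yTrue)        # number of classes, assuming labels are 1, ..., z
    CM = zeros(Int, z, z)     # start from an all-zero z×z matrix
    for (i, j) in zip(yTrue, yPred)
        CM[i, j] += 1         # row = true label, column = predicted label
    end
    return CM
end

confusion_sketch([1, 1, 1, 2, 2], [1, 1, 1, 1, 2])   # [3 0; 1 1]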
See predict, predictAcc, predictErr
Examples
using PosDefManifoldML
confusionMat([1, 1, 1, 2, 2], [1, 1, 1, 1, 2])
# return: [3 0; 1 1]
PosDefManifoldML.predictAcc — Function

(1)
function predictAcc(yTrue::IntVector, yPred::IntVector;
scoring :: Symbol = :b,
digits :: Int=3)
(2)
function predictAcc(CM::Union{Matrix{R}, Matrix{S}};
scoring :: Symbol = :b,
digits :: Int=3) where {R<:Real, S<:Int}
Return the prediction accuracy as a proportion, that is, $\in [0, 1]$, given
- (1) the integer vectors of true labels yTrue and of predicted labels yPred, or
- (2) a confusion matrix.
The confusion matrix may hold integers, in which case it is interpreted as expressing frequencies (counts), or real numbers, in which case it is interpreted as expressing proportions.
If scoring=:b (default) the balanced accuracy is computed. Any other value makes the function return the regular accuracy. Balanced accuracy is to be preferred for unbalanced classes. For balanced classes the balanced accuracy reduces to the regular accuracy, therefore there is no point in using the regular accuracy, if not to avoid a few unnecessary computations when the classes are balanced.
The accuracy is rounded to the number of decimal digits given by the optional keyword argument digits, 3 by default.
Maths
The regular accuracy is given by the sum of the diagonal elements of the confusion matrix expressing proportions.
For the balanced accuracy, the diagonal elements of the confusion matrix are divided by the respective row sums and their mean is taken.
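As a hedged sketch of these two formulas (illustration only, not the package implementation), starting from a confusion matrix of counts:

using LinearAlgebra   # for diag

CM = [3 0; 1 1]       # confusion matrix from the confusionMat example above

regular  = sum(diag(CM)) / sum(CM)                          # 4/5 = 0.8
balanced = sum(diag(CM) ./ sum(CM, dims=2)) / size(CM, 1)   # mean of 1.0 and 0.5 = 0.75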
See predict, predictErr, confusionMat
Examples
using PosDefManifoldML
predictAcc([1, 1, 1, 2, 2], [1, 1, 1, 1, 2]; scoring=:a)
# regular accuracy, return: 0.8
predictAcc([1, 1, 1, 2, 2], [1, 1, 1, 1, 2])
# balanced accuracy, return: 0.75
PosDefManifoldML.predictErr — Function

(1)
function predictErr(yTrue::IntVector, yPred::IntVector;
scoring :: Symbol = :b,
digits :: Int=3)
(2)
function predictErr(CM::Union{Matrix{R}, Matrix{S}};
scoring :: Symbol = :b,
digits :: Int=3) where {R<:Real, S<:Int}
Return the complement of the prediction accuracy, that is, 1.0 minus the result of predictAcc, given
- (1) the integer vectors of true labels yTrue and of predicted labels yPred, or
- (2) a confusion matrix.
See predictAcc, confusionMat
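For illustration, the complement relation above implies results like the following (the values follow directly from the predictAcc examples):

using PosDefManifoldML

predictErr([1, 1, 1, 2, 2], [1, 1, 1, 1, 2])               # balanced error: 1 - 0.75 = 0.25
predictErr([1, 1, 1, 2, 2], [1, 1, 1, 1, 2]; scoring=:a)   # regular error:  1 - 0.8  = 0.2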
PosDefManifoldML.binaryloss — Function

function binaryloss(yTrue::IntVector, yPred::IntVector)
Binary error loss given a vector of true labels yTrue and a vector of predicted labels yPred. These two vectors must have the same size. The error loss is 1 if the corresponding labels are different, zero otherwise. Return a BitVector, that is, a vector of booleans.
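In other words, the loss is just element-wise label disagreement; a minimal standalone sketch (hypothetical helper, not the package implementation):

# hypothetical equivalent of the binary error loss described above
binary_loss_sketch(yTrue::Vector{Int}, yPred::Vector{Int}) = yTrue .!= yPred   # BitVector

binary_loss_sketch([1, 2, 1, 2], [1, 1, 1, 2])   # Bool[0, 1, 0, 0]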
See predict.
Examples
using PosDefManifoldML, Random

# generate random labels for two classes (the data matrices are not needed here)
dummy1, dummy2, yTr, yPr = gen2ClassData(2, 10, 10, 10, 10, 0.1);
shuffle!(yPr)   # scramble the predicted labels

# show true labels, predicted labels and the binary loss side by side
[yTr yPr binaryloss(yTr, yPr)]