stats_descriptive.jl

This unit implements descriptive statistics for cross-validation analysis.

Content

function        description
confusionMat    Confusion matrix given true and predicted labels
predictAcc      Prediction accuracy given true and predicted labels or a confusion matrix
predictErr      Prediction error given true and predicted labels or a confusion matrix
binaryloss      Binary error loss

See also stats_inferential.jl

PosDefManifoldML.confusionMat - Function
function confusionMat(yTrue::IntVector, yPred::IntVector)

Return the confusion matrix expressing frequencies (counts), given the integer vector of true labels yTrue and the integer vector of predicted labels yPred.

The length of yTrue and yPred must be equal. Furthermore, the yTrue vector must comprise all natural numbers between 1 and z, where z is the number of classes.

The confusion matrix will have size z·z. It is computed starting from a matrix filled everywhere with zeros and adding, for each observation, 1 at entry [i, j] of the matrix, where i is the true label and j the predicted label. Thus, the first row reports how the observations truly belonging to class 1 have been predicted, the second row how the observations truly belonging to class 2 have been predicted, etc.

The returned matrix is a matrix of integers.
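
As a minimal sketch of this counting procedure (confusionSketch is a hypothetical helper name used for illustration, not the package implementation):

function confusionSketch(yTrue, yPred, z)
	CM = zeros(Int, z, z)        # start from a z·z matrix of zeros
	for (i, j) in zip(yTrue, yPred)
		CM[i, j] += 1            # row i = true label, column j = predicted label
	end
	return CM
end
confusionSketch([1, 1, 1, 2, 2], [1, 1, 1, 1, 2], 2)
# return: [3 0; 1 1]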

See predict, predictAcc, predictErr

Examples

using PosDefManifoldML
confusionMat([1, 1, 1, 2, 2], [1, 1, 1, 1, 2])
# return: [3 0; 1 1]
PosDefManifoldML.predictAcc - Function
(1)
function predictAcc(yTrue::IntVector, yPred::IntVector;
		scoring	:: Symbol = :b,
		digits	:: Int=3)

(2)
function predictAcc(CM::Union{Matrix{R}, Matrix{S}};
		scoring	:: Symbol = :b,
		digits	:: Int=3) where {R<:Real, S<:Int}

Return the prediction accuracy as a proportion, that is, ∈ [0, 1], given

  • (1) the integer vectors of true labels yTrue and of predicted labels yPred, or
  • (2) a confusion matrix.

The confusion matrix may hold integers, in which case it is interpreted as expressing frequencies (counts), or real numbers, in which case it is interpreted as expressing proportions.

If scoring=:b (default), the balanced accuracy is computed; any other value makes the function return the regular accuracy. Balanced accuracy is to be preferred for unbalanced classes. For balanced classes the balanced accuracy reduces to the regular accuracy, so the only reason to use the regular accuracy is to avoid a few unnecessary computations when the classes are balanced.

The accuracy is rounded to the number of digits specified by the optional keyword argument digits, 3 by default.

Maths

The regular accuracy is given by the sum of the diagonal elements of the confusion matrix expressing proportions.

For the balanced accuracy, the diagonal elements of the confusion matrix are divided by the respective row sums and their mean is taken.
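
As a minimal sketch of these two formulas (hypothetical helper names, assuming CM is a square confusion matrix of counts; this is not the package implementation):

using LinearAlgebra
# sum of the diagonal of the matrix expressing proportions
regularAccSketch(CM) = sum(diag(CM)) / sum(CM)
# mean of the diagonal elements divided by the respective row sums
balancedAccSketch(CM) = sum(diag(CM) ./ vec(sum(CM, dims=2))) / size(CM, 1)
regularAccSketch([3 0; 1 1])   # 0.8
balancedAccSketch([3 0; 1 1])  # 0.75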

See predict, predictErr, confusionMat

Examples

using PosDefManifoldML
predictAcc([1, 1, 1, 2, 2], [1, 1, 1, 1, 2]; scoring=:a)
# regular accuracy, return: 0.8
predictAcc([1, 1, 1, 2, 2], [1, 1, 1, 1, 2])
# balanced accuracy, return: 0.75
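
Method (2), passing a confusion matrix directly, works the same way; with the matrix of counts corresponding to the labels above, the balanced-accuracy formula gives the same value:

predictAcc([3 0; 1 1])
# balanced accuracy from a confusion matrix of counts, return: 0.75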
PosDefManifoldML.predictErr - Function
(1)
function predictErr(yTrue::IntVector, yPred::IntVector;
		scoring	:: Symbol = :b,
		digits	:: Int=3)

(2)
function predictErr(CM::Union{Matrix{R}, Matrix{S}};
		scoring	:: Symbol = :b,
		digits	:: Int=3) where {R<:Real, S<:Int}

Return the complement of the prediction accuracy, that is, 1.0 minus the result of predictAcc, given

  • (1) the integer vectors of true labels yTrue and of predicted labels yPred, or
  • (2) a confusion matrix.

See predictAcc, confusionMat
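
Examples

Following the predictAcc example above, the returned value is its complement:

using PosDefManifoldML
predictErr([1, 1, 1, 2, 2], [1, 1, 1, 1, 2])
# balanced error, return: 0.25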

PosDefManifoldML.binaryloss - Function
function binaryloss(yTrue::IntVector, yPred::IntVector)

Return the binary error loss given a vector of true labels yTrue and a vector of predicted labels yPred. The two vectors must have the same length. For each element, the loss is 1 if the corresponding labels differ and 0 otherwise. The result is returned as a BitVector, that is, a vector of booleans.

See predict.

Examples

using PosDefManifoldML, Random
dummy1, dummy2, yTr, yPr=gen2ClassData(2, 10, 10, 10, 10, 0.1);
shuffle!(yPr)
[yTr yPr binaryloss(yTr, yPr)]
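
Since the example above uses shuffled (random) labels, a deterministic check may help; by the definition above, binaryloss should give the same result as the element-wise comparison of the two vectors:

yTr2 = [1, 1, 2, 2]
yPr2 = [1, 2, 2, 1]
binaryloss(yTr2, yPr2)
# return: a BitVector with elements 0, 1, 0, 1
yTr2 .!= yPr2
# element-wise comparison, giving the same result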