PermutationTests.jl


Installation

Execute the following command in Julia's REPL:

]add PermutationTests

Requirements

Julia: version ≥ 1.10


Dependencies

Standard Julia packages: Random, Statistics, Test
External packages: Combinatorics, Folds

Quick start

Here are some quick examples to show how PermutationTests.jl works:


Univariate test for correlation

Given two vectors of $N$ observations, $x$ and $y$, test the null hypothesis

$H_0: r_{(x,y)}=0$,

where $r_{(x,y)}$ is the Pearson product-moment correlation between $x$ and $y$.
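
Written out, the statistic under test is

$r_{(x,y)} = \sum_{i=1}^N (x_i-\bar{x})(y_i-\bar{y}) \Big/ \sqrt{\sum_{i=1}^N (x_i-\bar{x})^2 \sum_{i=1}^N (y_i-\bar{y})^2}$,

where $\bar{x}$ and $\bar{y}$ denote the sample means.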

First, we choose a Type I error rate $α$, typically fixed at Ronald Fisher's classical 0.05 level.

using PermutationTests
N=10 # number of observations
x, y = randn(N), randn(N) # some random Gaussian data for example
t = rTest(x, y)

⚃ Univariate Permutation test
p-value = 0.756
.p (p-value)
.stat (test statistic)
.obsstat (observed statistic)
.minp (minimum attainable p-value)
.nperm (number of permutations)
.testtype (exact or approximated)
.direction (Both, Left or Right)
.design (Balanced or Unbalanced)

We reject the null hypothesis if the p-value is less than or equal to $α$; otherwise we suspend judgement.

The result of the test is a structure whose fields are printed in yellow. For example:

t.p # p-value
t.nperm # number of permutations used by the test

Multiple comparison test for correlation

Given a vector of $N$ observations $x$ and $M$ vectors of $N$ observations $y_m$, test simultaneously the $M$ null hypotheses

$H_0(m):r_{(x, y_m)}=0, m=1...M$,

where $r_{(x,y_m)}$ is the Pearson product-moment correlation between $x$ and the $m^{th}$ vector $y_m$.

First, we fix the family-wise error (FWE) rate we are willing to tolerate, say, at level 0.05.

using PermutationTests
N=10 # number of observations
M=100 # number of hypotheses
x, Y = randn(N), [randn(N) for m=1:M] # random Gaussian data
t = rMcTest(x, Y) # by default the FWE is 0.05

⚅ ⚂ ⚄ ... Multiple Comparison Permutation test
Rejected 0 out of 100 hypotheses (0.0 %) in 1 step with FWE=0.05
.p (p-values)
.stat (test statistic)
.obsstat (observed statistics)
.minp (minimum attainable p-value)
.nperm (number of permutations)
.testtype (exact or approximated)
.direction (Both, Left or Right)
.design (Balanced or Unbalanced)
.nulldistr (null distribution)
.rejections (for each step down)
.stepdown (true if stepdown)
.fwe (family-wise error, stepdown)

For each of the $M$ hypotheses, we reject the null hypothesis if its p-value is smaller than the nominal level (0.05); otherwise we suspend judgement.

The result of the test is a structure whose fields are printed in yellow. For example:

t.p # p-values
t.obsstat # observed test statistics

Preamble

If you have no idea what a statistical hypothesis test is, most probably you are on the wrong web page.

If you do, but have no idea what a permutation test is, first check my introduction to permutation tests.

If you need help to establish what hypothesis test is appropriate for your data, check this page.

Permutation tests offer a different way to obtain the p-value usually given by well-known parametric tests, such as the t-test for independent samples, the ANOVA for repeated measures, etc. In contrast to parametric tests, permutation tests may provide the exact p-value, not an approximate value based on probability distribution theory, and they rest on less stringent assumptions.

Moreover, with permutation tests one can use any test statistic, not just those with a known distribution under the null hypothesis, since that distribution is obtained by data permutation.
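
To make this concrete, here is a minimal sketch of an approximate (Monte Carlo) permutation test for a correlation, written in plain Julia. Note that `perm_pvalue` is a hypothetical helper name for illustration only, not part of the package's API, and the package's optimized machinery works differently (see Overview):

```julia
using Random, Statistics

# Build the null distribution by data permutation: shuffle one variable,
# recompute the statistic, and count how often it is at least as extreme
# as the observed one (a Monte Carlo sketch, not the package's code).
function perm_pvalue(x, y; nperm=10_000, rng=MersenneTwister(1))
    obs = abs(cor(x, y))            # observed statistic (two-sided test)
    count = 1                       # the observed data counts as one permutation
    for _ in 1:nperm
        count += abs(cor(shuffle(rng, x), y)) >= obs
    end
    return count / (nperm + 1)      # Monte Carlo p-value
end

x = collect(1.0:10.0)
y = 2 .* x .+ 1                     # perfectly correlated data
p = perm_pvalue(x, y)               # a very small p-value
```

Adding 1 to both the count and the denominator makes the observed data itself one of the permutations, which keeps the p-value valid and strictly positive.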

Permutation tests were introduced by none other than R.A. Fisher and E.J.G. Pitman in the late 1930s (see the references), but they became feasible only with the advent of modern computers.

When multiple hypotheses are tested simultaneously, the multiple comparisons problem arises: statistical tests performed on each hypothesis separately can no longer control the probability of falsely rejecting the null hypothesis (Type I error). Using the extremum-statistic (also known as min-p) union-intersection test (see Winkler et al., 2016), permutation tests control the probability of committing one or more Type I errors over the total number of hypotheses tested, regardless of the configuration of true and false hypotheses; that is, they control the family-wise error (FWE) rate in the strong sense (see Westfall and Young, 1993).
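
The extremum-statistic principle can be sketched in plain Julia as follows: for each data permutation, take the maximum of the statistic over all $M$ hypotheses, and compare each observed statistic to that single null distribution. Here `maxstat_pvalues` is a hypothetical illustrative name, not the package's API:

```julia
using Random, Statistics

# Max-statistic (min-p) union-intersection sketch for M correlation
# hypotheses sharing the same x: one null distribution of extrema
# controls the FWE for all hypotheses at once.
function maxstat_pvalues(x, Y; nperm=2_000, rng=MersenneTwister(1))
    obs = [abs(cor(x, y)) for y in Y]      # observed statistics
    counts = ones(Int, length(Y))          # observed data counts as one permutation
    for _ in 1:nperm
        xp = shuffle(rng, x)               # one random data permutation
        m = maximum(abs(cor(xp, y)) for y in Y)  # extremum over all hypotheses
        counts .+= (m .>= obs)
    end
    return counts ./ (nperm + 1)           # FWE-adjusted p-values
end

x = randn(MersenneTwister(2), 20)
Y = [x .+ 0.1 .* randn(MersenneTwister(3), 20)]       # one truly related vector
append!(Y, [randn(MersenneTwister(3 + m), 20) for m in 1:9])  # nine null vectors
p = maxstat_pvalues(x, Y)                  # p[1] is tiny
```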

While other multiple comparisons procedures controlling the FWE exist, such as the well-known Bonferroni or Holm-Bonferroni corrections, they are conservative when the hypotheses are correlated. Permutation tests, instead, naturally adapt to any degree and form of correlation among the hypotheses, and thus are more powerful when the hypotheses are not independent (see power).

Thanks to these characteristics, permutation tests constitute the ideal framework for large-scale, possibly correlated, statistical hypothesis testing. This is the case, for instance, in genomics (gene expression levels) and neuroimaging (brain voxel activations), where up to hundreds of thousands of hypotheses, often largely correlated, must be tested simultaneously.


Overview

PermutationTests.jl implements several univariate permutation tests and, for each of them, the corresponding multiple comparisons permutation test based on the min-p union-intersection principle, with or without the step-down procedure (Holmes et al., 1996; Westfall and Young, 1993).
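
The step-down idea can be sketched in plain Julia: after rejecting the hypothesis with the most extreme statistic, the max-statistic null distribution is recomputed over the surviving hypotheses only, which can only increase power. The function name, inputs, and stopping rule below are a simplified illustration under assumed inputs, not the package's internal implementation:

```julia
# Step-down max-statistic procedure (sketch).
# obsstat: M observed statistics; nullstats: nperm × M matrix holding the
# statistic of each hypothesis for each data permutation (one row per permutation).
function stepdown_rejections(obsstat, nullstats; fwe=0.05)
    surviving = collect(1:length(obsstat))
    rejected = Int[]
    while !isempty(surviving)
        maxdistr = vec(maximum(nullstats[:, surviving], dims=2))  # extrema over survivors
        i = surviving[argmax(obsstat[surviving])]                 # most extreme survivor
        p = (1 + count(>=(obsstat[i]), maxdistr)) / (1 + length(maxdistr))
        p <= fwe || break                                         # stop at the first failure
        push!(rejected, i)
        surviving = setdiff(surviving, [i])
    end
    return rejected
end

# A toy check: a flat null distribution at 1.0 rejects only the hypothesis
# whose observed statistic clearly exceeds it.
stepdown_rejections([5.0, 0.1, 0.2], fill(1.0, 100, 3))   # → [1]
```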

For multiple comparisons tests, only the case in which all hypotheses are homogeneous is considered (same test statistic and same number of observations, divided into the same number of groups/measurements).

Here is the list of available tests:

Univariate and Multiple Comparisons Permutation Tests
Pearson product-moment correlation
Trend correlation (fit of any kind of regression)
Point bi-serial correlation*
Student's t for independent samples
1-way ANOVA for independent samples
Χ² for 2×K contingency tables*
Fisher exact test* (2×2 contingency tables)
Student's t for repeated-measures
1-way ANOVA for repeated-measures
Cochran Q*
McNemar*
One-sample Student's t
Sign*
* for dichotomous data

You may also find useful the tests we have created as examples of how to create new tests.

When the number of systematic permutations is high, PermutationTests.jl computes approximate (Monte Carlo) p-values. All test functions switch automatically from systematic to Monte Carlo permutations, but the user can force either permutation listing procedure.
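
The switching logic can be illustrated with a simple count of the systematic permutations; the threshold below is an arbitrary assumption for illustration, not the package's actual default:

```julia
# For a correlation test, the systematic scheme lists all N! permutations
# of one vector; when that count explodes, Monte Carlo sampling takes over.
nsystematic(N) = factorial(big(N))   # BigInt: N! overflows Int64 already at N = 21

threshold = 100_000                  # hypothetical switching threshold
use_montecarlo(N) = nsystematic(N) > threshold

use_montecarlo(5)    # false: 120 permutations, list them all
use_montecarlo(10)   # true: 3_628_800 permutations, sample instead
```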

For all implemented univariate tests, PermutationTests.jl always employs the most computationally efficient equivalent test statistic and only the required (non-redundant) systematic permutations, yielding very efficient permutation tests (see Benchmarks).
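
As an illustration of the equivalent-statistic idea: for a correlation test, the means and standard deviations of $x$ and $y$ are invariant under permutation, so the cheap cross-product $\sum_i x_i y_i$ ranks the permutations exactly as Pearson's $r$ does and therefore yields the same p-value. A quick check in plain Julia (the specific statistic the package picks for each test is documented elsewhere; this is only the general principle):

```julia
using Random, Statistics

# Pearson's r is an increasing affine function of sum(x .* y) once the
# permutation-invariant means and variances are fixed, so both statistics
# order the permutations identically.
x = randn(MersenneTwister(1), 8)
y = randn(MersenneTwister(2), 8)
perms = [shuffle(MersenneTwister(k), y) for k in 1:50]   # 50 random permutations of y
r  = [cor(x, yp) for yp in perms]                        # full Pearson correlation
cp = [sum(x .* yp) for yp in perms]                      # cheap cross-product
r_order  = sortperm(r)
cp_order = sortperm(cp)   # identical ordering → identical permutation p-values
```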