neurom.stats

Statistical analysis helper functions.

Nothing fancy. Just commonly used functions using scipy functionality.

Functions

compare_two

Compares two distributions of data.

fit

Calculate the parameters of a fit of a distribution to a data set.

fit_results_to_dict

Create a JSON-compatible dict from a FitResults object.

get_test

Returns the correct stat test.

optimal_distribution

Fit multiple distributions to a data set and return the fit with the minimal ks-distance.

scalar_stats

Calculate the stats from the given numpy functions.

total_score

Calculates the p-norm of the distances.

Classes

FitResults

StatTests

Enum representing valid statistical tests of scipy.

class neurom.stats.FitResults(params, errs, type)

Bases: tuple
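Since FitResults derives from tuple and takes params, errs and type, its contents can presumably be read by field name or by position. The sketch below uses a stand-in namedtuple with the same three fields instead of importing neurom. The numeric values are made up for illustration, and treating errs as a (distance, p-value) pair is an assumption based on the notes for fit.

```python
from collections import namedtuple

# Stand-in with the same field names as the documented class (assumption:
# the real FitResults behaves like a named tuple with these three fields).
FitResults = namedtuple('FitResults', ['params', 'errs', 'type'])

# Illustrative values only, not the output of a real fit.
result = FitResults(params=(0.0, 1.0), errs=(0.05, 0.9), type='norm')

# Fields can be read by name or unpacked positionally.
params, errs, dist_type = result
```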

class neurom.stats.StatTests(value)

Bases: enum.Enum

Enum representing valid statistical tests of scipy.

neurom.stats.compare_two(data1, data2, test=<StatTests.ks: 1>)

Compares two distributions of data and assesses two scores: a distance between them and a probability that they are drawn from the same distribution.

Parameters
  • data1 – numpy array of dataset 1

  • data2 – numpy array of dataset 2

  • test (StatTests) – statistical test to use, chosen from those available in scipy. Accepted tests: ks_2samp, wilcoxon, ttest

Returns
  • dist (float) – high values indicate high dissimilarity between the two datasets.

  • p-value (float) – small values indicate a low probability that the two datasets are drawn from the same distribution.
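For the default ks test, the distance is the two-sample Kolmogorov-Smirnov statistic: the largest gap between the empirical CDFs of the two samples. The following is a standard-library sketch of that statistic only, not neurom's implementation. ks_distance is a hypothetical helper name, and the p-value is not computed here.

```python
from bisect import bisect_right

def ks_distance(data1, data2):
    """Largest gap between the empirical CDFs of two samples."""
    s1, s2 = sorted(data1), sorted(data2)
    n1, n2 = len(s1), len(s2)
    gap = 0.0
    # The gap can only change at observed values, so checking those suffices.
    for x in s1 + s2:
        cdf1 = bisect_right(s1, x) / n1  # fraction of sample 1 that is <= x
        cdf2 = bisect_right(s2, x) / n2  # fraction of sample 2 that is <= x
        gap = max(gap, abs(cdf1 - cdf2))
    return gap

same = ks_distance([1, 2, 3, 4], [1, 2, 3, 4])      # identical samples
disjoint = ks_distance([1, 2, 3], [10, 11, 12])     # no overlap at all
shifted = ks_distance([1, 2, 3, 4], [3, 4, 5, 6])   # partial overlap
```

Identical samples give a distance of 0 and fully separated samples give 1, which matches the reading of dist above.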

neurom.stats.fit(data, distribution='norm')

Calculate the parameters of a fit of a distribution to a data set.

Parameters
  • data – array of data points to be fitted

  • distribution (str) – type of distribution to fit. Default ‘norm’.

Returns

FitResults object with fitted parameters, errors and distribution type

Note

Uses Kolmogorov-Smirnov test to estimate distance and p-value.
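To illustrate what a fit plus a Kolmogorov-Smirnov distance involves for the default ‘norm’ case, here is a standard-library sketch. It is not neurom's code: fit_normal is a hypothetical helper, the parameters are the maximum-likelihood estimates for a normal distribution, and only the distance (not the p-value) is computed.

```python
from statistics import NormalDist, fmean, pstdev

def fit_normal(data):
    """Return (mu, sigma) and the KS distance of the data to that normal."""
    mu, sigma = fmean(data), pstdev(data)  # maximum-likelihood estimates
    cdf = NormalDist(mu, sigma).cdf
    xs = sorted(data)
    n = len(xs)
    # One-sample KS statistic: the empirical CDF jumps from (i-1)/n to i/n
    # at the i-th sorted point, so both sides of each jump are compared.
    distance = max(
        max(i / n - cdf(x), cdf(x) - (i - 1) / n)
        for i, x in enumerate(xs, start=1)
    )
    return (mu, sigma), distance

params, distance = fit_normal([2.1, 1.9, 2.4, 2.0, 1.6, 2.3, 1.8, 2.2])
```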

neurom.stats.fit_results_to_dict(fit_results, min_bound=None, max_bound=None)

Create a JSON-compatible dict from a FitResults object.

Parameters
  • fit_results (FitResults) – object containing fit parameters, errors and type

  • min_bound – optional min value to add to dictionary if min isn’t a fit parameter.

  • max_bound – optional max value to add to dictionary if max isn’t a fit parameter.

Returns

JSON-compatible dictionary with fit results

Note

Supported fit types: ‘norm’, ‘expon’, ‘uniform’

neurom.stats.get_test(stest)

Returns the correct stat test.
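One plausible reading is that each StatTests member is mapped to a scipy.stats test. The sketch below shows that kind of enum lookup with hypothetical names (DemoTests, demo_get_test). Only ks = 1 is confirmed by the signatures on this page; the other member values, the ttest_ind mapping, and returning a function name rather than the function itself are all assumptions.

```python
from enum import Enum

class DemoTests(Enum):
    """Hypothetical stand-in for StatTests; only `ks = 1` is confirmed."""
    ks = 1
    wilcoxon = 2
    ttest = 3

# Assumed mapping from enum member to the name of a scipy.stats function.
_TEST_NAMES = {
    DemoTests.ks: 'ks_2samp',
    DemoTests.wilcoxon: 'wilcoxon',
    DemoTests.ttest: 'ttest_ind',
}

def demo_get_test(stest):
    """Return the function name for a member, rejecting anything else."""
    if not isinstance(stest, DemoTests):
        raise TypeError(f'{stest!r} is not a DemoTests member')
    return _TEST_NAMES[stest]

name = demo_get_test(DemoTests.ks)
```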

neurom.stats.optimal_distribution(data, distr_to_check=('norm', 'expon', 'uniform'))

Fit multiple distributions to a data set and return the fit with the minimal ks-distance.

Parameters
  • data – array of data points to be fitted

  • distr_to_check (tuple) – distributions to be checked

Returns

FitResults object with fitted parameters, errors and distribution type of the fit with the smallest fit distance

Note

Uses Kolmogorov-Smirnov test to estimate distance and p-value.
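The selection step can be sketched with the standard library alone: fit each candidate, compute its KS distance, and keep the smallest. This is an illustration under stated assumptions, not neurom's implementation. FitResults is a stand-in namedtuple, best_fit and the other helpers are hypothetical, the fits use simple closed-form estimates, and errs holds only the distance (no p-value).

```python
from collections import namedtuple
from math import exp
from statistics import NormalDist, fmean, pstdev

FitResults = namedtuple('FitResults', ['params', 'errs', 'type'])  # stand-in

def ks_distance(xs, cdf):
    """One-sample KS statistic of sorted points xs against a CDF."""
    n = len(xs)
    return max(max(i / n - cdf(x), cdf(x) - (i - 1) / n)
               for i, x in enumerate(xs, start=1))

def candidate_fits(xs):
    """Fit each family with closed-form estimates; xs must be sorted."""
    lo, hi, mean = xs[0], xs[-1], fmean(xs)
    sigma = pstdev(xs)
    return {
        'norm': ((mean, sigma), NormalDist(mean, sigma).cdf),
        'expon': ((lo, mean - lo),
                  lambda x: 1.0 - exp(-(x - lo) / (mean - lo))),
        'uniform': ((lo, hi - lo), lambda x: (x - lo) / (hi - lo)),
    }

def best_fit(data, distr_to_check=('norm', 'expon', 'uniform')):
    xs = sorted(data)
    fits = candidate_fits(xs)
    results = [FitResults(fits[d][0], (ks_distance(xs, fits[d][1]),), d)
               for d in distr_to_check]
    # Keep the candidate whose KS distance (first entry of errs) is smallest.
    return min(results, key=lambda r: r.errs[0])

best = best_fit([i / 100 for i in range(101)])  # evenly spaced points
```

For evenly spaced points the uniform candidate has by far the smallest distance, so it is the one returned.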

neurom.stats.scalar_stats(data, functions=('min', 'max', 'mean', 'std'))

Calculate the stats from the given numpy functions.

Parameters
  • data – array of data points to be used for the stats

  • functions (tuple) – numpy stat functions, given by name, to apply to data

Returns

Dictionary with the name of the function as key and the result as the respective value
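The shape of the returned dictionary can be shown with a standard-library sketch. scalar_stats_sketch is a hypothetical name, and the functions are stand-ins for the numpy functions of the same names so that the example runs without numpy.

```python
from statistics import fmean, pstdev

# Stand-ins for the NumPy functions of the same names. pstdev is the
# population standard deviation, which is what NumPy's std computes by default.
_FUNCTIONS = {'min': min, 'max': max, 'mean': fmean, 'std': pstdev}

def scalar_stats_sketch(data, functions=('min', 'max', 'mean', 'std')):
    """Map each requested function name to its value on data."""
    return {name: _FUNCTIONS[name](data) for name in functions}

stats = scalar_stats_sketch([1.0, 2.0, 3.0, 4.0])
```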

neurom.stats.total_score(paired_dats, p=2, test=<StatTests.ks: 1>)

Calculates the p-norm of the distances obtained by applying the statistical test to each of the paired datasets.

Parameters
  • paired_dats – a list of tuples, where each tuple contains the paired data lists from two datasets

  • p (int) – order of the p-norm

  • test (StatTests) – statistical test to use, chosen from those available in scipy. Accepted tests: ks_2samp, wilcoxon, ttest

Returns

A float corresponding to the p-norm of the calculated distances. 0 corresponds to high similarity; larger values correspond to lower similarity (for a single pair under the KS test, the maximum is 1).
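The p-norm itself is a short formula, (sum of |d|^p)^(1/p), sketched below in plain Python. p_norm is a hypothetical helper and the distances are illustrative values, not results of a real test.

```python
def p_norm(distances, p=2):
    """p-norm of a sequence of distances: (sum |d|^p)^(1/p)."""
    return sum(abs(d) ** p for d in distances) ** (1.0 / p)

score = p_norm([0.3, 0.4])        # Euclidean norm of two distances
worst_case = p_norm([1.0, 1.0])   # two pairs, each at distance 1
```

With several pairs the score can exceed 1: two pairs at distance 1 give the square root of 2 for p=2.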