Utils

This module contains various utility functions used elsewhere in the library.

binomial_ci(count, nobs, alpha=0.05)

Confidence interval for binomial proportions using the normal approximation.

Parameters:

count (ndarray) – Number of successes of shape (X,).
nobs (ndarray) – Number of trials of shape (X,).
alpha (float) – Significance level. In range (0, 1).

Returns:

np.ndarray of shape (X, 2). Lower and upper limits of confidence interval with coverage 1-alpha.
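
As an illustration of the normal approximation, the following is a minimal sketch of the standard Wald interval, p ± z·sqrt(p(1−p)/n). The helper name wald_interval is hypothetical and this is not the library's own code; in particular, the sketch does not clip the limits to [0, 1], and whether binomial_ci does so is not stated here.

```python
from statistics import NormalDist

import numpy as np


def wald_interval(count, nobs, alpha=0.05):
    """Hypothetical helper: normal-approximation (Wald) interval per entry."""
    count = np.asarray(count, dtype=float)
    nobs = np.asarray(nobs, dtype=float)
    p_hat = count / nobs                      # observed proportions, shape (X,)
    z = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided standard normal quantile
    half_width = z * np.sqrt(p_hat * (1 - p_hat) / nobs)
    # Stack lower and upper limits into an array of shape (X, 2).
    return np.stack([p_hat - half_width, p_hat + half_width], axis=-1)


# 45 successes in 100 trials and 8 successes in 10 trials.
ci = wald_interval(np.array([45, 8]), np.array([100, 10]))
print(ci)
```

With few trials and a proportion near 0 or 1, as in the second entry, the unclipped normal approximation can extend outside [0, 1].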

bootstrap_ci(theta, theta_hat=None, alpha=0.05, *, method='quantile')

Calculates a bootstrap confidence interval with approximate coverage 1-alpha from the bootstrap estimates theta. We assume that N bootstrap estimates, theta, of the quantity of interest have been computed, and that theta_hat is the empirical estimate of that quantity.

This function then constructs a confidence interval from the bootstrap estimates using the technique selected by the method argument: the plain quantile method (the default), bias correction, or bias correction with acceleration.

Parameters:

theta (ndarray) – Array of shape (N, Y), where N is the number of samples.
theta_hat (Optional[Union[float, ndarray]]) – Array of shape (Y,) with the empirical estimate of the metric. This is only needed for the methods “bc” and “bca”.
alpha (Optional[Union[float, ndarray]]) – Significance level. In range (0, 1). Vector-valued alpha only supported with “quantile” method.
method (str) –
Method to compute the CI from the bootstrap samples. Possible values are:
- “quantile” uses the alpha/2 and 1-alpha/2 quantiles of the empirical metric distribution.
- “bc” applies bias correction to correct for the bias of the median of the empirical distribution.
- “bca” applies bias correction and acceleration to correct for non-constant standard error.
See Ch. 11 of Computer Age Statistical Inference by Efron and Hastie for details.

Returns:

An array of shape (Y, 2) with the lower and upper bounds of the CI. For vector-valued alpha of shape (Z,), the return value has shape (Y, Z, 2).

Return type:

ndarray
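
The “quantile” and “bc” methods can be sketched as follows. This is an illustrative re-implementation under stated assumptions, not the library's code: the helper names quantile_ci and bc_ci_1d are hypothetical, the bias-corrected variant handles a single metric only, and it follows the textbook formula in which the quantile levels are shifted by twice the normal quantile of the share of bootstrap estimates below theta_hat.

```python
from statistics import NormalDist

import numpy as np


def quantile_ci(theta, alpha=0.05):
    """Hypothetical helper: percentile interval for each of the Y metrics."""
    theta = np.asarray(theta, dtype=float)    # shape (N, Y)
    levels = [alpha / 2, 1 - alpha / 2]
    # np.quantile returns shape (2, Y); transpose to (Y, 2).
    return np.quantile(theta, levels, axis=0).T


def bc_ci_1d(theta, theta_hat, alpha=0.05):
    """Hypothetical helper: bias-corrected interval for one metric, theta of shape (N,)."""
    theta = np.asarray(theta, dtype=float)
    norm = NormalDist()
    # Share of bootstrap estimates below the empirical estimate.
    # It must lie strictly between 0 and 1 for inv_cdf to be defined.
    p0 = float(np.mean(theta < theta_hat))
    z0 = norm.inv_cdf(p0)                     # bias-correction constant
    z_lo = norm.inv_cdf(alpha / 2)
    z_hi = norm.inv_cdf(1 - alpha / 2)
    # Shifted quantile levels; they reduce to alpha/2 and 1-alpha/2 when z0 == 0.
    levels = [norm.cdf(2 * z0 + z_lo), norm.cdf(2 * z0 + z_hi)]
    return np.quantile(theta, levels)


# Synthetic data: 200 observations of Y = 3 metrics, N = 1000 bootstrap resamples.
rng = np.random.default_rng(0)
sample = rng.normal(loc=1.0, scale=2.0, size=(200, 3))
idx = rng.integers(0, 200, size=(1000, 200))
theta = sample[idx].mean(axis=1)              # bootstrap means, shape (1000, 3)
theta_hat = sample.mean(axis=0)               # empirical means, shape (3,)

ci = quantile_ci(theta)                       # shape (3, 2)
bc = bc_ci_1d(theta[:, 0], theta_hat[0])      # shape (2,)
```

When exactly half of the bootstrap estimates fall below theta_hat, the correction constant is zero and the bias-corrected interval coincides with the plain quantile interval.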

invert_pl_function(x, y, t)

Inverts a piecewise linear function.

The piecewise linear function \(f(x_i) = y_i\) is defined by the sequence of points \((x_i, y_i)\). We assume that \(x\) is an increasing vector, i.e., \(x_i \leq x_{i+1}\); the vector \(x\) does not have to be strictly increasing, but if \(x_i = x_{i+1}\), then we also assume that \(y_i = y_{i+1}\).

For a given target value \(t_j\), the function finds all values \(s_{j, k}\), such that \(f(s_{j, k}) = t_j\). Because the number of solutions can vary for different \(t_j\), the return type for \(s\) is a list of the same length as \(t\) with each entry an array.

If the equation \(f(s) = t_j\) has no solution, we return the closest point \(s_j\), i.e.,

\[\|f(s_j) - t_j\| = \min_z \|f(z) - t_j\|\,.\]

We always return only one solution in this case.

Parameters:

x (ndarray) – Increasing vector of points defining the function.
y (ndarray) – Vector of same length as x defining the function values.
t (ndarray) – Vector of points at which to invert the function.

Returns:

A list \(s\) of the same length as \(t\) of arrays such that for all \(j\), the array \(s_j\) is strictly increasing and contains all solutions of the equation \(f(s) = t_j\).

Return type:

List[ndarray]
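
The behaviour described above can be sketched as follows. This is a hypothetical re-implementation for illustration only (the name invert_pl_sketch is not part of the library). Two choices here are assumptions: for a flat segment lying exactly at the target level only its endpoints are recorded, and when several breakpoints are equally close to an unreachable target the first one is returned.

```python
import numpy as np


def invert_pl_sketch(x, y, t):
    """Hypothetical sketch: all s with f(s) = t_j, or the closest point if none exist."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    solutions = []
    for target in np.atleast_1d(t):
        found = []
        for i in range(len(x) - 1):
            lo, hi = y[i] - target, y[i + 1] - target
            if lo == 0:
                found.append(x[i])        # left breakpoint lies exactly on the target
            if hi == 0:
                found.append(x[i + 1])    # right breakpoint lies exactly on the target
            if lo * hi < 0:
                # Strict sign change: interpolate linearly for the crossing point.
                found.append(x[i] + (x[i + 1] - x[i]) * lo / (lo - hi))
        if not found:
            # No crossing: the value of f closest to the target is at a breakpoint.
            found = [x[np.argmin(np.abs(y - target))]]
        # np.unique sorts and removes breakpoints shared by adjacent segments.
        solutions.append(np.unique(found))
    return solutions


# Zig-zag function through (0, 0), (1, 2), (2, 0), (3, 2).
s = invert_pl_sketch([0, 1, 2, 3], [0, 2, 0, 2], [1, 2, 5])
# s[0] -> [0.5, 1.5, 2.5]  (three crossings of the level 1)
# s[1] -> [1.0, 3.0]       (the level 2 is attained at two breakpoints)
# s[2] -> [1.0]            (the level 5 is never reached; closest point only)
```

Returning a list of arrays rather than a single array reflects the fact that the number of solutions differs between targets, as in the three cases above.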