Utils
This module contains various utility functions used elsewhere in the library.
- binomial_ci(count, nobs, alpha=0.05)
Confidence interval for binomial proportions using the normal approximation.
- Parameters:
count (ndarray) – Number of successes of shape (X,).
nobs (ndarray) – Number of trials of shape (X,).
alpha (float) – Significance level. In range (0, 1)
- Returns:
np.ndarray of shape (X, 2). Lower and upper limits of confidence interval with coverage 1-alpha.
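As an illustration of the normal approximation, the following is a minimal sketch of a Wald-type interval, p_hat ± z * sqrt(p_hat * (1 - p_hat) / nobs). It is not the library's implementation: the name `wald_binomial_ci` is hypothetical, and whether binomial_ci clips the limits to [0, 1] or treats edge cases differently is not stated here, so the sketch leaves the limits unclipped.

```python
import numpy as np
from statistics import NormalDist


def wald_binomial_ci(count, nobs, alpha=0.05):
    """Hypothetical sketch of a normal-approximation (Wald) interval."""
    count = np.asarray(count, dtype=float)
    nobs = np.asarray(nobs, dtype=float)
    p_hat = count / nobs                      # observed proportions, shape (X,)
    z = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided standard normal quantile
    half_width = z * np.sqrt(p_hat * (1 - p_hat) / nobs)
    # Stack lower and upper limits into shape (X, 2); no clipping to [0, 1].
    return np.stack([p_hat - half_width, p_hat + half_width], axis=-1)


ci = wald_binomial_ci(np.array([50, 10]), np.array([100, 100]))
print(ci.shape)  # (2, 2)
```

The normal approximation is known to be unreliable for small nobs or proportions close to 0 or 1.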
- bootstrap_ci(theta, theta_hat=None, alpha=0.05, *, method='quantile')
Calculates a bootstrap confidence interval with approximate coverage 1-alpha. We assume that N bootstrap estimates, theta, of the quantity of interest have been computed, and that theta_hat is the corresponding empirical estimate.
This function then constructs a confidence interval from the bootstrap estimates using the selected method: the plain quantile method (the default), bias correction, or bias correction with acceleration.
- Parameters:
theta (ndarray) – Array of shape (N, Y), where N is the number of samples.
theta_hat (Optional[Union[float, ndarray]]) – Array of shape (Y,) with the empirical estimate of the metric. This is only needed for the methods “bc” and “bca”.
alpha (Optional[Union[float, ndarray]]) – Significance level. In range (0, 1). Vector-valued alpha only supported with “quantile” method.
method (str) –
Method to compute the CI from the bootstrap samples. Possible values are:
“quantile” uses the alpha/2 and 1-alpha/2 quantiles of the empirical metric distribution.
“bc” applies bias correction to correct for the bias of the median of the empirical distribution.
“bca” applies bias correction and acceleration to correct for non-constant standard error.
See Ch. 11 of Computer Age Statistical Inference by Efron and Hastie for details.
- Returns:
Array of shape (Y, 2) with the lower and upper bounds of the CI. For vector-valued alpha of shape (Z,), the return value has shape (Y, Z, 2).
- Return type:
ndarray
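The “quantile” and “bc” methods described above can be sketched as follows for scalar alpha. This is a hedged illustration rather than the library's code: the names `quantile_ci` and `bc_ci` are hypothetical, the bias-correction formula follows the standard construction in Efron and Hastie, and the clipping of the bias fraction away from 0 and 1 is an assumption made here to keep the inverse normal CDF finite.

```python
import numpy as np
from statistics import NormalDist

_norm = NormalDist()


def quantile_ci(theta, alpha=0.05):
    """Hypothetical sketch: plain quantile interval, theta has shape (N, Y)."""
    theta = np.asarray(theta, dtype=float)
    lower = np.quantile(theta, alpha / 2, axis=0)
    upper = np.quantile(theta, 1 - alpha / 2, axis=0)
    return np.stack([lower, upper], axis=-1)  # shape (Y, 2)


def bc_ci(theta, theta_hat, alpha=0.05):
    """Hypothetical sketch: bias-corrected interval (no acceleration)."""
    theta = np.asarray(theta, dtype=float)
    theta_hat = np.atleast_1d(np.asarray(theta_hat, dtype=float))
    n, n_metrics = theta.shape
    z_lo = _norm.inv_cdf(alpha / 2)
    z_hi = _norm.inv_cdf(1 - alpha / 2)
    out = np.empty((n_metrics, 2))
    for j in range(n_metrics):
        # Fraction of bootstrap estimates below the empirical estimate.
        p0 = np.mean(theta[:, j] < theta_hat[j])
        # Assumption: clip so that inv_cdf receives a value strictly in (0, 1).
        p0 = min(max(p0, 0.5 / n), 1 - 0.5 / n)
        z0 = _norm.inv_cdf(p0)
        # Shift the nominal quantile levels by twice the bias term.
        q_lo = _norm.cdf(2 * z0 + z_lo)
        q_hi = _norm.cdf(2 * z0 + z_hi)
        out[j] = np.quantile(theta[:, j], [q_lo, q_hi])
    return out


# Usage: bootstrap the mean of a small synthetic sample.
rng = np.random.default_rng(0)
data = rng.normal(loc=1.0, scale=2.0, size=200)
idx = rng.integers(0, data.size, size=(1000, data.size))
theta = data[idx].mean(axis=1, keepdims=True)  # shape (1000, 1)
print(quantile_ci(theta))
print(bc_ci(theta, data.mean()))
```

When exactly half of the bootstrap estimates lie below theta_hat, the bias term is zero and the two intervals coincide.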
- invert_pl_function(x, y, t)
Inverts a piecewise linear function.
The piecewise linear function \(f(x_i) = y_i\) is defined by the sequence of points \((x_i, y_i)\). We assume that \(x\) is an increasing vector, i.e., \(x_i \leq x_{i+1}\); the vector \(x\) does not have to be strictly increasing, but if \(x_i = x_{i+1}\), then we assume that also \(y_i = y_{i+1}\).
For a given target value \(t_j\), the function finds all values \(s_{j, k}\), such that \(f(s_{j, k}) = t_j\). Because the number of solutions can vary for different \(t_j\), the return type for \(s\) is a list of the same length as \(t\) with each entry an array.
If the equation \(f(s) = t_j\) has no solution, we return the closest point \(s_j\), i.e.,
\[\|f(s_j) - t_j\| = \min_z \|f(z) - t_j\|\,.\]
We always return only one solution in this case.
- Parameters:
x (ndarray) – Increasing vector of points defining the function.
y (ndarray) – Vector of same length as x defining the function values.
t (ndarray) – Vector of points at which to invert the function.
- Returns:
A list \(s\) of the same length as \(t\) of arrays such that for all \(j\), the array \(s_j\) is strictly increasing and contains all solutions of the equation \(f(s) = t_j\).
- Return type:
List[ndarray]
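The inversion described above can be sketched segment by segment. This is a minimal illustration, not the library's implementation: the name `invert_pl_sketch` is hypothetical, and the treatment of flat segments that lie exactly at the target (reporting only their two endpoints) is an assumption, since the behaviour in that case is not specified here.

```python
import numpy as np


def invert_pl_sketch(x, y, t):
    """Hypothetical sketch: invert the piecewise linear map through (x_i, y_i)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    result = []
    for target in np.atleast_1d(t):
        sols = []
        for i in range(len(x) - 1):
            x0, x1, y0, y1 = x[i], x[i + 1], y[i], y[i + 1]
            if not (min(y0, y1) <= target <= max(y0, y1)):
                continue  # this segment never reaches the target
            if y0 == y1:
                # Assumption: for a flat segment at the target, report endpoints.
                sols.extend([x0, x1])
            else:
                # Linear interpolation; frac is 0 at x0 and 1 at x1.
                frac = (target - y0) / (y1 - y0)
                sols.append(x0 * (1 - frac) + x1 * frac)
        if sols:
            # np.unique sorts and removes knots counted by two segments.
            result.append(np.unique(sols))
        else:
            # No solution: extrema of a piecewise linear function are at knots,
            # so the closest point is the knot minimising |y_i - target|.
            result.append(np.array([x[np.argmin(np.abs(y - target))]]))
    return result


s = invert_pl_sketch([0, 1, 2, 3], [0, 2, 0, 2], [1, 2, 5])
for target, sol in zip([1, 2, 5], s):
    print(target, sol)
```

In this example the target 1 is hit once on each of the three segments, the target 2 is hit only at knots, and the target 5 is unreachable, so a single closest point is returned.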