chaospy.GaussianKDE

class chaospy.GaussianKDE(samples, h_mat=None, estimator_rule='scott', weights=None, rotation=None)

Gaussian kernel density estimator.

Density estimator that handles both univariate and multivariate data. It selects the bandwidth automatically using either Scott’s or Silverman’s rule.

Attributes:
samples:

The raw data as provided by the user, reshaped to have ndim == 2.

h_mat:

The covariance matrix of the kernel at each sample. Assuming uncorrelated dimensions, the bandwidth is the square root of the diagonal. It has shape (1, n_dim, n_dim) if all samples share one covariance matrix, or (n_samples, n_dim, n_dim) if not.

weights:

How much each sample is weighted. Either a scalar when the samples are equally weighted, or an array of length n_samples otherwise.

Examples:
>>> samples = [[-1, 0, 1], [0, 1, 2]]
>>> distribution = chaospy.GaussianKDE(samples, estimator_rule="silverman")
>>> distribution.h_mat  # H-matrix or bandwidth**2
array([[[0.38614462, 0.        ],
        [0.        , 0.38614462]]])
>>> uloc = [[0, 0, 1, 1], [0, 1, 0, 1]]
>>> distribution.pdf(uloc).round(4)
array([0.0469, 0.0982, 0.0074, 0.0469])
>>> distribution.fwd(uloc).round(4)
array([[0.5   , 0.5   , 0.8152, 0.8152],
       [0.1233, 0.5   , 0.0142, 0.1532]])
>>> distribution.inv(uloc).round(4)
array([[-6.577 , -6.577 ,  5.3948,  5.3948],
       [-4.3871,  4.6411, -5.9611,  7.7779]])
>>> distribution.mom([(0, 1, 1), (1, 0, 1)]).round(4)
array([1.    , 0.    , 0.6667])
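The first row of the fwd output above is consistent with the marginal CDF of an equally weighted Gaussian mixture centred on the samples of the first dimension. The sketch below is a plain standard-library illustration of that reading, not chaospy's implementation; the helper names normal_cdf and kde_marginal_cdf are hypothetical, and the bandwidth is taken as the square root of the h_mat diagonal printed above. It does not attempt the second row, which is a conditional distribution.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def kde_marginal_cdf(x, samples_1d, bandwidth):
    """CDF of an equally weighted Gaussian mixture centred on the samples."""
    total = sum(normal_cdf((x - s) / bandwidth) for s in samples_1d)
    return total / len(samples_1d)


# Assumption: the bandwidth is the square root of the documented h_mat diagonal.
bandwidth = math.sqrt(0.38614462)
first_dim = [-1, 0, 1]

# Evaluate at the two distinct first-dimension locations used in uloc.
values = [kde_marginal_cdf(x, first_dim, bandwidth) for x in (0, 1)]
print([round(v, 4) for v in values])
```

The two values agree with the 0.5 and 0.8152 entries in the first row of the documented fwd output.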
__init__(samples, h_mat=None, estimator_rule='scott', weights=None, rotation=None)
Args:
samples (numpy.ndarray):

The samples used to estimate the density. Assumed to have shape either (n_samples,) or (n_dim, n_samples).

h_mat (Optional[numpy.ndarray]):

The H-matrix, also known as the smoothing matrix or bandwidth matrix. In one dimension it corresponds to the square of the usual bandwidth parameter. Its shape must be compatible with either (n_dim, n_dim), or (n_dim, n_dim, n_samples) when each sample has its own H-matrix. If omitted, it is calculated automatically using estimator_rule.

estimator_rule (str):

Which rule to use to select the smoothing matrix when h_mat is omitted. Choose from ‘scott’ and ‘silverman’.

weights (Optional[numpy.ndarray]):

Weights of the samples. This must have the shape (n_samples,). If omitted, each sample is assumed to be equally weighted.
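To make the effect of estimator_rule more concrete, here is a minimal standard-library sketch of a Silverman-style rule for equally weighted samples. It is a reconstruction, not the library's code: the function name silverman_h_diagonal is hypothetical, and the per-dimension scale min(std, IQR / 1.34) is an assumption chosen because it reproduces the h_mat diagonal shown in the Examples section for those samples.

```python
import statistics


def silverman_h_diagonal(samples):
    """Diagonal of a Silverman-style H-matrix; samples shaped (n_dim, n_samples)."""
    n_dim = len(samples)
    n_samples = len(samples[0])
    # Silverman's factor for n samples in d dimensions.
    factor = (n_samples * (n_dim + 2) / 4.0) ** (-1.0 / (n_dim + 4))
    diagonal = []
    for row in samples:
        # Population standard deviation (ddof=0).
        std = statistics.pstdev(row)
        # Quartiles with linear interpolation between order statistics.
        q1, _, q3 = statistics.quantiles(row, n=4, method="inclusive")
        # Assumed robust scale: the smaller of std and IQR / 1.34.
        scale = min(std, (q3 - q1) / 1.34)
        # The H-matrix holds squared bandwidths.
        diagonal.append((scale * factor) ** 2)
    return diagonal


h_diag = silverman_h_diagonal([[-1, 0, 1], [0, 1, 2]])
print(h_diag)
```

For these samples both diagonal entries agree with the documented 0.38614462 to within rounding. Off-diagonal entries are zero under this rule, which matches the uncorrelated-dimensions assumption stated for h_mat.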

Methods

pdf(x_data[, decompose, allow_approx, step_size])

Probability density function.

cdf(x_data)

Cumulative distribution function.

fwd(x_data)

Forward Rosenblatt transformation.

inv(q_data[, max_iterations, tollerance])

Inverse Rosenblatt transformation.

sample([size, rule, antithetic, ...])

Create pseudo-random generated samples.

mom(K[, allow_approx])

Raw statistical moments.

ttr(kloc)

Coefficient generator for the three-term recurrence relation.
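As an illustration of what mom returns, the raw moments of an equally weighted Gaussian mixture with a diagonal H-matrix can be computed in closed form by averaging products of one-dimensional normal moments. The sketch below assumes that model; the helper names normal_raw_moment and mixture_raw_moment are hypothetical, and chaospy's own implementation may differ.

```python
def normal_raw_moment(order, mean, variance):
    """Raw moment E[X**order] of a normal variable, via the standard recurrence
    m_k = mean * m_(k-1) + (k - 1) * variance * m_(k-2)."""
    previous, current = 1.0, float(mean)  # orders 0 and 1
    if order == 0:
        return previous
    for k in range(2, order + 1):
        previous, current = current, mean * current + (k - 1) * variance * previous
    return current


def mixture_raw_moment(exponents, samples, h_diagonal):
    """Raw moment of an equally weighted mixture; samples shaped (n_dim, n_samples)."""
    n_samples = len(samples[0])
    total = 0.0
    for i in range(n_samples):
        term = 1.0
        # With a diagonal H-matrix the dimensions factorise within each kernel.
        for dim, order in enumerate(exponents):
            term *= normal_raw_moment(order, samples[dim][i], h_diagonal[dim])
        total += term
    return total / n_samples


samples = [[-1, 0, 1], [0, 1, 2]]
h_diagonal = [0.38614462, 0.38614462]  # diagonal of the documented h_mat

# The columns of K = [(0, 1, 1), (1, 0, 1)] are the exponent pairs below.
moments = [mixture_raw_moment(e, samples, h_diagonal) for e in [(0, 1), (1, 0), (1, 1)]]
print([round(m, 4) for m in moments])
```

These three moments agree with the mom output in the Examples section. Under this model the bandwidth only enters from second order in a single dimension, for example the exponent pair (2, 0).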

Attributes

interpret_as_integer

Flag indicating that return values from the methods sample and inv should be interpreted as integers instead of floating point.

lower

Lower bound for the distribution.

stochastic_dependent

True if distribution contains stochastically dependent components.

upper

Upper bound for the distribution.