chaospy.GaussianKDE

class chaospy.GaussianKDE(samples, h_mat=None, estimator_rule='scott', weights=None, rotation=None)

Gaussian kernel density estimator.

Density estimator that handles both univariate and multivariate data. It provides automatic bandwidth selection using either Scott’s or Silverman’s rule.

Attributes:

samples:: The raw data as provided by the user, reshaped to have ndim == 2.
h_mat:: The covariance matrix of each sample. Assuming uncorrelated dimensions, the bandwidths are the square roots of the diagonal entries. It has dimensions (1, n_dim, n_dim) if all samples share one covariance matrix, or (n_samples, n_dim, n_dim) if they do not.
weights:: How much each sample is weighted. Either a scalar when the samples are equally weighted, or an array of length n_samples otherwise.
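
The relation between h_mat and the per-dimension bandwidth can be illustrated with plain NumPy. This is a minimal sketch, not part of the chaospy API: it copies the H-matrix value from the example in this section and assumes the off-diagonal entries are zero, which is what makes the square root of the diagonal a meaningful bandwidth.

```python
import numpy as np

# H-matrix copied from the example in this section: shape (1, n_dim, n_dim),
# i.e. one covariance matrix shared by all samples.
h_mat = np.array([[[0.38614462, 0.0],
                   [0.0, 0.38614462]]])

# Take the diagonal of each (n_dim, n_dim) block, then the square root.
# This only gives a per-dimension bandwidth when the off-diagonal
# entries are zero (uncorrelated dimensions).
bandwidths = np.sqrt(np.diagonal(h_mat, axis1=-2, axis2=-1))

# One row per H-matrix, one column per dimension.
print(bandwidths.shape)
```

With a per-sample h_mat of shape (n_samples, n_dim, n_dim), the same expression would return one row of bandwidths per sample.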

Examples:

>>> import chaospy
>>> samples = [[-1, 0, 1], [0, 1, 2]]
>>> distribution = chaospy.GaussianKDE(samples, estimator_rule="silverman")
>>> distribution.h_mat  # H-matrix or bandwidth**2
array([[[0.38614462, 0.        ],
        [0.        , 0.38614462]]])
>>> uloc = [[0, 0, 1, 1], [0, 1, 0, 1]]
>>> distribution.pdf(uloc).round(4)
array([0.0469, 0.0982, 0.0074, 0.0469])
>>> distribution.fwd(uloc).round(4)
array([[0.5   , 0.5   , 0.8152, 0.8152],
       [0.1233, 0.5   , 0.0142, 0.1532]])
>>> distribution.inv(uloc).round(4)
array([[-6.577 , -6.577 ,  5.3948,  5.3948],
       [-4.3871,  4.6411, -5.9611,  7.7779]])
>>> distribution.mom([(0, 1, 1), (1, 0, 1)]).round(4)
array([1.    , 0.    , 0.6667])
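
The h_mat value shown in the example can be reproduced with a common multivariate form of Silverman's rule. The sketch below uses only NumPy; `silverman_h_mat` is a hypothetical helper written for illustration. Whether chaospy computes the matrix this way internally is an assumption, since the sketch was only checked against the value documented above for this one data set.

```python
import numpy as np


def silverman_h_mat(samples):
    """Hypothetical helper: diagonal H-matrix from a Silverman-style rule."""
    samples = np.atleast_2d(np.asarray(samples, dtype=float))
    n_dim, n_samples = samples.shape

    # Robust spread per dimension: the smaller of the sample standard
    # deviation and the interquartile range divided by 1.34.
    std = samples.std(axis=1, ddof=1)
    q25, q75 = np.quantile(samples, [0.25, 0.75], axis=1)
    spread = np.minimum(std, (q75 - q25) / 1.34)

    # Multivariate Silverman factor, depending on sample count and dimension.
    factor = (n_samples * (n_dim + 2) / 4.0) ** (-1.0 / (n_dim + 4))
    bandwidth = factor * spread

    # The H-matrix holds squared bandwidths on its diagonal; the leading
    # axis of length one means the matrix is shared by all samples.
    return np.diag(bandwidth ** 2)[np.newaxis]


h_mat = silverman_h_mat([[-1, 0, 1], [0, 1, 2]])
print(h_mat.round(8))
```

For this data set the interquartile term is the smaller spread in both dimensions, so it, rather than the standard deviation, determines the bandwidth.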

__init__(samples, h_mat=None, estimator_rule='scott', weights=None, rotation=None)

Args:

samples (numpy.ndarray):: The samples from which to estimate the density. Assumed to have shape either (n_samples,) or (n_dim, n_samples).
h_mat (Optional[numpy.ndarray]):: The H-matrix, also known as the smoothing matrix or bandwidth matrix. In one dimension it corresponds to the square of the usual bandwidth parameter. Assumed to have a shape compatible with (n_dim, n_dim), or (n_dim, n_dim, n_samples) when each sample has its own H-matrix. If omitted, it is calculated automatically using estimator_rule.
estimator_rule (str):: Which rule to use to select the smoothing matrix when h_mat is omitted. Choose from ‘scott’ and ‘silverman’.
weights (Optional[numpy.ndarray]):: Weights of the samples. This must have the shape (n_samples,). If omitted, each sample is assumed to be equally weighted.
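
The shapes described above can be easy to get wrong, so here is a minimal NumPy sketch that builds inputs with the documented layouts. The data, the per-sample variances, and the choice to normalize the weights are illustrative assumptions, not requirements stated by the documentation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_dim, n_samples = 2, 5

# Samples laid out as (n_dim, n_samples), as described under Args.
samples = rng.normal(size=(n_dim, n_samples))

# One H-matrix per sample, stacked along the last axis:
# shape (n_dim, n_dim, n_samples). Each sample gets an isotropic
# matrix with its own (illustrative) variance.
variances = np.linspace(0.1, 0.5, n_samples)
h_mat = np.eye(n_dim)[:, :, np.newaxis] * variances

# Non-uniform weights with shape (n_samples,). Normalizing them to sum
# to one is a choice made here, not a documented requirement.
weights = np.arange(1, n_samples + 1, dtype=float)
weights /= weights.sum()

print(samples.shape, h_mat.shape, weights.shape)
```

Following the signature above, these arrays would then be passed as `chaospy.GaussianKDE(samples, h_mat=h_mat, weights=weights)`.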

Methods

`pdf`(x_data[, decompose, allow_approx, step_size])	Probability density function.
`cdf`(x_data)	Cumulative distribution function.
`fwd`(x_data)	Forward Rosenblatt transformation.
`inv`(q_data[, max_iterations, tollerance])	Inverse Rosenblatt transformation.
`sample`([size, rule, antithetic, ...])	Create pseudo-random generated samples.
`mom`(K[, allow_approx])	Raw statistical moments.
`ttr`(kloc)	Three-term recurrence relation coefficient generator.

Attributes

`interpret_as_integer`	Flag indicating that return values from the methods sample and inv should be interpreted as integers instead of floating-point numbers.
`lower`	Lower bound for the distribution.
`stochastic_dependent`	True if distribution contains stochastically dependent components.
`upper`	Upper bound for the distribution.