QuantileOutliersTransformer
- class pyoptex.analysis.transformers.quantile_outlier_transformer.QuantileOutliersTransformer(factors=(), Y2X=<function identityY2X>, random_effects=(), threshold=1, stat='norm')[source]
Transformer that uses the quantile-quantile (Q-Q) plot method to detect outliers based on their deviation from the ideal quantile line. It drops terms one by one, starting with the largest deviation, for as long as that deviation is above the threshold.
The fit_transform function fits the data and removes the detected outliers. During a regular transform, nothing happens, as this should only remove training outliers.
Note
It is extended by OutlierTransformerMixin.
Attributes
- threshold : float
The threshold for dropping terms on the deviation from the quantile line.
- stat : str
The distribution to use for the quantile-quantile plot.
- errors_ : np.ndarray(1d)
The errors (pred - y) for a simple model fit.
- outliers_ : np.ndarray(1d)
A boolean array marking which rows are considered outliers in the training dataset.
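The drop-one-at-a-time procedure behind errors_ and outliers_ can be sketched in plain Python. This is a minimal illustration, not the pyoptex implementation: the function name qq_outlier_mask, the standardization of the errors, and the plotting positions (k + 0.5) / n are all assumptions made for the example, and only the normal distribution (stat='norm') is covered.

```python
from statistics import NormalDist, mean, pstdev


def qq_outlier_mask(errors, threshold=1.0):
    """Flag outliers by repeatedly dropping the worst Q-Q deviation.

    Hypothetical helper for illustration only; not part of pyoptex.
    """
    dist = NormalDist()  # standard normal reference distribution
    active = list(range(len(errors)))  # indices still treated as inliers
    outliers = [False] * len(errors)

    while len(active) > 2:
        values = [errors[i] for i in active]
        mu, sigma = mean(values), pstdev(values)
        if sigma == 0:
            break  # all remaining errors are identical

        # Order the remaining points by error value.
        order = sorted(active, key=lambda i: errors[i])
        n = len(order)

        # Distance of each standardized error from its theoretical quantile.
        # The plotting position (k + 0.5) / n is an assumed convention.
        devs = [
            abs((errors[i] - mu) / sigma - dist.inv_cdf((k + 0.5) / n))
            for k, i in enumerate(order)
        ]

        worst = max(range(n), key=devs.__getitem__)
        if devs[worst] <= threshold:
            break  # largest deviation is within the threshold: stop

        # Drop the single worst point and re-evaluate the rest.
        outliers[order[worst]] = True
        active.remove(order[worst])

    return outliers


errors = [-0.3, 0.1, 0.2, -0.1, 0.0, 0.25, -0.2, 0.15, -0.05, 8.0]
mask = qq_outlier_mask(errors, threshold=1.0)
print(mask)
```

Recomputing the mean and spread after every removal matters here: a single extreme error inflates the spread, which can make ordinary points look as if they deviate from the quantile line until the extreme one is gone.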
- __init__(factors=(), Y2X=<function identityY2X>, random_effects=(), threshold=1, stat='norm')[source]
Creates the outlier transformer
Parameters
- factors : list(Factor)
A list of factors to be used during fitting. It contains the categorical encoding, continuous normalization, etc.
- Y2X : func(Y)
The function to transform a design matrix Y to a model matrix X.
- random_effects : list(str)
The names of any random effect columns. Every random effect is interpreted as a string column and encoded using effect encoding.
- threshold : float
The threshold for dropping terms on the deviation from the quantile line.
- stat : str
The distribution to use for the quantile-quantile plot.
Methods
Fits the data.
Fit the transformer to the data and apply the transformation.
Preprocesses before fitting the data.
Quantile-quantile outlier detector based on the desired distribution and a threshold value.
QuantileOutliersTransformer.set_output(*[, ...]) Set output container.
Ignore any transformation as the outlier detection only applies during training.
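The training-only behaviour described for these methods can be illustrated with a toy stand-in. The class ToyOutlierDropper below is hypothetical and unrelated to the pyoptex code; it uses a simple absolute-value limit instead of the Q-Q criterion, and its method signatures are assumptions chosen only to show that fit_transform removes rows while transform passes data through unchanged.

```python
class ToyOutlierDropper:
    """Hypothetical stand-in for illustration; not the pyoptex class."""

    def __init__(self, limit=3.0):
        self.limit = limit
        self.outliers_ = None  # set during fit_transform

    def fit_transform(self, rows):
        # Detect outliers on the training data and remove them.
        self.outliers_ = [abs(r) > self.limit for r in rows]
        return [r for r, out in zip(rows, self.outliers_) if not out]

    def transform(self, rows):
        # Outlier removal applies to training data only: pass through.
        return list(rows)


dropper = ToyOutlierDropper(limit=3.0)
train_kept = dropper.fit_transform([0.1, -0.4, 9.0, 0.3])
test_kept = dropper.transform([0.2, 12.0])
print(train_kept, test_kept)
```

Leaving transform as a pass-through means that new data is never silently shortened, so predictions stay aligned with the rows that were supplied.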
Attributes
Checks whether the regressor has been fitted.
- factors : list(