TransformerMixin

class pyoptex.analysis.mixins.fit_mixin.TransformerMixin(factors=(), Y2X=<function identityY2X>, random_effects=())

Base mixin for all transformers. This mixin extends the transformer mixin from sklearn. To create your own transformer, subclass it as follows:

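>>> from pyoptex.analysis.mixins.fit_mixin import TransformerMixin
>>>
>>> class MyTransformer(TransformerMixin):
...     def _fit(self, X, y):
...         # Fit the transformer on the encoded, normalized X and the normalized y
...         pass
...
...     def _apply_transform(self, X, y):
...         # Transform X and y, then return both
...         return X, y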

You should implement two methods: _fit, which fits the transformer to the data (given the encoded and normalized X and the normalized y), and _apply_transform, which applies the transformation to the data.
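The split between the public methods and the two hooks can be pictured with a stripped-down stand-in. SketchTransformerBase and DropConstantColumns are hypothetical names used only for illustration. The real mixin also handles encoding, normalization and random effects, none of which is reproduced in this sketch.

```python
import numpy as np


class SketchTransformerBase:
    """Hypothetical stand-in for the mixin; this is not pyoptex code."""

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        self._fit(X, y)            # hook implemented by the subclass
        self.is_fitted_ = True     # trailing underscore: set during fitting
        return self

    def transform(self, X, y):
        if not getattr(self, "is_fitted_", False):
            raise RuntimeError("Call fit before transform.")
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        return self._apply_transform(X, y)  # hook implemented by the subclass

    def fit_transform(self, X, y):
        return self.fit(X, y).transform(X, y)


class DropConstantColumns(SketchTransformerBase):
    """Example subclass: removes columns of X that never vary."""

    def _fit(self, X, y):
        # Remember which columns take more than one value
        self.keep_ = np.ptp(X, axis=0) > 0

    def _apply_transform(self, X, y):
        return X[:, self.keep_], y


X = np.array([[1.0, 0.0, 2.0],
              [1.0, 1.0, 3.0],
              [1.0, 0.0, 5.0]])
y = np.array([0.5, 1.5, 2.5])
Xt, yt = DropConstantColumns().fit_transform(X, y)
```

Here the first column of X is constant, so Xt keeps only the last two columns while y passes through unchanged.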

Any attribute suffixed with _ is only accessible after fitting.

Note

Transformers should handle both OLS and mixed models, or raise an error if they cannot. Use the fit_fn_ attribute to fit a model given some terms and data; it automatically accounts for OLS vs. mixed models.
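The role of fit_fn_ can be pictured with an OLS-only stand-in. The sketch below assumes that terms is a list of column indices into the model matrix X and that the fit returns a coefficient vector. Neither assumption is confirmed by this page, ols_fit is a hypothetical name, and the mixed-model branch is omitted entirely.

```python
import numpy as np


def ols_fit(X, y, terms):
    """Hypothetical OLS-only stand-in with the (X, y, terms) call shape.

    Assumes `terms` holds column indices of X; returns the least-squares
    coefficients for those columns only.
    """
    coef, *_ = np.linalg.lstsq(X[:, terms], y, rcond=None)
    return coef


# Model matrix: intercept plus two coded factors
X = np.column_stack([
    np.ones(4),
    [-1.0, -1.0, 1.0, 1.0],
    [-1.0, 1.0, -1.0, 1.0],
])
y = 2.0 + 3.0 * X[:, 1]

# Fit a model containing only the intercept and the first factor
coef = ols_fit(X, y, terms=[0, 1])
```

A transformer that selects terms would typically call such a function repeatedly with different term subsets and compare the resulting fits.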

Note

If you require access to the attributes factors, random_effects or Y2X, use the underscored versions _factors, _re and _Y2X. Because sklearn does not permit modifying these parameters directly, the underscored versions may be adapted during fitting.

Attributes

factors : list(Factor)

A list of factors to be used during fitting. It contains the categorical encoding, continuous normalization, etc.

Y2X : func(Y)

The function to transform a design matrix Y to a model matrix X.

random_effects : list(str)

The names of any random effect columns. Every random effect is interpreted as a string column and encoded using effect encoding.

n_features_in_ : int

The number of features. Equals len(self._factors).

features_names_in_ : list(str)

The names of the features.

n_encoded_features_ : int

The number of encoded features. Equals Y2X(Y).shape[1].

effect_types_ : np.array(1d)

An array indicating the type of each factor (effect). A 1 indicates a continuous variable, anything higher indicates a categorical factor with that many levels. Can be used for internal package functions such as encode_model.

coords_ : list

A list of 2d numpy arrays. Each element contains the possible encodings of one factor. Retrieved using the factor.coords_ property.

y_mean_ : float

The mean y-value, used in normalization.

y_std_ : float

The standard deviation of the y-value, used in normalization.

fit_fn_ : func(X, y, terms)

A fit function used to fit a model from data and the specified terms. When random effects are specified, this fits a mixed model; otherwise, it fits an OLS model.

Zs_ : np.array(2d)

The groups of each random effect. Zs.shape[0] == len(self._re) and Zs.shape[1] == len(X). For example, if the first row is [0, 0, 1, 1], then the first two runs are in group 0 according to the first random effect, and the last two runs are in group 1.

is_fitted_ : bool

Whether the transformer has been fitted.
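To make the layout of the random-effect groups (Zs_) concrete, the following sketch builds such a group array with NumPy. The column names day and operator and the use of np.unique are illustrative assumptions, not the package's own encoding routine.

```python
import numpy as np

# Hypothetical data: four runs with two string-valued random effect columns
runs = {
    "day": np.array(["mon", "mon", "tue", "tue"]),
    "operator": np.array(["a", "b", "a", "b"]),
}
random_effects = ["day", "operator"]

# One row per random effect, one column per run; each entry is the
# integer group index of that run for that random effect.
Zs = np.stack([
    np.unique(runs[name], return_inverse=True)[1]
    for name in random_effects
])
```

The first row is [0, 0, 1, 1]: the first two runs share a day and the last two share another, which matches the example in the Zs_ description.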

__init__(factors=(), Y2X=<function identityY2X>, random_effects=())

Creates the transformer.

Parameters

factors : list(Factor)

A list of factors to be used during fitting. It contains the categorical encoding, continuous normalization, etc.

Y2X : func(Y)

The function to transform a design matrix Y to a model matrix X.

random_effects : list(str)

The names of any random effect columns. Every random effect is interpreted as a string column and encoded using effect encoding.
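As an illustration of the Y2X parameter, the sketch below defines a function that maps a two-factor design matrix Y to a model matrix X with an intercept, both main effects and their interaction. The name y2x_with_interaction is hypothetical, and the function assumes Y is already encoded as a numeric array.

```python
import numpy as np


def y2x_with_interaction(Y):
    """Hypothetical Y2X: intercept, main effects, one two-factor interaction."""
    Y = np.asarray(Y, dtype=float)
    intercept = np.ones((Y.shape[0], 1))
    interaction = (Y[:, 0] * Y[:, 1])[:, None]
    return np.hstack([intercept, Y, interaction])


# A 2^2 full factorial design in coded units
Y = np.array([[-1.0, -1.0],
              [-1.0, 1.0],
              [1.0, -1.0],
              [1.0, 1.0]])
X = y2x_with_interaction(Y)

# The column count of the model matrix, i.e. Y2X(Y).shape[1]
n_encoded_features = X.shape[1]
```

With two factors this yields four model-matrix columns, which is the quantity reported by n_encoded_features_ after fitting.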

Methods

TransformerMixin.fit(X, y)

Fits the transformer to the data.

TransformerMixin.fit_transform(X, y)

Fit the transformer to the data and apply the transformation.

TransformerMixin.preprocess_fit(X, y)

Preprocesses the data before fitting.

TransformerMixin.set_output(*[, transform])

Set output container.

TransformerMixin.transform(X, y)

Apply the transformation to the data.

Attributes

TransformerMixin.is_fitted

Checks whether the transformer has been fitted.