TransformerMixin
- class pyoptex.analysis.mixins.fit_mixin.TransformerMixin(factors=(), Y2X=<function identityY2X>, random_effects=())
Base mixin for all transformers. This mixin extends the transformer mixin from sklearn. To create your own transformer, subclass it as follows:
>>> class MyTransformer(TransformerMixin):
...     def _fit(self, X, y):
...         # Your fit code
...         pass
...
...     def _apply_transform(self, X, y):
...         # Your transform code to transform X and y
...         return X, y
You should implement two functions: the _fit function which fits the transformer to the data (given the encoded and normalized X, and normalized y), and the _apply_transform function which applies the transformation to the data.
Any attribute suffixed with _ is only accessible after fitting.
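The fit/transform contract described above can be sketched without pyoptex installed. The following is a minimal stand-in, not the library's implementation: `SketchTransformerBase` is a hypothetical base class that only mimics a public `fit`/`transform` pair delegating to `_fit`/`_apply_transform`, and it skips the encoding and normalization the real mixin performs. `DropConstantColumns` and `keep_` are likewise illustrative names.

```python
import numpy as np


class SketchTransformerBase:
    """Hypothetical stand-in for the mixin; NOT pyoptex's implementation."""

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        self._fit(X, y)
        # Trailing-underscore attributes only exist after fitting.
        self.is_fitted_ = True
        return self

    def transform(self, X, y):
        if not getattr(self, "is_fitted_", False):
            raise RuntimeError("Call fit before transform.")
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        return self._apply_transform(X, y)

    def _fit(self, X, y):
        raise NotImplementedError

    def _apply_transform(self, X, y):
        raise NotImplementedError


class DropConstantColumns(SketchTransformerBase):
    """Illustrative transformer: removes columns of X that never vary."""

    def _fit(self, X, y):
        # Learn which columns take more than one distinct value.
        self.keep_ = X.max(axis=0) > X.min(axis=0)

    def _apply_transform(self, X, y):
        # X loses its constant columns; y passes through unchanged.
        return X[:, self.keep_], y


X = np.array([[1.0, 5.0, 0.0], [2.0, 5.0, 1.0], [3.0, 5.0, 0.0]])
y = np.array([0.1, 0.2, 0.3])
Xt, yt = DropConstantColumns().fit(X, y).transform(X, y)
```

Keeping the learned state in `keep_` follows the trailing-underscore convention: it is set in `_fit` and only read in `_apply_transform`.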
Note
Transformers should be able to handle both OLS and mixed models, or raise an error otherwise. Use the fit_fn_ attribute to fit a model given some terms and data. It automatically selects between OLS and a mixed model.
Note
If you require access to the attributes factors, random_effects or Y2X, use the underscored versions _factors, _re and _Y2X. Because sklearn does not permit adapting these parameters directly, the underscored versions may be adapted during fitting instead.
Attributes
- factors : list(Factor)
A list of factors to be used during fitting. It contains the categorical encoding, continuous normalization, etc.
- Y2X : func(Y)
The function to transform a design matrix Y to a model matrix X.
- random_effects : list(str)
The names of any random effect columns. Every random effect is interpreted as a string column and encoded using effect encoding.
- n_features_in_ : int
The number of features. Equals len(self._factors).
- features_names_in_ : list(str)
The names of the features.
- n_encoded_features_ : int
The number of encoded features. Is the result of Y2X(Y).shape[1].
- effect_types_ : np.array(1d)
An array indicating the type of each factor (effect). A 1 indicates a continuous variable, anything higher indicates a categorical factor with that many levels. Can be used for internal package functions such as encode_model.
- coords_ : list
A list of 2d numpy arrays. Each element corresponds to the possible encodings of a factor. Retrieved using factor.coords_ property.
- y_mean_ : float
The mean y-value, used in normalization.
- y_std_ : float
The standard deviation of the y-value, used in normalization.
- fit_fn_ : func(X, y, terms)
A fit function used to fit a model from data and the specified terms. When random effects are specified, this fits a mixed model, otherwise an OLS is fitted.
- Zs_ : np.array(2d)
The groups of each random effect. Zs.shape[0] == len(self._re) and Zs.shape[1] == len(X). For example, if the first row is [0, 0, 1, 1], then the first two runs are in group 0 according to the first random effect, and the last two runs are in group 1.
- is_fitted_ : bool
Whether the transformer has been fitted.
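To make the layout of the fitted attributes concrete, here is a small NumPy sketch. All values and variable names are made up for illustration. The normalization formula (subtract the mean, divide by the standard deviation) is an assumption about how y_mean_ and y_std_ are used, not something stated above.

```python
import numpy as np

# Hypothetical effect types: one continuous factor (1) and one
# categorical factor with three levels (3).
effect_types = np.array([1, 3])
is_categorical = effect_types > 1

# Hypothetical grouping for two random effects over four runs.
# Rows are random effects, columns are runs.
Zs = np.array([
    [0, 0, 1, 1],  # first effect: runs 0-1 in group 0, runs 2-3 in group 1
    [0, 1, 0, 1],  # second effect: alternating groups
])
n_random_effects, n_runs = Zs.shape

# Indices of the runs that fall in group 0 of the first random effect.
runs_in_group_0 = np.flatnonzero(Zs[0] == 0)

# Assumed standardization of the response.
y = np.array([1.0, 2.0, 3.0, 4.0])
y_mean, y_std = y.mean(), y.std()
y_normalized = (y - y_mean) / y_std
```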
- __init__(factors=(), Y2X=<function identityY2X>, random_effects=())
Creates the transformer.
Parameters
- factors : list(Factor)
A list of factors to be used during fitting. It contains the categorical encoding, continuous normalization, etc.
- Y2X : func(Y)
The function to transform a design matrix Y to a model matrix X.
- random_effects : list(str)
The names of any random effect columns. Every random effect is interpreted as a string column and encoded using effect encoding.
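As an illustration of what a Y2X function might look like, the sketch below maps a two-factor design matrix to a model matrix with an intercept, both main effects and their interaction. `intercept_and_interaction_Y2X` is a hypothetical name written for this example; pyoptex may provide its own helpers for building such functions, which this sketch does not rely on.

```python
import numpy as np


def intercept_and_interaction_Y2X(Y):
    """Hypothetical Y2X: columns are [1, y0, y1, y0*y1]."""
    Y = np.asarray(Y, dtype=float)
    intercept = np.ones((Y.shape[0], 1))
    interaction = (Y[:, 0] * Y[:, 1])[:, np.newaxis]
    return np.hstack([intercept, Y, interaction])


# A 2^2 full factorial design in coded units.
Y = np.array([[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]])
X = intercept_and_interaction_Y2X(Y)

# Mirrors the documented definition of n_encoded_features_.
n_encoded_features = X.shape[1]
```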
Methods
TransformerMixin.fit(X, y): Fits the data.
Fit the transformer to the data and apply the transformation.
Preprocesses before fitting the data.
TransformerMixin.set_output(*[, transform]): Set output container.
Apply the transformation to the data.
Attributes
Checks whether the transformer has been fitted.