RegressionMixin
- class pyoptex.analysis.mixins.fit_mixin.RegressionMixin(factors=(), Y2X=<function identityY2X>, random_effects=())
Base mixin for all regressors. This mixin extends the regressor mixin from sklearn. To create your own regressor, do
>>> class MyRegressor(RegressionMixin):
>>>     def _fit(self, X, y):
>>>         # Your fit code
>>>         pass
>>>
>>>     def _predict(self, X):
>>>         # Optional, if you require a custom prediction
>>>         # Defaults to
>>>         return (
>>>             np.sum(X[:, self.terms_] * np.expand_dims(self.coef_, 0), axis=1)
>>>             * self.y_std_ + self.y_mean_
>>>         )
One function must be implemented: the _fit function, which fits your model on the encoded and normalized X and the normalized y. It should set the parameters specified below. Inside the _fit function, you have access to the attributes specified below.
Optionally, you can implement your own prediction function. When the coefficients and terms are set correctly, however, this should not be necessary. The _predict function receives a normalized and encoded X.
Any attribute suffixed with _ is only accessible after fitting.
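The default prediction shown above can be illustrated with NumPy alone. The following is a minimal sketch, not pyoptex code: the data are made up, np.linalg.lstsq stands in for a real _fit, and normalizing y with the population standard deviation is an assumption.

```python
import numpy as np

# Hypothetical encoded and normalized model matrix: intercept + two factors.
X = np.array([
    [1.0, -1.0, -1.0],
    [1.0, -1.0,  1.0],
    [1.0,  1.0, -1.0],
    [1.0,  1.0,  1.0],
])
y_raw = np.array([2.0, 4.0, 6.0, 8.0])

# Normalize y (assumption: population standard deviation).
y_mean_, y_std_ = y_raw.mean(), y_raw.std()
y = (y_raw - y_mean_) / y_std_

# Stand-in for _fit: keep all columns and estimate coefficients by least squares.
terms_ = np.array([0, 1, 2])
coef_, *_ = np.linalg.lstsq(X[:, terms_], y, rcond=None)

# Default prediction: linear predictor on the normalized scale, then de-normalize.
pred = np.sum(X[:, terms_] * np.expand_dims(coef_, 0), axis=1) * y_std_ + y_mean_
```

Because the made-up response is exactly linear in the two factors, pred reproduces y_raw up to floating-point error.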
Note
A regressor should be able to handle both OLS and mixed models, or raise an error otherwise. Use the fit_fn_ attribute to fit a model given some terms and data. It automatically accounts for OLS vs. mixed model.
Note
If you require access to the attributes factors, re or Y2X, use the underscored versions _factors, _re and _Y2X. Because sklearn does not permit these parameters to be modified directly, the underscored versions may be adapted during fitting.
Parameters
- terms_ : np.array(1d)
The indices of the terms (= columns in X) in the model.
- coef_ : np.array(1d)
An array of coefficients corresponding to the terms.
- scale_ : float
The scale (= variance of the fit).
- vcomp_ : float
The estimates of any variance components present.
- fit_ : optional
The result of calling
>>> fit_fn_(X, y, self.terms_)
if applicable. If not specified, summary is unavailable.
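As an illustration of what a _fit implementation might set, here is a hedged sketch of an ordinary least squares fit. The class OlsSketch is hypothetical and deliberately does not inherit from RegressionMixin. The residual-variance formula for scale_ and the empty vcomp_ for a model without random effects are assumptions, not pyoptex's documented behaviour.

```python
import numpy as np

class OlsSketch:
    """Hypothetical stand-in; does NOT inherit from RegressionMixin."""

    def _fit(self, X, y):
        n, p = X.shape
        # Keep every column of the encoded model matrix.
        self.terms_ = np.arange(p)
        # Least-squares coefficients for the selected terms.
        self.coef_, *_ = np.linalg.lstsq(X[:, self.terms_], y, rcond=None)
        # Assumption: scale_ is the residual variance RSS / (n - p).
        resid = y - X[:, self.terms_] @ self.coef_
        self.scale_ = float(resid @ resid / (n - p))
        # Assumption: no variance components when there are no random effects.
        self.vcomp_ = np.array([])

# Made-up data: intercept column plus one continuous factor.
X = np.array([[1.0, -1.0], [1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([-1.0, 0.5, 1.0, 2.5])

reg = OlsSketch()
reg._fit(X, y)
```

A real subclass would typically also perform term selection, so terms_ would usually be a subset of the columns rather than all of them.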
Attributes
- factors : list(Factor)
A list of factors to be used during fitting. It contains the categorical encoding, continuous normalization, etc.
- Y2X : func(Y)
The function to transform a design matrix Y to a model matrix X.
- random_effects : list(str)
The names of any random effect columns. Every random effect is interpreted as a string column and encoded using effect encoding.
- n_features_in_ : int
The number of features. Equals len(self._factors).
- features_names_in_ : list(str)
The names of the features.
- n_encoded_features_ : int
The number of encoded features. Is the result of Y2X(Y).shape[1].
- effect_types_ : np.array(1d)
An array indicating the type of each factor (effect). A 1 indicates a continuous variable, anything higher indicates a categorical factor with that many levels. Can be used for internal package functions such as encode_model.
- coords_ : numba.typed.List
A list of 2d numpy arrays. Each element corresponds to the possible encodings of a factor. Retrieved using factor.coords_ property.
- y_mean_ : float
The mean y-value, used in normalization.
- y_std_ : float
The standard deviation of the y-value, used in normalization.
- fit_fn_ : func(X, y, terms)
A fit function used to fit a model from data and the specified terms. When random effects are specified, this fits a mixed model, otherwise an OLS is fitted.
- Zs_ : np.array(2d)
The groups of each random effect. Zs.shape[0] == len(self._re) and Zs.shape[1] == len(X). For example, if the first row is [0, 0, 1, 1], then the first two runs are in group 0 according to the first random effect, and the last two runs are in group 1.
- is_fitted_ : bool
Whether the regressor has been fitted.
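To make the layout of Zs_ concrete, the sketch below builds a group array with the documented shape from a made-up string column. The column name plots and the use of np.unique are illustrative assumptions. This is not how pyoptex necessarily computes Zs_.

```python
import numpy as np

# Hypothetical random-effect column: one string label per run.
plots = np.array(["A", "A", "B", "B"])

# Map each label to an integer group index (0, 1, ...).
_, groups = np.unique(plots, return_inverse=True)

# One row per random effect, one column per run.
Zs = np.stack([groups])
```

Here Zs has shape (1, 4) and its only row is [0, 0, 1, 1], matching the example in the Zs_ description.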
- __init__(factors=(), Y2X=<function identityY2X>, random_effects=())
Creates the regressor.
Parameters
- factors : list(Factor)
A list of factors to be used during fitting. It contains the categorical encoding, continuous normalization, etc.
- Y2X : func(Y)
The function to transform a design matrix Y to a model matrix X.
- random_effects : list(str)
The names of any random effect columns. Every random effect is interpreted as a string column and encoded using effect encoding.
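A Y2X function maps a design matrix Y to a model matrix X. The sketch below shows one possible shape for such a function. The name intercept_and_interaction_Y2X and the chosen model (intercept, main effects, one two-factor interaction) are hypothetical and only meant to illustrate the contract.

```python
import numpy as np

def intercept_and_interaction_Y2X(Y):
    """Hypothetical Y2X: intercept, main effects, and one two-factor interaction."""
    Y = np.asarray(Y, dtype=float)
    return np.column_stack([
        np.ones(len(Y)),     # intercept
        Y,                   # main effects
        Y[:, 0] * Y[:, 1],   # interaction between the first two factors
    ])

# Made-up encoded design with two continuous factors.
Y = np.array([[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]])
X = intercept_and_interaction_Y2X(Y)

# Corresponds to the documented Y2X(Y).shape[1].
n_encoded_features = X.shape[1]
```

With this function, a two-factor design yields four model-matrix columns.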
Methods
RegressionMixin.fit(X, y): Fits the data.
RegressionMixin.formula([labels]): Creates the prediction formula of the fit for the encoded and normalized data.
Creates the prediction formula of the fit for the encoded and normalized data.
Prediction variances for the new values specified in X.
Predict on new data after fitting.
Preprocesses before fitting the data.
Preprocessing the incoming data before prediction.
RegressionMixin.score(X, y[, sample_weight]): Return the coefficient of determination of the prediction.
Generates a summary of the fit in case it was stored during training in the fit_ attribute.
Attributes
Alias for information_matrix.
Alias for inv_information_matrix.
Alias for obs_cov.
Alias for inv_obs_cov.
The information matrix of the fitted data.
The inverse of the information matrix.
The inverse of the observation covariance matrix.
Checks whether the regressor has been fitted.
The observation covariance matrix \(V = var(Y)\).
The total variance on the normalized y-values.
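The relationship between these matrices can be sketched under a standard random-intercept assumption, where V = scale * I + sum_i vcomp_i * Z_i Z_i^T and the information matrix is X^T V^{-1} X. These formulas are textbook mixed-model conventions and are assumed here. The source does not state that pyoptex computes its attributes exactly this way, and all values below are made up.

```python
import numpy as np

# Hypothetical values, for illustration only.
Zs = np.array([[0, 0, 1, 1]])   # one random effect, four runs
scale = 1.0                     # residual variance
vcomp = np.array([0.5])         # one variance component per random effect

# Assumed observation covariance: V = scale * I + sum_i vcomp_i * Z_i Z_i^T.
n_runs = Zs.shape[1]
V = scale * np.eye(n_runs)
for groups, var in zip(Zs, vcomp):
    Z = np.eye(groups.max() + 1)[groups]   # one-hot group indicators, (runs, groups)
    V += var * Z @ Z.T

# Made-up model matrix: intercept plus one factor.
X = np.array([[1.0, -1.0], [1.0, 1.0], [1.0, -1.0], [1.0, 1.0]])

Vinv = np.linalg.inv(V)          # inverse observation covariance
M = X.T @ Vinv @ X               # assumed information matrix
Minv = np.linalg.inv(M)          # inverse information matrix
```

Under this assumption, runs in the same group share covariance 0.5, runs in different groups are uncorrelated, and each diagonal entry of V equals scale plus the sum of the variance components.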
- factorslist(