entropies_approx
- pyoptex.analysis.estimators.sams.entropy.entropies_approx(submodels, freqs, model_size, dep, mode, forced=None, N=10000, sampler=<function sample_model_dep_onebyone>, eps=1e-06)
Compute the approximate entropy by sampling N random models and observing the frequency of each submodel.
The entropy is computed as
where \(f_{o}\) is the observed frequency of the submodel in the SAMS procedure and \(f_{t}\) is the theoretical frequency when sampling at random. A higher entropy indicates more “surprise”, so the submodel is more likely to be the correct model.
Parameters
- submodels : list(np.array(1d))
The list of top submodels for each size.
- freqs : np.array(1d)
The frequencies of these submodels in the raster plot.
- model_size : int
The size of the overfitted models. An overfitted model includes the forced model, so its size must be larger than that of the forced model.
- dep : np.array(2d)
The dependency matrix of size (N, N), with N here the number of terms in the encoded model (output from Y2X), not the number of samples. Term i depends on term j if dep[i, j] is true.
- mode : None or 'weak' or 'strong'
The heredity mode during sampling.
- forced : None or np.array(1d)
Any terms that must be included in the model.
- N : int
The number of random samples to draw to compute the theoretical frequency of a submodel.
- sampler : func(dep, model_size, N, forced, mode)
The sampler to use when generating random hereditary models.
- eps : float
A numerical stability parameter in computing the entropy.
Returns
- entropy : np.array(1d)
An array of floats of the same length as the submodels.
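Example

The following is a minimal, standard-library sketch of the idea described above, not the pyoptex implementation. It estimates the theoretical frequency \(f_{t}\) of a submodel by drawing N random models, then scores it against an observed frequency \(f_{o}\). The helper names (`sample_random_model`, `theoretical_freq`, `binary_relative_entropy`) are hypothetical, the sampler ignores heredity and the dependency matrix, and the scoring formula is an assumption (a binary relative entropy), since the exact expression is not shown in this page.

```python
import math
import random


def sample_random_model(n_terms, model_size, forced, rng):
    """Hypothetical sampler: the forced terms plus uniformly drawn extra terms.

    Unlike the library's samplers, this ignores heredity and `dep`.
    """
    forced = list(forced)
    remaining = [t for t in range(n_terms) if t not in forced]
    extra = rng.sample(remaining, model_size - len(forced))
    return set(forced) | set(extra)


def theoretical_freq(submodel, n_terms, model_size, forced=(), N=10000, seed=0):
    """Fraction of N random models that contain every term of `submodel`."""
    rng = random.Random(seed)
    sub = set(submodel)
    hits = sum(
        sub <= sample_random_model(n_terms, model_size, forced, rng)
        for _ in range(N)
    )
    return hits / N


def binary_relative_entropy(f_o, f_t, eps=1e-6):
    """Assumed score: relative entropy between Bernoulli(f_o) and Bernoulli(f_t).

    `eps` clips both frequencies away from 0 and 1 so the logarithms stay finite.
    """
    f_o = min(max(f_o, eps), 1 - eps)
    f_t = min(max(f_t, eps), 1 - eps)
    return (
        f_o * math.log2(f_o / f_t)
        + (1 - f_o) * math.log2((1 - f_o) / (1 - f_t))
    )


# Submodel {0, 1} inside random models of 3 terms drawn from 6 candidates.
f_t = theoretical_freq([0, 1], n_terms=6, model_size=3, N=20000, seed=1)
score = binary_relative_entropy(0.9, f_t)
```

Under this assumed score, a submodel observed far more often than random sampling predicts receives a large value, while one observed about as often as chance receives a value near zero.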