SamsBnB
- class pyoptex.analysis.estimators.sams.bnb.sams_bnb.SamsBnB(model_size, models, nterms, mode=None, dependencies=None, forced_model=None)
Runs the BnB algorithm for SAMS automated model selection.
Attributes
- model_size : int
The size of the overfitted models.
- models : np.array(2d)
The results returned by the SAMS simulation. A NumPy array with a structured datatype in which each element holds two arrays of size model_size, (‘model’, np.int_) and (‘coeff’, np.float64), and one scalar, (‘metric’, np.float64).
- nterms : int
The total number of fixed effects in the encoded, normalized model matrix (=X.shape[1] after encoding and normalization). No element in models should be larger than or equal to this value.
- mode : None or ‘weak’ or ‘strong’
The heredity mode during sampling.
- dependencies : np.array(2d)
The dependency matrix of size (N, N) with N the number of terms in the encoded model (output from Y2X). Term i depends on term j if dep(i, j) = true.
- forced_model : np.array(1d)
The terms which were forced to be in the simulation models as an integer array. Often the intercept.
- kill : np.array(1d)
A boolean array of which terms should not be investigated as they cannot be in the top performing models. Updated during the algorithm.
- spm : scipy.sparse.csc_array
A sparse boolean matrix of the models. Has dimensions (models.shape[0], nterms).
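The layout of models and its relation to spm can be illustrated with a small sketch. This is not taken from the library: the field names and types follow the description above, but the values are made up, a one-dimensional record array is used for brevity, and a dense NumPy boolean matrix (the hypothetical name indicator) stands in for the sparse spm.

```python
import numpy as np

model_size = 3  # illustrative size of each overfitted model
nterms = 5      # illustrative number of columns in the encoded model matrix

# Structured dtype with the three fields named in the attribute description.
sams_dtype = np.dtype([
    ("model", np.int_, (model_size,)),
    ("coeff", np.float64, (model_size,)),
    ("metric", np.float64),
])

# Two made-up results; real ones would come from the SAMS simulation.
models = np.zeros(2, dtype=sams_dtype)
models["model"] = [[0, 1, 3], [0, 2, 3]]
models["coeff"] = [[1.0, 0.5, -0.2], [1.1, 0.3, -0.1]]
models["metric"] = [0.91, 0.88]

# Dense stand-in for spm: entry (i, t) is True if model i contains term t.
indicator = np.zeros((models.shape[0], nterms), dtype=bool)
rows = np.repeat(np.arange(models.shape[0]), model_size)
indicator[rows, models["model"].ravel()] = True
```

Every index stored in the ‘model’ field is smaller than nterms, which is the constraint stated for that attribute.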
- __init__(model_size, models, nterms, mode=None, dependencies=None, forced_model=None)
Initializes the branch-and-bound object.
Parameters
- model_size : int
The size of the overfitted models.
- models : np.array(2d)
The results returned by the SAMS simulation. A NumPy array with a structured datatype in which each element holds two arrays of size model_size, (‘model’, np.int_) and (‘coeff’, np.float64), and one scalar, (‘metric’, np.float64).
- nterms : int
The total number of fixed effects in the encoded, normalized model matrix (=X.shape[1] after encoding and normalization). No element in models should be larger than or equal to this value.
- mode : None or ‘weak’ or ‘strong’
The heredity mode during sampling.
- dependencies : np.array(2d)
The dependency matrix of size (N, N) with N the number of terms in the encoded model (output from Y2X). Term i depends on term j if dep(i, j) = true.
- forced_model : None or np.array(1d)
The terms which were forced to be in the simulation models as an integer array. Often the intercept.
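As an illustration of the dependencies and mode parameters, the sketch below builds a dependency matrix for a four-term model and checks a candidate model against it. The helper respects_heredity is hypothetical and not part of the library. It assumes the usual definitions: under strong heredity a term requires all the terms it depends on, under weak heredity at least one of them. The library's exact behavior may differ.

```python
import numpy as np

# Illustrative encoding: 0 = intercept, 1 = x1, 2 = x2, 3 = x1*x2.
dep = np.zeros((4, 4), dtype=bool)
dep[3, 1] = True  # x1*x2 depends on x1
dep[3, 2] = True  # x1*x2 depends on x2

def respects_heredity(model, dep, mode=None):
    """Hypothetical helper: check a list of term indices against dep."""
    if mode is None:
        return True  # no heredity constraint
    present = np.zeros(dep.shape[0], dtype=bool)
    present[list(model)] = True
    for term in model:
        parents = dep[term]
        if not parents.any():
            continue  # term has no dependencies
        found = present[parents]
        # Strong: every parent must be present. Weak: one parent suffices.
        ok = found.all() if mode == "strong" else found.any()
        if not ok:
            return False
    return True
```

For example, the model [0, 1, 3] passes this check in ‘weak’ mode because x1 is present, but fails in ‘strong’ mode because x2 is missing.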
Methods
- SamsBnB.branches(node)
Generates branches by adding possible terms where permitted.
- SamsBnB.init_queue(top_results, top_scores)
Initializes the branches queue, starting from the forced model and yielding all possible one-term extensions.
- SamsBnB.initialize(nfit)
Initializes the results using a greedy search.
- SamsBnB.leaf(node)
Checks whether the model is of full size.
- SamsBnB.loop(top_results, top_scores)
Loops through the branch-and-bound algorithm, keeping the top n results.
- SamsBnB.node_in_results(node, results)
Checks whether the model is already in the results.
- SamsBnB.postloop(top_results, top_scores)
Callback to run after the branch-and-bound algorithm has run.
- SamsBnB.postnew(old, new, top)
Kills any terms which do not occur frequently enough.
- SamsBnB.preloop(top_results, top_scores)
Kills any terms which do not occur frequently enough.
- SamsBnB.prenew(old, new, top)
Function defining what to do after finding a new optimal node and before adding it to the top.
- SamsBnB.top(nfit)
Returns the top nfit results using the branch-and-bound algorithm.
- SamsBnB.upperbound(node)
Computes the upper bound on the number of times this submodel occurs in the set (=frequency of this submodel).
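The role of the frequency upper bound can be sketched with a generic best-first branch and bound. This is not the library's implementation: the functions frequency and top_submodels are hypothetical, and forced terms, heredity, and term killing are omitted. The sketch relies only on the fact that adding a term to a submodel can never increase how often it occurs, so a submodel's frequency bounds the frequency of all its extensions.

```python
import heapq
import numpy as np

def frequency(indicator, terms):
    """Number of rows (models) that contain every term in terms."""
    return int(indicator[:, list(terms)].all(axis=1).sum())

def top_submodels(indicator, size, ntop):
    """Hypothetical best-first search for the most frequent submodels."""
    nterms = indicator.shape[1]
    # heapq is a min-heap, so bounds are negated; start from the empty node.
    queue = [(-indicator.shape[0], ())]
    results = []
    while queue and len(results) < ntop:
        neg_bound, node = heapq.heappop(queue)
        if len(node) == size:
            # Leaf: its bound is its exact frequency, and no queued node
            # can lead to a more frequent submodel.
            results.append((node, -neg_bound))
            continue
        # Branch by adding one term with a larger index than the last.
        start = node[-1] + 1 if node else 0
        for t in range(start, nterms):
            child = node + (t,)
            bound = frequency(indicator, child)
            if bound > 0:
                heapq.heappush(queue, (-bound, child))
    return results

# Four made-up models over four terms, as a dense boolean indicator matrix.
indicator = np.array([
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 1],
], dtype=bool)
best = top_submodels(indicator, size=2, ntop=2)
```

Here best holds the two most frequent two-term submodels together with their frequencies, returned in order of decreasing frequency.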