Note

This page is a reference documentation. It only explains the function signature, and not how to use it. Please refer to the What you really need to know section for the big picture.

julearn.run_fit#

julearn.run_fit(X, y, model, data, X_types=None, problem_type=None, preprocess=None, groups=None, pos_labels=None, model_params=None, search_params=None, seed=None, verbose=0)#

Fit the model on all the data.

Parameters:

Xlist of str

The features to use. See Data for details.

ystr

The targets to predict. See Data for details.

modelstr or scikit-learn compatible model.

If string, it will use one of the available models.

datapandas.DataFrame

DataFrame with the data. See Data for details.

X_typesdict[str, list of str]

A dictionary containing keys with column type as a str and the columns of this column type as a list of str.

problem_typestr

The kind of problem to model.

Options are:

“classification”: Perform a classification in which the target (y) has categorical classes (default). The parameter pos_labels can be used to convert a target with multiple_classes into binary.
“regression”. Perform a regression. The target (y) has to be ordinal at least.

preprocessstr, TransformerLike or list or PipelineCreator | None

Transformer to apply to the features. If string, use one of the available transformers. If list, each element can be a string or scikit-learn compatible transformer. If None (default), no transformation is applied.

See documentation for details.

groupsstr | None

The grouping labels in case a Group CV is used. See Data for details.

pos_labelsstr, int, float or list | None

The labels to interpret as positive. If not None, every element from y will be converted to 1 if is equal or in pos_labels and to 0 if not.

model_paramsdict | None

If not None, this dictionary specifies the model parameters to use

The dictionary can define the following keys:

‘STEP__PARAMETER’: A value (or several) to be used as PARAMETER for STEP in the pipeline. Example: ‘svm__probability’: True will set the parameter ‘probability’ of the ‘svm’ model. If more than option is provided for at least one hyperparameter, a search will be performed.

search_paramsdict | None

Additional parameters in case Hyperparameter Tuning is performed, with the following keys:

‘kind’: The kind of search algorithm to use, Valid options are:
- "grid" : GridSearchCV
- "random" : RandomizedSearchCV
- "bayes" : BayesSearchCV
- "optuna" : OptunaSearchCV
- user-registered searcher name : see register_searcher()
- scikit-learn-compatible searcher
‘cv’: If a searcher is going to be used, the cross-validation
splitting strategy to use. Defaults to same CV as for the model evaluation.
‘scoring’: If a searcher is going to be used, the scoring metric to
evaluate the performance.

See Hyperparameter Tuning for details.

seedint | None

If not None, set the random seed before any operation. Useful for reproducibility.

verbose: int

Verbosity level of outer cross-validation. Follows scikit-learn/joblib converntions. 0 means no additional information is printed. Larger number generally mean more information is printed. Note: verbosity up to 50 will print into standard error, while larger than 50 will print in standrad output.

Returns:

scorespd.DataFrame: The resulting scores (one column for each score specified). Additionally, a ‘fit_time’ column will be added. And, if return_estimator='all' or return_estimator='cv', an ‘estimator’ columns with the corresponding estimators fitted for each CV split.
final_estimatorobject: The final estimator, fitted on all the data (only if return_estimator='all' or return_estimator='final')
inspectorInspector | None: The inspector object (only if return_inspector=True)