Note

This page is reference documentation. It explains only the class signature, not how to use the class. Please refer to the What you really need to know section for the big picture.

julearn.models.xgb_cvearlystopping.XGBRegressorCVEarlyStopping

class julearn.models.xgb_cvearlystopping.XGBRegressorCVEarlyStopping(test_size, early_stopping_rounds, **kwargs)

XGBRegressor with cross-validated early stopping.

A wrapper for XGBoost that performs early stopping using a cross-validation split of the data. The model is first trained on a training set with early stopping based on a validation set, and then refit on the full data using the best number of iterations found.
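The stopping rule described above can be illustrated with a toy loop over validation losses. This is a minimal pure-Python sketch, not the XGBoost or julearn implementation: the helper `best_round` and the loss values are hypothetical, and using `best_idx + 1` boosting rounds for the refit is an assumption about how the "best number of iterations" is applied.

```python
def best_round(val_losses, early_stopping_rounds):
    """Return (index of the best round, number of rounds actually run).

    Training stops once the validation loss has not improved for
    `early_stopping_rounds` consecutive rounds.
    """
    best_idx, best_loss = -1, float("inf")
    for i, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best validation loss: remember where it happened.
            best_idx, best_loss = i, loss
        elif i - best_idx >= early_stopping_rounds:
            # Patience exhausted: stop and report the best round seen.
            return best_idx, i + 1
    return best_idx, len(val_losses)


# Hypothetical per-round validation losses on the held-out split.
losses = [1.0, 0.8, 0.7, 0.72, 0.71, 0.73, 0.6]
best_idx, rounds_run = best_round(losses, early_stopping_rounds=3)

# Assumption: the refit on the full data uses this many boosting rounds.
n_rounds_for_refit = best_idx + 1
```

In this toy run, the late improvement to 0.6 is never reached, because training stops after three rounds without improvement over the loss at index 2.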

Parameters:
  • test_size (float | int | None) – The size of the validation set used for early stopping. If groups is passed to fit, the size is counted in groups; otherwise it is counted in samples. If float, it should be between 0.0 and 1.0 and represents the proportion of the dataset to include in the test split. If int, it represents the absolute number of groups or samples. If None, the value is set to the complement of the train size. If train_size is also None, it is set to 0.25 for non-grouped data or 0.2 for grouped data (scikit-learn’s defaults for train_test_split and GroupShuffleSplit).

  • early_stopping_rounds (int) – The number of consecutive boosting rounds without improvement on the validation set after which training stops.

  • **kwargs (Any) – Extra keyword arguments to pass to the XGBRegressor.
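How the three kinds of test_size value translate into a validation-set size can be sketched as follows. The helper `n_validation_units` is hypothetical and is not part of julearn; rounding a float proportion up is assumed here to match scikit-learn's splitters, and the train_size complement case is left out.

```python
import math


def n_validation_units(n_units, test_size=None, grouped=False):
    """Number of samples (or groups, if grouped) held out for validation."""
    if test_size is None:
        # Defaults named in the parameter description above.
        test_size = 0.2 if grouped else 0.25
    if isinstance(test_size, float):
        if not 0.0 < test_size < 1.0:
            raise ValueError("a float test_size must lie between 0.0 and 1.0")
        # Assumption: proportions are rounded up to a whole unit.
        return math.ceil(test_size * n_units)
    # An int is taken as the absolute number of samples or groups.
    return int(test_size)


n_from_proportion = n_validation_units(100, 0.25)           # proportion of samples
n_from_default = n_validation_units(10, None, grouped=True)  # default for groups
n_from_count = n_validation_units(50, 5)                     # absolute count
```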

__init__(test_size, early_stopping_rounds, **kwargs)
fit(X, y, groups=None)

Fit the model.

Parameters:
  • X (ndarray | DataFrame | Series) – The data to fit the model on.

  • y (ndarray | DataFrame | Series) – The target data.

  • groups (ndarray | DataFrame | Series | None, default: None) – The group labels of the samples, used when splitting the dataset into train and test sets for early stopping. If None, a standard train/test split is used.

Returns:

The fitted model.

get_metadata_routing()

Get metadata routing of this object.

Please check the User Guide on how the routing mechanism works.

Returns:

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get the parameters of the model.

Parameters:

deep (bool, default: True) – If True, return the parameters for this model and for contained subobjects that are estimators.

Returns:

Parameter names mapped to their values.

predict(X)

Predict using the model.

Parameters:

X (ndarray | DataFrame | Series) – The data to predict on.

Returns:

The predictions.

score(X, y, sample_weight=None)

Return coefficient of determination on test data.

The coefficient of determination, \(R^2\), is defined as \((1 - \frac{u}{v})\), where \(u\) is the residual sum of squares ((y_true - y_pred) ** 2).sum() and \(v\) is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0, and the score can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an \(R^2\) score of 0.0.
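The definition above can be checked by hand on a few values. The following is a minimal pure-Python sketch for a single output; `r2_manual` is a hypothetical helper written for illustration, not the function that score calls internally.

```python
def r2_manual(y_true, y_pred):
    """Compute 1 - u / v for a single-output regression."""
    mean = sum(y_true) / len(y_true)
    # u: residual sum of squares
    u = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # v: total sum of squares
    v = sum((t - mean) ** 2 for t in y_true)
    return 1 - u / v


y_true = [1.0, 2.0, 3.0, 4.0]
perfect = r2_manual(y_true, y_true)                 # predictions equal targets
constant = r2_manual(y_true, [2.5] * 4)             # always predicts the mean
reversed_ = r2_manual(y_true, [4.0, 3.0, 2.0, 1.0])  # worse than the mean
```

The three cases cover the statements in the text: a perfect fit scores 1.0, the constant mean predictor scores 0.0, and a model worse than the mean gets a negative score.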

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead with shape (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.

  • y (array-like of shape (n_samples,) or (n_samples, n_outputs)) – True values for X.

  • sample_weight (array-like of shape (n_samples,), default=None) – Sample weights.

Returns:

\(R^2\) of self.predict(X) w.r.t. y.

Notes

The \(R^2\) score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with default value of r2_score(). This influences the score method of all the multioutput regressors (except for MultiOutputRegressor).

set_fit_request(*, groups: bool | None | str = '$UNCHANGED$') → XGBRegressorCVEarlyStopping

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
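The four request options can be read as a small decision table. The sketch below mirrors only the wording of the options listed above; `kwargs_for_fit` is a hypothetical helper, not scikit-learn's router, which performs additional validation that is not modeled here.

```python
def kwargs_for_fit(request, provided, name="groups"):
    """Return the keyword arguments a meta-estimator would forward to fit.

    `request` is the value given to set_fit_request for `name`;
    `provided` is the metadata the user passed to the meta-estimator.
    """
    if request is True:
        # Requested: forwarded if provided, silently skipped otherwise.
        return {name: provided[name]} if name in provided else {}
    if request is False:
        # Not requested: never forwarded.
        return {}
    if request is None:
        # Not requested, and providing it is treated as an error.
        if name in provided:
            raise ValueError(f"{name!r} was provided but not requested")
        return {}
    # A string is an alias: the user supplies it under that name instead.
    return {name: provided[request]} if request in provided else {}


forwarded = kwargs_for_fit(True, {"groups": [0, 0, 1]})
aliased = kwargs_for_fit("site", {"site": [1, 1, 2]})
dropped = kwargs_for_fit(False, {"groups": [0, 0, 1]})
```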

Added in version 1.3.

Parameters:

groups (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for groups parameter in fit.

Returns:

The updated object.

set_params(**params)

Set the parameters of the model.

Parameters:

**params (Any) – Estimator parameters.

Returns:

The model with updated parameters.

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → XGBRegressorCVEarlyStopping

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

The updated object.