Note

This page is reference documentation. It explains only the function signature, not how to use the function. For the big picture, please refer to the What you really need to know section.

julearn.prepare.prepare_input_data

julearn.prepare.prepare_input_data(X, y, df, pos_labels, groups, X_types)

Prepare the input data and variables for the pipeline.

Parameters:
  • X (str | list[str]) – The features to use. See Data for details.

  • y (str) – The targets to predict. See Data for details.

  • df (DataFrame) – See Data for details.

  • pos_labels (str | int | float | list | None) – The labels to interpret as positive. If not None, every element of y is converted to 1 if it is equal to pos_labels (or contained in it, when a list is given) and to 0 otherwise.

  • groups (str | None) – The grouping labels in case a Group CV is used. See Data for details.

  • X_types (dict | None) – A dictionary that maps each column type (a str) to the columns of that type (a list of str).
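
The pos_labels rule and the shape of X_types described above can be sketched in plain Python. This is only an illustration under stated assumptions: binarize is a hypothetical helper, not part of julearn, and the label values, column type names and column names are made up for the example.

```python
def binarize(values, pos_labels):
    """Hypothetical helper mirroring the documented pos_labels rule."""
    if pos_labels is None:
        # No conversion is requested: keep the values as they are.
        return list(values)
    if not isinstance(pos_labels, list):
        # A single label is treated like a one-element list.
        pos_labels = [pos_labels]
    # 1 if the element equals (or is in) pos_labels, 0 otherwise.
    return [1 if value in pos_labels else 0 for value in values]


# Illustrative target values (assumed, not taken from any dataset here).
y_values = ["setosa", "virginica", "versicolor", "setosa"]

binary_single = binarize(y_values, "setosa")
binary_multi = binarize(y_values, ["setosa", "virginica"])

# X_types: column type (str) -> columns of that type (list of str).
# The type names and column names below are assumptions for illustration.
X_types = {
    "continuous": ["age", "height"],
    "categorical": ["site"],
}
```

With a single label, only the matching elements become 1; with a list, any element found in the list becomes 1.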

Returns:

  • df_X (pandas.DataFrame) – A dataframe with the features for each sample.

  • df_y (pandas.Series) – A series with the y variable (target) for each sample.

  • df_groups (pandas.Series) – A series with the grouping variable for each sample (if specified in the groups parameter).

Raises:

ValueError – If validation of the input data or parameters fails.

Warns:

RuntimeWarning – If the input data and parameters might have inconsistencies.
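
Since inconsistencies are reported as a RuntimeWarning rather than an exception, a caller may want to either inspect the warning or turn it into an error. A minimal sketch using only the standard warnings module is shown below; prepare_stub is a hypothetical stand-in for the real call, and its message text is invented for the example.

```python
import warnings


def prepare_stub():
    """Hypothetical stand-in that emits a RuntimeWarning like the real call might."""
    warnings.warn("possible inconsistency in the input data", RuntimeWarning)
    return "prepared"


# Option 1: record warnings so they can be inspected after the call.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = prepare_stub()

messages = [
    str(w.message) for w in caught if issubclass(w.category, RuntimeWarning)
]

# Option 2: escalate RuntimeWarning to an exception inside this block only.
with warnings.catch_warnings():
    warnings.simplefilter("error", RuntimeWarning)
    try:
        prepare_stub()
        escalated = False
    except RuntimeWarning:
        escalated = True
```

Both filters are scoped by catch_warnings, so the global warning configuration is restored when each block exits.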