Note

This page is reference documentation. It explains only the class signature, not how to use the class. Please refer to the What you really need to know section for the big picture.

julearn.transformers.JuColumnTransformer

class julearn.transformers.JuColumnTransformer(name, transformer, apply_to, needed_types=None, row_select_col_type=None, row_select_vals=None, **params)

Column transformer that can be used in a julearn pipeline.

This class wraps the scikit-learn column transformer so that it can be used directly in julearn pipelines.

Parameters:

name (str): Name of the transformer.
transformer (EstimatorLike): The transformer to apply to the columns.
apply_to (ColumnTypesLike): The column types to which the transformer is applied.
needed_types (ColumnTypesLike, optional): The feature types that the transformer needs in order to work.
row_select_col_type (str or list of str or set of str or ColumnTypes, optional): The column types needed to select rows (default is None).
row_select_vals (str, int, bool or list of str, int, bool, optional): The value(s) to look for in the row_select_col_type columns; rows holding these values are the ones used for training (default is None).
**params (dict): Extra keyword arguments for the transformer.
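
The interplay of row_select_col_type and row_select_vals can be sketched in plain Python. This is not julearn code: select_rows, the toy rows, and the mean-centering step are hypothetical stand-ins, a plain column name ("group") replaces the column type for brevity, and the sketch assumes the fitted transformation is afterwards applied to every row, which this reference does not state.

```python
from statistics import mean


def select_rows(rows, select_col, select_vals):
    """Keep the rows whose value in select_col is one of select_vals."""
    # Accept a single value as well as a collection of values.
    if not isinstance(select_vals, (list, tuple, set)):
        select_vals = [select_vals]
    return [row for row in rows if row[select_col] in select_vals]


rows = [
    {"age": 20.0, "group": "control"},
    {"age": 40.0, "group": "patient"},
    {"age": 30.0, "group": "control"},
]

# "Fit": estimate the centering value from the selected rows only.
train_rows = select_rows(rows, "group", "control")
center = mean(row["age"] for row in train_rows)

# "Transform": apply the fitted centering to all rows (assumed behaviour).
centered = [row["age"] - center for row in rows]
```

Here only the two "control" rows contribute to the fitted value, while all three rows are transformed with it.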

__init__(name, transformer, apply_to, needed_types=None, row_select_col_type=None, row_select_vals=None, **params)

transform(X)

Apply the transformer.

Parameters:

X (pd.DataFrame): Data to be transformed.

Returns:

pd.DataFrame: Transformed data.

get_feature_names_out(input_features=None)

Get names of features to be returned.

Parameters:

input_features (array-like of str or None, default=None): Input features to use.

If None, then feature_names_in_ is used as input feature names if it’s defined. If feature_names_in_ is undefined, then the following input feature names are generated: ["x0", "x1", ..., "x(n_features_in_ - 1)"].
If array-like, then input_features must match feature_names_in_ if it’s defined.

Returns:

list of str: Names of features to be kept in the output pd.DataFrame.
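
The rules for resolving input_features described above can be written out as a small plain-Python sketch. The helper resolve_input_features is hypothetical and only mirrors the documented rules for the input names; it does not compute the output names that get_feature_names_out returns, and it does not call julearn or scikit-learn.

```python
def resolve_input_features(n_features_in, feature_names_in=None, input_features=None):
    """Resolve input feature names following the documented rules."""
    if input_features is None:
        # Prefer the names seen during fit, if there are any.
        if feature_names_in is not None:
            return list(feature_names_in)
        # Otherwise generate "x0", "x1", ..., "x(n_features_in - 1)".
        return [f"x{i}" for i in range(n_features_in)]
    # Explicit names must agree with the names seen during fit.
    if feature_names_in is not None and list(input_features) != list(feature_names_in):
        raise ValueError("input_features must match feature_names_in_")
    return list(input_features)


generated = resolve_input_features(3)
from_fit = resolve_input_features(2, feature_names_in=["age", "iq"])
```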

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep (bool, default=True): Not used. Kept for compatibility with scikit-learn.

Returns:

dict: Parameter names mapped to their values.

set_params(**kwargs)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as sklearn.pipeline.Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**kwargs (dict): Estimator parameters.

Returns:

JuColumnTransformer: JuColumnTransformer instance with params set.
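
The <component>__<parameter> naming rule mentioned for set_params can be illustrated with a plain-Python sketch. The helper group_nested_params and the parameter names used here are hypothetical; the snippet only shows how such keys split on the first double underscore and is not the library's implementation.

```python
def group_nested_params(params):
    """Split flat keyword names into top-level and per-component parameters."""
    top_level = {}
    nested = {}
    for key, value in params.items():
        # Split on the first "__" only, so deeper nesting stays in sub_key.
        component, sep, sub_key = key.partition("__")
        if sep:
            nested.setdefault(component, {})[sub_key] = value
        else:
            top_level[key] = value
    return top_level, nested


top_level, nested = group_nested_params(
    {
        "apply_to": "continuous",
        "scaler__with_mean": False,
        "scaler__pca__n_components": 2,
    }
)
```

Each nested dictionary would then be handed to the named component, which repeats the same split for any remaining double underscores.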

filter_columns(X)

Get the apply_to columns of a pandas DataFrame.

Parameters:

X (pd.DataFrame): The DataFrame to filter.

Returns:

pd.DataFrame: The DataFrame with only the apply_to columns.
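
To give an intuition for selecting the apply_to columns, here is a plain-Python sketch that filters column names by type. It assumes each column name carries its type after a "__:type:__" separator; that separator, the helper columns_of_type, and the column names are assumptions made for illustration and should be checked against the julearn documentation.

```python
TYPE_SEP = "__:type:__"  # assumed separator between column name and column type


def columns_of_type(columns, apply_to):
    """Return the column names whose type suffix is among the apply_to types."""
    wanted = {apply_to} if isinstance(apply_to, str) else set(apply_to)
    return [
        col
        for col in columns
        if TYPE_SEP in col and col.rsplit(TYPE_SEP, 1)[1] in wanted
    ]


columns = [
    "age__:type:__continuous",
    "sex__:type:__categorical",
    "iq__:type:__continuous",
]
selected = columns_of_type(columns, "continuous")
```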

fit(X, y=None, **fit_params)

Fit the model.

This method will fit the model using only the columns selected by apply_to.

Parameters:

X (pd.DataFrame): The data to fit the model on.
y (DataLike, optional): The target data (default is None).
**fit_params (Any): Additional parameters to pass to the model’s fit method.

Returns:

JuTransformer: The fitted model.

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters:

X (array-like of shape (n_samples, n_features)): Input samples.
y (array-like of shape (n_samples,) or (n_samples, n_outputs), default=None): Target values (None for unsupervised transformations).
**fit_params (dict): Additional fit parameters.

Returns:

X_new (ndarray of shape (n_samples, n_features_new)): Transformed array.

get_apply_to()

Get the column types the estimator applies to.

Returns:

ColumnTypes: The column types the estimator applies to.

get_metadata_routing()

Get metadata routing of this object.

Please check the User Guide for how the routing mechanism works.

Returns:

routing (MetadataRequest): A MetadataRequest encapsulating routing information.

get_needed_types()

Get the column types needed by the estimator.

Returns:

ColumnTypes: The column types needed by the estimator.

set_output(*, transform=None)

Set output container.

See Introducing the set_output API for an example of how to use the API.

Parameters:

transform ({“default”, “pandas”, “polars”}, default=None): Configure output of transform and fit_transform.

“default”: Default output format of a transformer
“pandas”: DataFrame output
“polars”: Polars output
None: Transform configuration is unchanged

New in version 1.4: “polars” option was added.

Returns:

self (estimator instance): Estimator instance.