Note

This page is reference documentation. It explains only the class signature, not how to use the class. For the big picture, please refer to the "What you really need to know" section.

julearn.transformers.JuColumnTransformer

class julearn.transformers.JuColumnTransformer(name, transformer, apply_to, needed_types=None, row_select_col_type=None, row_select_vals=None, **params)

Column transformer that can be used in a julearn pipeline.

This class wraps the scikit-learn ColumnTransformer so that it can be used directly in julearn pipelines.

Parameters:
name : str

Name of the transformer.

transformer : EstimatorLike

The transformer to apply to the columns.

apply_to : ColumnTypesLike

The column types to which the transformer is applied.

needed_types : ColumnTypesLike, optional

Which feature types are needed for the transformer to work.

row_select_col_type : str or list of str or set of str or ColumnTypes, optional

The column types used to select rows (default is None).

row_select_vals : str, int, bool or list of str, int, bool, optional

The value(s) that a row must have in the row_select_col_type column(s) to be used for training (default is None).

**params : dict

Extra keyword arguments for the transformer.
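
The interplay of row_select_col_type and row_select_vals can be pictured with a minimal, pure-Python sketch. It does not use julearn or pandas: the helper fit_mean_on_selected_rows and the column names are hypothetical, and the sketch assumes that only fitting is restricted to the selected rows while the transformation is then applied to every row.

```python
# Conceptual sketch only: plain Python, no julearn or pandas involved.
# The helper and the column names below are hypothetical.

def fit_mean_on_selected_rows(rows, value_col, select_col, select_vals):
    """Estimate the mean of `value_col` from rows whose `select_col`
    entry is one of `select_vals`."""
    if not isinstance(select_vals, (list, tuple, set)):
        select_vals = [select_vals]  # accept a single value as well as a list
    selected = [r[value_col] for r in rows if r[select_col] in select_vals]
    return sum(selected) / len(selected)


rows = [
    {"age": 20.0, "group": "control"},
    {"age": 30.0, "group": "control"},
    {"age": 70.0, "group": "patient"},
]

# "Fit" using the control rows only ...
mean_age = fit_mean_on_selected_rows(rows, "age", "group", "control")

# ... then "transform" every row with the fitted value (assumed behaviour).
centered = [r["age"] - mean_age for r in rows]
print(mean_age, centered)
```

A typical motivation for this pattern is estimating a correction on a reference group and applying it to all samples.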

__init__(name, transformer, apply_to, needed_types=None, row_select_col_type=None, row_select_vals=None, **params)

transform(X)

Apply the transformer.

Parameters:
X : pd.DataFrame

Data to be transformed.

Returns:
pd.DataFrame

Transformed data.

get_feature_names_out(input_features=None)

Get names of features to be returned.

Parameters:
input_features : array-like of str or None, default=None

Input features to use.

  • If None, then feature_names_in_ is used as input feature names if it’s defined. If feature_names_in_ is undefined, then the following input feature names are generated: ["x0", "x1", ..., "x(n_features_in_ - 1)"].

  • If array-like, then input_features must match feature_names_in_ if it’s defined.

Returns:
list of str

Names of features to be kept in the output pd.DataFrame.
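
The rules for input_features described above can be sketched in plain Python. The helper resolve_input_features is hypothetical and covers only how the input names are resolved, not how julearn derives the output names from them.

```python
# Conceptual sketch only: `resolve_input_features` is a hypothetical helper,
# not part of julearn or scikit-learn.

def resolve_input_features(input_features=None, feature_names_in=None,
                           n_features_in=0):
    """Resolve input feature names following the documented rules."""
    if input_features is None:
        if feature_names_in is not None:
            # feature_names_in_ is defined: use it as is.
            return list(feature_names_in)
        # Otherwise generate "x0", "x1", ..., "x(n_features_in_ - 1)".
        return [f"x{i}" for i in range(n_features_in)]
    input_features = list(input_features)
    if feature_names_in is not None and input_features != list(feature_names_in):
        raise ValueError("input_features does not match feature_names_in_")
    return input_features


generated = resolve_input_features(n_features_in=3)
from_fit = resolve_input_features(feature_names_in=["age", "sex"])
print(generated, from_fit)
```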

get_params(deep=True)

Get parameters for this estimator.

Parameters:
deep : bool, default=True

Not used. Kept for compatibility with scikit-learn.

Returns:
dict

Parameter names mapped to their values.

set_params(**kwargs)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as sklearn.pipeline.Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**kwargs : dict

Estimator parameters.

Returns:
JuColumnTransformer

JuColumnTransformer instance with params set.
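
The <component>__<parameter> naming convention mentioned above can be illustrated with a short, pure-Python sketch. The helper split_nested_params and the example parameter names are hypothetical. The sketch only shows how such keys can be split on the first double underscore, not how JuColumnTransformer routes them internally.

```python
# Conceptual sketch only: `split_nested_params` is a hypothetical helper.

def split_nested_params(params):
    """Separate plain keys from `<component>__<parameter>` keys."""
    own, nested = {}, {}
    for key, value in params.items():
        # Split on the first "__" only, so deeper nesting stays in sub_key.
        component, delim, sub_key = key.partition("__")
        if delim:
            nested.setdefault(component, {})[sub_key] = value
        else:
            own[key] = value
    return own, nested


own, nested = split_nested_params(
    {
        "apply_to": "continuous",
        "scaler__with_mean": False,
        "scaler__with_std": True,
    }
)
print(own)
print(nested)
```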

filter_columns(X)

Get the apply_to columns of a pandas DataFrame.

Parameters:
X : pd.DataFrame

The DataFrame to filter.

Returns:
pd.DataFrame

The DataFrame with only the apply_to columns.
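
As a rough picture of what selecting columns by type means, the sketch below filters a list of column names in plain Python. It rests on an assumption not stated on this page: that a column's type is encoded in its name after a "__:type:__" separator. The helper columns_of_types and the column names are hypothetical, and no julearn or pandas objects are involved.

```python
# Conceptual sketch only. TYPE_SEP is an ASSUMED naming convention that is
# not documented on this page; `columns_of_types` is a hypothetical helper.
TYPE_SEP = "__:type:__"


def columns_of_types(columns, apply_to):
    """Return the column names whose encoded type is in `apply_to`."""
    wanted = {apply_to} if isinstance(apply_to, str) else set(apply_to)
    kept = []
    for col in columns:
        _, sep, col_type = col.rpartition(TYPE_SEP)
        # Columns without the separator carry no type and are skipped.
        if sep and col_type in wanted:
            kept.append(col)
    return kept


columns = [
    "age__:type:__continuous",
    "sex__:type:__categorical",
    "iq__:type:__continuous",
]
print(columns_of_types(columns, "continuous"))
```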

fit(X, y=None, **fit_params)

Fit the model.

This method will fit the model using only the columns selected by apply_to.

Parameters:
X : pd.DataFrame

The data to fit the model on.

y : DataLike, optional

The target data (default is None).

**fit_params : Any

Additional parameters to pass to the model’s fit method.

Returns:
JuTransformer

The fitted model.

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters:
X : array-like of shape (n_samples, n_features)

Input samples.

y : array-like of shape (n_samples,) or (n_samples, n_outputs), default=None

Target values (None for unsupervised transformations).

**fit_params : dict

Additional fit parameters.

Returns:
X_new : ndarray of shape (n_samples, n_features_new)

Transformed array.

get_apply_to()

Get the column types the estimator applies to.

Returns:
ColumnTypes

The column types the estimator applies to.

get_metadata_routing()

Get metadata routing of this object.

Please check the User Guide on how the routing mechanism works.

Returns:
routing : MetadataRequest

A MetadataRequest encapsulating routing information.

get_needed_types()

Get the column types needed by the estimator.

Returns:
ColumnTypes

The column types needed by the estimator.

set_output(*, transform=None)

Set output container.

See Introducing the set_output API for an example on how to use the API.

Parameters:
transform : {“default”, “pandas”, “polars”}, default=None

Configure output of transform and fit_transform.

  • “default”: Default output format of a transformer

  • “pandas”: DataFrame output

  • “polars”: Polars output

  • None: Transform configuration is unchanged

New in version 1.4: “polars” option was added.

Returns:
self : estimator instance

Estimator instance.