Note
This page is reference documentation. It explains only the class signature, not how to use it. Please refer to the What you really need to know section for the big picture.
julearn.transformers.JuColumnTransformer
- class julearn.transformers.JuColumnTransformer(name, transformer, apply_to, needed_types=None, row_select_col_type=None, row_select_vals=None, **params)
Column transformer that can be used in a julearn pipeline.
This column transformer is a wrapper around the sklearn column transformer, so it can be used directly with julearn pipelines.
- Parameters:
- name : str
Name of the transformer.
- transformer : EstimatorLike
The transformer to apply to the columns.
- apply_to : ColumnTypesLike
The column types to which the transformer is applied.
- needed_types : ColumnTypesLike, optional
The feature types the transformer needs in order to work.
- row_select_col_type : str or list of str or set of str or ColumnTypes
The column types needed to select rows (default is None).
- row_select_vals : str, int, bool or list of str, int, bool
The value(s) to select in row_select_col_type to pick the rows used for training (default is None).
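The parameters above can be illustrated with plain scikit-learn and pandas. The following is a minimal sketch of the underlying idea only, not julearn's implementation: it applies a transformer to a chosen set of columns and fits it on a subset of rows. The column names, the variable names row_select_col and row_select_vals, and the reading of row selection as "fit on the selected rows, transform all rows" are assumptions made for illustration.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler

# Hypothetical data; the column names are illustrative only.
X = pd.DataFrame(
    {
        "age": [20.0, 30.0, 40.0, 50.0],
        "height": [1.6, 1.7, 1.8, 1.9],
        "group": ["control", "control", "patient", "patient"],
    }
)

apply_to = ["age", "height"]   # columns the transformer is applied to
row_select_col = "group"       # column used to pick the training rows
row_select_vals = ["control"]  # values that mark a training row

# Plain scikit-learn column transformer (the object julearn wraps).
ct = ColumnTransformer(
    transformers=[("scaler", StandardScaler(), apply_to)],
    remainder="drop",
)

# Assumed semantics: fit on the selected rows only, then transform every row.
train_rows = X[row_select_col].isin(row_select_vals)
ct.fit(X[train_rows])
out = ct.transform(X)
```

In this sketch the scaler's mean and standard deviation come from the "control" rows alone, so the "patient" rows are expressed relative to the control distribution.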
- __init__(name, transformer, apply_to, needed_types=None, row_select_col_type=None, row_select_vals=None, **params)
- transform(X)
Apply the transformer.
- Parameters:
- X : pd.DataFrame
Data to be transformed.
- Returns:
- out : pd.DataFrame
Transformed data.
- get_feature_names_out(input_features=None)
Get names of features to be returned.
- Parameters:
- input_features : array-like of str or None, default=None
If input_features is None, then feature_names_in_ is used as feature names in. If feature_names_in_ is not defined, then the following input feature names are generated: [“x0”, “x1”, …, “x(n_features_in_ - 1)”].
If input_features is an array-like, then input_features must match feature_names_in_ if feature_names_in_ is defined.
- Returns:
- list
Names of features to be kept in the output pd.DataFrame.
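The rules for input_features described above can be sketched in plain Python. This is an illustration of the documented behaviour only, not the library's code; the function names resolve_input_features and default_feature_names are hypothetical.

```python
def default_feature_names(n_features_in):
    """Generate placeholder names x0, x1, ..., x(n_features_in - 1)."""
    return [f"x{i}" for i in range(n_features_in)]


def resolve_input_features(
    input_features=None, feature_names_in=None, n_features_in=None
):
    """Pick the input feature names following the documented rules."""
    if input_features is None:
        # Prefer the names seen during fit, if any were recorded.
        if feature_names_in is not None:
            return list(feature_names_in)
        # Otherwise fall back to generated placeholder names.
        return default_feature_names(n_features_in)
    input_features = list(input_features)
    # Explicit names must agree with the names seen during fit.
    if feature_names_in is not None and input_features != list(feature_names_in):
        raise ValueError("input_features must match feature_names_in_")
    return input_features


generated = resolve_input_features(n_features_in=3)
from_fit = resolve_input_features(feature_names_in=["age", "height"])
```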
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
- deep : bool, default=True
Not used. Kept for compatibility with scikit-learn.
- Returns:
- params : dict
Parameter names mapped to their values.
- set_params(**kwargs)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as sklearn.pipeline.Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
- **params : dict
Estimator parameters.
- Returns:
- self : estimator instance
Estimator instance.
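The <component>__<parameter> naming convention can be sketched in plain Python. This is a simplified illustration of how such keys split into a component name and a parameter name, not the actual set_params implementation; the function split_nested_params and the example keys are hypothetical.

```python
def split_nested_params(params):
    """Separate top-level parameters from <component>__<parameter> ones."""
    own, nested = {}, {}
    for key, value in params.items():
        # Split on the first double underscore only, so deeper levels
        # (e.g. "a__b__c") stay attached to the inner parameter name.
        component, sep, sub_key = key.partition("__")
        if sep:
            nested.setdefault(component, {})[sub_key] = value
        else:
            own[key] = value
    return own, nested


own, nested = split_nested_params(
    {"verbose": True, "scaler__with_mean": False, "svm__kernel__degree": 3}
)
```

Here own holds the parameters of the outer object, while each entry of nested would be forwarded to the named component, which can in turn split any remaining double underscores.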
- filter_columns(X)
Get the apply_to columns of a pandas DataFrame.
- Parameters:
- X : pd.DataFrame
The DataFrame to filter.
- Returns:
- pd.DataFrame
The DataFrame with only the apply_to columns.
- fit(X, y=None, **fit_params)
- fit_transform(X, y=None, **fit_params)
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
- Parameters:
- X : array-like of shape (n_samples, n_features)
Input samples.
- y : array-like of shape (n_samples,) or (n_samples, n_outputs), default=None
Target values (None for unsupervised transformations).
- **fit_params : dict
Additional fit parameters.
- Returns:
- X_new : ndarray of shape (n_samples, n_features_new)
Transformed array.
- get_apply_to()
Get the column types the estimator applies to.
- Returns:
- ColumnTypes
The column types the estimator applies to.
- get_metadata_routing()
Get metadata routing of this object.
Please check the User Guide on how the routing mechanism works.
- Returns:
- routing : MetadataRequest
A MetadataRequest encapsulating routing information.
- get_needed_types()
Get the column types needed by the estimator.
- Returns:
- ColumnTypes
The column types needed by the estimator.
- set_output(*, transform=None)
Set output container.
See Introducing the set_output API for an example on how to use the API.
- Parameters:
- transform : {“default”, “pandas”}, default=None
Configure output of transform and fit_transform.
“default”: Default output format of a transformer
“pandas”: DataFrame output
None: Transform configuration is unchanged
- Returns:
- self : estimator instance
Estimator instance.