4. Overview of available Pipeline Steps

The following lists all available steps that can be used by name to create a pipeline. The overview is grouped by step type: Transformers or Models (Estimators).

The ‘Name (str)’ column gives the string name of the step, i.e. how to specify it when passing it to, e.g., the PipelineCreator. The ‘Description’ column briefly describes what the step does. The ‘Class’ column names either the underlying scikit-learn class, linked to its entry in the scikit-learn documentation (follow the link to see the valid parameters), or the class in the Julearn code, which is documented in Julearn’s Reference. For Models, the ‘Binary’, ‘Multiclass’ and ‘Regression’ columns mark with Y or N whether the step can be used for that problem type.

For feature transformations, use the Transformers with the PipelineCreator; for target transformations, use them with the TargetPipelineCreator.

4.1. Transformers

Scalers

| Name (str) | Description | Class |
|---|---|---|
| zscore | Remove the mean and scale to unit variance | StandardScaler |
| scaler_robust | Remove the median and scale by the IQR | RobustScaler |
| scaler_minmax | Scale to a given range | MinMaxScaler |
| scaler_maxabs | Scale by the maximum absolute value | MaxAbsScaler |
| scaler_normalizer | Normalize to unit norm | Normalizer |
| scaler_quantile | Transform to a uniform or normal distribution (robust) | QuantileTransformer |
| scaler_power | Gaussianise data | PowerTransformer |
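To make the first three descriptions concrete, the following is a minimal standard-library sketch of the arithmetic behind zscore, scaler_robust and scaler_minmax on a single feature. It is an illustration only, not the scikit-learn or Julearn implementation; the function names and the sample data are hypothetical, and the sketch assumes the feature is not constant (otherwise the divisions fail).

```python
from statistics import fmean, median, pstdev, quantiles


def zscore(values):
    """Remove the mean and scale to unit (population) variance."""
    mu = fmean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


def robust_scale(values):
    """Remove the median and scale by the interquartile range (IQR)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    med = median(values)
    return [(v - med) / (q3 - q1) for v in values]


def minmax_scale(values, low=0.0, high=1.0):
    """Map the observed minimum to `low` and the observed maximum to `high`."""
    vmin, vmax = min(values), max(values)
    return [low + (v - vmin) * (high - low) / (vmax - vmin) for v in values]


# Hypothetical feature with one extreme value.
data = [1.0, 2.0, 3.0, 4.0, 100.0]

z = zscore(data)
r = robust_scale(data)
m = minmax_scale(data)
```

With an outlier such as the last value, the median and IQR barely move, so the robust variant keeps the bulk of the data on a comparable scale, while the mean, standard deviation and maximum are all pulled towards the extreme value.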

Feature Selection

| Name (str) | Description | Class |
|---|---|---|
| select_univariate | Univariate feature selection with a configurable strategy | GenericUnivariateSelect |
| select_percentile | Rank and select percentile | SelectPercentile |
| select_k | Rank and select K | SelectKBest |
| select_fdr | Select based on estimated FDR | SelectFdr |
| select_fpr | Select based on FPR threshold | SelectFpr |
| select_fwe | Select based on FWE threshold | SelectFwe |
| select_variance | Remove low variance features | VarianceThreshold |
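As an illustration of the simplest of these selectors, the sketch below shows the idea behind select_variance: features whose variance does not exceed a threshold are dropped. It uses only the standard library and is not the scikit-learn implementation; the helper name, the feature names and the values are hypothetical.

```python
from statistics import pvariance


def select_by_variance(columns, threshold=0.0):
    """Return the names of features whose population variance exceeds `threshold`."""
    return [
        name
        for name, values in columns.items()
        if pvariance(values) > threshold
    ]


# Hypothetical features; "site" is constant and carries no information.
features = {
    "age": [23.0, 35.0, 41.0, 58.0],
    "site": [1.0, 1.0, 1.0, 1.0],
    "score": [0.2, 0.9, 0.4, 0.7],
}

kept = select_by_variance(features)
```

With the default threshold of zero, only constant features are removed; a larger threshold also drops features that vary very little.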

DataFrame operations

| Name (str) | Description | Class |
|---|---|---|
| confound_removal | Remove confounds from features by subtracting the prediction of each feature given all confounds. By default, this is equivalent to independently regressing the confounds out of each feature | ConfoundRemover |
| drop_columns | Drop columns from the dataframe | DropColumns |
| change_column_types | Change the type of a column in a dataframe | ChangeColumnTypes |
| filter_columns | Filter columns in a dataframe | FilterColumns |
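The confound_removal description can be sketched for the simplest case of one feature and one confound: fit a linear model that predicts the feature from the confound, then subtract that prediction. This is a standard-library illustration under those assumptions, not Julearn's ConfoundRemover; the helper names and the data are hypothetical, and for brevity the model is fitted and applied on the same samples.

```python
from statistics import fmean


def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def remove_confound(feature, confound):
    """Subtract the prediction of the feature given the confound."""
    slope, intercept = fit_line(confound, feature)
    return [f - (slope * c + intercept) for f, c in zip(feature, confound)]


# Hypothetical data: a feature that declines with age.
age = [20.0, 30.0, 40.0, 50.0, 60.0]
volume = [9.1, 8.7, 8.2, 7.4, 7.1]

residuals = remove_confound(volume, age)
```

The residuals have zero mean and are uncorrelated with the confound, which is what "regressing out" means here. With several features, the same procedure is applied to each feature independently.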

Decomposition

| Name (str) | Description | Class |
|---|---|---|
| pca | Principal Component Analysis | PCA |

Custom

| Name (str) | Description | Class |
|---|---|---|
| cbpm | Connectome-based Predictive Modeling (CBPM) | CBPM |

4.2. Models (Estimators)

Support Vector Machines

| Name (str) | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| svm | Support Vector Machine | SVC and SVR | Y | Y | Y |

Ensemble

| Name (str) | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| rf | Random Forest | RandomForestClassifier and RandomForestRegressor | Y | Y | Y |
| et | Extra-Trees | ExtraTreesClassifier and ExtraTreesRegressor | Y | Y | Y |
| adaboost | AdaBoost | AdaBoostClassifier and AdaBoostRegressor | Y | Y | Y |
| bagging | Bagging | BaggingClassifier and BaggingRegressor | Y | Y | Y |
| gradientboost | Gradient Boosting | GradientBoostingClassifier and GradientBoostingRegressor | Y | Y | Y |
| stacking | Stacking | StackingClassifier and StackingRegressor | Y | Y | Y |

Gaussian Processes

| Name (str) | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| gauss | Gaussian Process | GaussianProcessClassifier and GaussianProcessRegressor | Y | Y | Y |

Linear Models

| Name (str) | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| logit | Logistic Regression (aka logit, MaxEnt). | LogisticRegression | Y | Y | N |
| logitcv | Logistic Regression CV (aka logit, MaxEnt). | LogisticRegressionCV | Y | Y | N |
| linreg | Least Squares regression. | LinearRegression | N | N | Y |
| ridge | Linear least squares with l2 regularization. | RidgeClassifier and Ridge | Y | Y | Y |
| ridgecv | Ridge regression with built-in cross-validation. | RidgeClassifierCV and RidgeCV | Y | Y | Y |
| sgd | Linear model fitted by minimizing a regularized empirical loss with SGD | SGDClassifier and SGDRegressor | Y | Y | Y |

Naive Bayes

| Name (str) | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| nb_bernoulli | Multivariate Bernoulli models. | BernoulliNB | Y | Y | N |
| nb_categorical | Categorical features. | CategoricalNB | Y | Y | N |
| nb_complement | Complement Naive Bayes | ComplementNB | Y | Y | N |
| nb_gaussian | Gaussian Naive Bayes | GaussianNB | Y | Y | N |
| nb_multinomial | Multinomial models | MultinomialNB | Y | Y | N |

Dynamic Selection

| Name (str) | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| ds | Support for DESlib models | DynamicSelection | Y | Y | Y |

Dummy

| Name (str) | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| dummy | Use simple rules (without features). | DummyClassifier and DummyRegressor | Y | Y | Y |
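To illustrate what "simple rules (without features)" can mean, here is a minimal standard-library sketch of two such baselines: always predicting the most frequent training class, and always predicting the training mean. It is not the scikit-learn code, and the rules actually applied depend on the configured strategy; the function names and data below are hypothetical.

```python
from collections import Counter
from statistics import fmean


def fit_dummy_classifier(y_train):
    """Return a predictor that always outputs the most frequent training label."""
    most_frequent = Counter(y_train).most_common(1)[0][0]
    return lambda X: [most_frequent for _ in X]


def fit_dummy_regressor(y_train):
    """Return a predictor that always outputs the mean of the training targets."""
    mean = fmean(y_train)
    return lambda X: [mean for _ in X]


# The features are ignored entirely; only the number of rows matters.
X_test = [[0.3, 1.2], [5.0, -2.0], [1.1, 0.0]]

predict_class = fit_dummy_classifier(["a", "b", "a", "a", "b"])
predict_value = fit_dummy_regressor([2.0, 4.0, 9.0])

class_predictions = predict_class(X_test)
value_predictions = predict_value(X_test)
```

Baselines of this kind are useful as a reference point: a real model that does not beat them has learned nothing from the features.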