7. Overview of available Pipeline Steps

This section lists all steps that can be added to a pipeline by name. The steps are grouped by type: Transformers and Models (Estimators).

  • The Name column gives the string name of the step, i.e., how to specify it when passing it to, e.g., the PipelineCreator.

  • The Description column briefly describes what the step does.

  • The Class column names either the underlying scikit-learn class, linked to its entry in the scikit-learn documentation (follow the link to see the valid parameters), or the julearn class, which is documented in julearn’s API Reference.

Use the Transformers with the PipelineCreator to transform features, and with the TargetPipelineCreator to transform the target.
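
Because steps are referred to by name, a pipeline can be thought of as an ordered list of strings. The following is a minimal sketch of that idea in plain Python and is not julearn's API: `KNOWN_STEPS` is a hypothetical lookup holding a handful of names from the tables in this section, and `check_steps` is an illustrative helper. The rule that a single model comes last is an assumption of the sketch.

```python
# Hypothetical lookup with a few step names taken from the tables in this
# section; this is an illustration, not how julearn stores its steps.
KNOWN_STEPS = {
    "zscore": "transformer",
    "scaler_robust": "transformer",
    "select_k": "transformer",
    "pca": "transformer",
    "svm": "model",
    "rf": "model",
    "linreg": "model",
}


def check_steps(names):
    """Validate an ordered list of step names and return their kinds.

    Raises ValueError if the list is empty, if a name is unknown, if the
    last step is not a model, or if a model appears before the last step.
    """
    if not names:
        raise ValueError("A pipeline needs at least one step")
    kinds = []
    for name in names:
        if name not in KNOWN_STEPS:
            raise ValueError(f"Unknown step name: {name!r}")
        kinds.append(KNOWN_STEPS[name])
    if kinds[-1] != "model":
        raise ValueError("The last step must be a model")
    if "model" in kinds[:-1]:
        raise ValueError("Only the last step may be a model")
    return kinds


# Transformers first, then exactly one model (assumed convention).
print(check_steps(["zscore", "pca", "svm"]))
```

A misspelled name such as `"z_score"` would raise a `ValueError` in this sketch, which is the kind of early check that string-based step names make possible.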

7.1. Transformers

Scalers

| Name | Description | Class |
|---|---|---|
| zscore | Remove mean and scale to unit variance | StandardScaler |
| scaler_robust | Remove median and scale to IQR | RobustScaler |
| scaler_minmax | Scale to a given range | MinMaxScaler |
| scaler_maxabs | Scale by max absolute value | MaxAbsScaler |
| scaler_normalizer | Normalize to unit norm | Normalizer |
| scaler_quantile | Transform to uniform or normal distribution (robust) | QuantileTransformer |
| scaler_power | Gaussianise data | PowerTransformer |
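
To make the first two descriptions concrete, here is a minimal sketch in plain Python of what "remove mean and scale to unit variance" and "remove median and scale to IQR" mean for a single feature. The functions `zscore` and `robust_scale` are illustrative helpers, not the scikit-learn implementations; in particular, they do not handle constant features.

```python
import statistics


def zscore(values):
    """Remove the mean and scale to unit variance (population variance)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def robust_scale(values):
    """Remove the median and scale to the interquartile range (IQR)."""
    median = statistics.median(values)
    # "inclusive" interpolates linearly between the observed data points.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return [(v - median) / (q3 - q1) for v in values]


data = [1.0, 2.0, 3.0, 4.0, 100.0]  # one feature with an outlier
print(zscore(data))
print(robust_scale(data))
```

With the outlier in `data`, the median and IQR are unaffected by the extreme value, whereas the mean and standard deviation are pulled towards it. This is the usual motivation for choosing a robust scaler.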

Feature Selection

| Name | Description | Class |
|---|---|---|
| select_univariate | Univariate feature selection with a configurable strategy | GenericUnivariateSelect |
| select_percentile | Rank and select percentile | SelectPercentile |
| select_k | Rank and select K | SelectKBest |
| select_fdr | Select based on estimated FDR | SelectFdr |
| select_fpr | Select based on FPR threshold | SelectFpr |
| select_fwe | Select based on FWE threshold | SelectFwe |
| select_variance | Remove low variance features | VarianceThreshold |
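
As an illustration of the simplest selector above, the following plain-Python sketch removes low-variance features. `select_by_variance` is a hypothetical helper written for this example, and the example data are made up; it mirrors the idea behind `select_variance` rather than reproducing the scikit-learn class.

```python
import statistics


def select_by_variance(columns, threshold=0.0):
    """Keep the features whose population variance exceeds ``threshold``.

    ``columns`` maps a feature name to its list of values.
    """
    return {
        name: values
        for name, values in columns.items()
        if statistics.pvariance(values) > threshold
    }


features = {
    "age": [23, 35, 47, 51],
    "site": [1, 1, 1, 1],  # constant, carries no information
    "score": [0.2, 0.9, 0.4, 0.7],
}
print(list(select_by_variance(features)))
```

The constant `site` column is dropped because its variance is zero, while the other two features are kept.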

DataFrame operations

| Name | Description | Class |
|---|---|---|
| confound_removal | Remove confounds from features by subtracting the prediction of each feature given all confounds. By default, this is equivalent to “independently regressing out the confounds from the features” | ConfoundRemover |
| drop_columns | Drop columns from the DataFrame | DropColumns |
| change_column_types | Change the type of a column in a DataFrame | ChangeColumnTypes |
| filter_columns | Filter columns in a DataFrame | FilterColumns |
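
The idea behind `confound_removal` can be sketched in a few lines of plain Python. This is a simplified illustration, not julearn's `ConfoundRemover`: it assumes a single confound and an ordinary least-squares fit, and the helper `remove_confound` and the example data are invented for this sketch.

```python
import statistics


def remove_confound(feature, confound):
    """Return the residuals of ``feature`` after regressing it on ``confound``.

    Fits feature ~ intercept + slope * confound by ordinary least squares
    and subtracts the fitted prediction from the feature.
    """
    f_mean = statistics.fmean(feature)
    c_mean = statistics.fmean(confound)
    cov = sum((f - f_mean) * (c - c_mean) for f, c in zip(feature, confound))
    var = sum((c - c_mean) ** 2 for c in confound)
    slope = cov / var
    intercept = f_mean - slope * c_mean
    return [f - (intercept + slope * c) for f, c in zip(feature, confound)]


age = [20.0, 30.0, 40.0, 50.0]  # the confound
features = {
    "volume": [9.1, 8.0, 7.2, 5.9],
    "thickness": [2.9, 2.7, 2.6, 2.2],
}
# Each feature is cleaned independently of the others.
cleaned = {name: remove_confound(vals, age) for name, vals in features.items()}
print(cleaned)
```

After this step, each cleaned feature has zero mean and is linearly uncorrelated with the confound, which is what "regressing out" refers to.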

Decomposition

| Name | Description | Class |
|---|---|---|
| pca | Principal Component Analysis | PCA |

Custom

| Name | Description | Class |
|---|---|---|
| cbpm | Connectome-based Predictive Modeling (CBPM) | CBPM |

7.2. Models (Estimators)

Support Vector Machines

| Name | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| svm | Support Vector Machine | SVC and | Y | Y | Y |

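Ensemble

| Name | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| rf | Random Forest | | Y | Y | Y |
| et | Extra-Trees | | Y | Y | Y |
| adaboost | AdaBoost | | Y | Y | Y |
| bagging | Bagging | | Y | Y | Y |
| gradientboost | Gradient Boosting | | Y | Y | Y |
| stacking | Stacking | | Y | Y | Y |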

Gaussian Processes

| Name | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| gauss | Gaussian Process | | Y | Y | Y |

Linear Models

| Name | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| logit | Logistic Regression (aka logit, MaxEnt). | LogisticRegression | Y | Y | N |
| logitcv | Logistic Regression CV (aka logit, MaxEnt). | LogisticRegressionCV | Y | Y | N |
| linreg | Least Squares regression. | LinearRegression | N | N | Y |
| ridge | Linear least squares with l2 regularization. | | Y | Y | Y |
| ridgecv | Ridge regression with built-in cross-validation. | | Y | Y | Y |
| sgd | Linear model fitted by minimizing a regularized empirical loss with SGD | | Y | Y | Y |

Naive Bayes

| Name | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| nb_bernoulli | Multivariate Bernoulli models. | BernoulliNB | Y | Y | N |
| nb_categorical | Categorical features. | CategoricalNB | Y | Y | N |
| nb_complement | Complement Naive Bayes | ComplementNB | Y | Y | N |
| nb_gaussian | Gaussian Naive Bayes | GaussianNB | Y | Y | N |
| nb_multinomial | Multinomial models | MultinomialNB | Y | Y | N |

Dynamic Selection

| Name | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| ds | Support for DESlib models | DynamicSelection | Y | Y | Y |

Dummy

| Name | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| dummy | Use simple rules (without features). | | Y | Y | Y |
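
To show what "simple rules (without features)" can look like, here is a plain-Python sketch of two such baselines. Which rule the `dummy` step applies is not stated in this overview; predicting the most frequent label for classification and the mean for regression are assumptions chosen for illustration, and both helper functions are hypothetical.

```python
import statistics
from collections import Counter


def dummy_classify(y_train, n_predictions):
    """Predict the most frequent training label for every sample."""
    most_common_label = Counter(y_train).most_common(1)[0][0]
    return [most_common_label] * n_predictions


def dummy_regress(y_train, n_predictions):
    """Predict the mean of the training targets for every sample."""
    return [statistics.fmean(y_train)] * n_predictions


# Neither function looks at any features, only at the training targets.
print(dummy_classify(["a", "b", "a", "a"], 3))
print(dummy_regress([1.0, 2.0, 6.0], 2))
```

Baselines like these are useful as a reference point: a model that does not beat them has not learned anything from the features.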